ABSTRACT
Routers can export statistics about the packets and flows that traverse them. Because generating detailed traffic statistics does not scale well with link speed, routers and measurement boxes increasingly implement sampling strategies at the packet level. In this paper, we study, both theoretically and practically, what information about the original traffic can be inferred when sampling, or "thinning", is performed at the packet level. While basic packet-level characteristics such as first-order statistics can be recovered fairly directly, other aspects require more attention. We focus mainly on the spectral density, a second-order statistic, and the distribution of the number of packets per flow, showing how both can, in theory, be recovered exactly. We then show in detail why in practice this cannot be done using traditional packet-based sampling, even at high sampling rates. We introduce an alternative, flow-based thinning, for which practical inversion is possible even at arbitrarily low sampling rates. We also investigate the theory and practice of fitting the parameters of a Poisson cluster process, which models the full packet traffic, from sampled data.
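The contrast between the two thinning schemes can be illustrated with a small simulation. The sketch below is illustrative only and is not the paper's estimator: the function names (`packet_thinning`, `flow_thinning`), the toy flow sizes, and the sampling probability `p` are all assumptions made for this example. It assumes independent sampling with probability `p`, applied either to each packet or to each whole flow.

```python
import random


def packet_thinning(flow_sizes, p, rng):
    """Keep each packet independently with probability p.

    Returns the observed packet count of every flow that still has at
    least one sampled packet; flows with no sampled packet disappear.
    """
    sampled = []
    for n in flow_sizes:
        k = sum(1 for _ in range(n) if rng.random() < p)
        if k > 0:
            sampled.append(k)
    return sampled


def flow_thinning(flow_sizes, p, rng):
    """Keep each whole flow with probability p, with all of its packets."""
    return [n for n in flow_sizes if rng.random() < p]


def mean(xs):
    return sum(xs) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical flow sizes (packets per flow), chosen only for illustration.
flows = [rng.choice([1, 2, 5, 50]) for _ in range(20000)]
p = 0.1

pkt = packet_thinning(flows, p, rng)
flw = flow_thinning(flows, p, rng)

print("original packets:", sum(flows))
print("packet-thinned packets / p:", sum(pkt) / p)
print("mean flow size, original:", mean(flows))
print("mean flow size, packet thinning:", mean(pkt))
print("mean flow size, flow thinning:", mean(flw))
```

In this toy setting, dividing the sampled packet count by `p` gives a direct estimate of the total packet count, which matches the remark that first-order statistics are recovered fairly directly. The per-flow counts seen under packet thinning are distorted (small flows mostly vanish and large flows shrink), whereas the flows kept by flow thinning retain their original sizes.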
INDEX TERMS
Primary Classification:
C. Computer Systems Organization
C.2 COMPUTER-COMMUNICATION NETWORKS
C.2.6 Internetworking
Additional Classification:
C. Computer Systems Organization
C.2 COMPUTER-COMMUNICATION NETWORKS
C.2.3 Network Operations
C.2.5 Local and Wide-Area Networks
C.4 PERFORMANCE OF SYSTEMS
General Terms: Design, Management, Performance, Theory
Keywords: Poisson cluster process, flows, internet data, long-range dependence, sampling, thinning, traffic modeling, transform inversion