ABSTRACT
The increasing practicality of large-scale flow capture makes it possible to conceive of traffic analysis methods that detect and identify a large and diverse set of anomalies. However, the challenge of effectively analyzing this massive data source for anomaly diagnosis is as yet unmet. We argue that the distributions of packet features (IP addresses and ports) observed in flow traces reveal both the presence and the structure of a wide range of anomalies. Using entropy as a summarization tool, we show that the analysis of feature distributions leads to significant advances on two fronts: (1) it enables highly sensitive detection of a wide range of anomalies, augmenting detections by volume-based methods, and (2) it enables automatic classification of anomalies via unsupervised learning. We show that, when described by their feature distributions, anomalies naturally fall into distinct and meaningful clusters. These clusters can be used to automatically classify anomalies and to uncover new anomaly types. We validate our claims on data from two backbone networks (Abilene and Geant) and conclude that feature distributions show promise as a key element of a fairly general network anomaly diagnosis framework.
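As a rough illustration of what "entropy as a summarization tool" can mean, the sketch below computes the empirical entropy of a single packet feature (here, destination port) over a batch of flow records. The function name `sample_entropy` and the toy port lists are assumptions made for this example and are not taken from the paper; the point is only that a concentrated distribution yields low entropy and a dispersed one yields high entropy.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Empirical entropy (in bits) of the observed feature values.

    Hypothetical helper for illustration: `values` is any iterable of
    hashable feature observations, e.g. destination ports or IP addresses.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum_i p_i * log2(1 / p_i), with p_i = n_i / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Toy data (assumed, not measured): destination ports seen in one time bin.
typical_ports = [80, 443, 53, 25] * 25   # traffic spread evenly over 4 ports
single_port = [80] * 100                 # all traffic aimed at one port
scanned_ports = list(range(1, 101))      # 100 distinct ports probed once each

print(sample_entropy(typical_ports))
print(sample_entropy(single_port))
print(sample_entropy(scanned_ports))
```

In this toy setting, traffic concentrated on one port drives the entropy to zero, while a sweep across many ports pushes it toward the maximum of log2(N) for N distinct values. This is the kind of shift a distribution-based detector could watch for even when total traffic volume barely changes.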
Mining anomalies using traffic feature distributions
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications