ABSTRACT
Supervised statistical approaches for the classification of network traffic are quickly moving from research laboratories to advanced prototypes, which in turn will become actual products in the next few years. While research on the classification algorithms themselves has made significant progress in the recent past, few papers have examined the problem of determining the optimum working parameters for statistical classifiers in a straightforward and foolproof way. Without such optimization, it becomes very difficult to put any classification algorithm for network traffic into practice, no matter how advanced it may be. In this paper we present a simple but effective procedure for the optimization of the working parameters of a statistical network traffic classifier. We put the optimization procedure into practice and examine its effects when the classifier is run in very different scenarios, ranging from medium and large local area networks to Internet backbone links. Experimental results show that an automatic optimization procedure like the one presented in this paper is necessary for the classifier to work at its best, and they also shed light on some properties of the classification algorithm that deserve further study.