DOI: 10.1145/3167132.3167306
research-article

Applying deep learning on packet flows for botnet detection

Published: 09 April 2018

ABSTRACT

Botnets constitute a primary threat to Internet security. The ability to accurately distinguish botnet traffic from non-botnet traffic can help significantly in mitigating malicious botnets. We present a novel approach to botnet detection that applies deep learning to flows of TCP/UDP/IP packets. In our experiments with a large dataset, we obtained 99.7% accuracy in classifying P2P-botnet traffic. This is comparable to or better than conventional botnet detection approaches, while reducing the effort required for feature engineering and feature selection to a minimum.

Published in

SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing
April 2018
2327 pages
ISBN: 9781450351911
DOI: 10.1145/3167132

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
