DOI: 10.1145/3178876.3185996
research-article

Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications

Published: 23 April 2018

ABSTRACT

To ensure undisrupted business, large Internet companies need to closely monitor various KPIs (e.g., Page Views, number of online users, and number of orders) of their Web applications, to accurately detect anomalies and trigger timely troubleshooting and mitigation. However, anomaly detection for these seasonal KPIs, which vary in pattern and data quality, has been a great challenge, especially without labels. In this paper, we propose Donut, an unsupervised anomaly detection algorithm based on the variational auto-encoder (VAE). Thanks to a few key techniques, Donut greatly outperforms a state-of-the-art supervised ensemble approach and a baseline VAE approach, and its best F-scores range from 0.75 to 0.9 for the studied KPIs from a top global Internet company. We also present a novel kernel density estimation (KDE) interpretation of reconstruction for Donut, making it the first VAE-based anomaly detection algorithm with a solid theoretical explanation.
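As a reading aid for the "best F-score" figures quoted above, the following is a minimal sketch of how a best F-score can be computed in general: sweep every candidate threshold over the anomaly scores and keep the highest F-score found. It is an illustration under stated assumptions, not the paper's evaluation protocol. The function names `f_score` and `best_f_score`, the point-wise matching of predictions to labels, and the toy data are all hypothetical.

```python
from typing import Sequence, Tuple


def f_score(labels: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Point-wise F-score when every score >= threshold is flagged as an anomaly."""
    tp = fp = fn = 0
    for label, score in zip(labels, scores):
        flagged = score >= threshold
        if flagged and label:
            tp += 1
        elif flagged and not label:
            fp += 1
        elif not flagged and label:
            fn += 1
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f_score(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Return (best F-score, threshold) over all thresholds taken from the scores."""
    best = (0.0, float("inf"))
    for threshold in sorted(set(scores)):
        current = f_score(labels, scores, threshold)
        if current > best[0]:
            best = (current, threshold)
    return best


# Hypothetical toy data: 1 marks an anomalous point, higher score = more anomalous.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.8, 0.9, 0.5, 0.3]
best_value, best_threshold = best_f_score(toy_labels, toy_scores)
```

Because the threshold is chosen with knowledge of the labels, a best F-score is an upper bound on what a fixed threshold achieves in deployment.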

        Published in

        WWW '18: Proceedings of the 2018 World Wide Web Conference
        April 2018
        2000 pages
        ISBN: 9781450356398

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        International World Wide Web Conferences Steering Committee

        Republic and Canton of Geneva, Switzerland

        Acceptance Rates

        WWW '18 Paper Acceptance Rate: 170 of 1,155 submissions, 15%
        Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
