ABSTRACT
To ensure undisrupted business, large Internet companies need to closely monitor various KPIs (e.g., page views, number of online users, and number of orders) of their Web applications, so that they can accurately detect anomalies and trigger timely troubleshooting and mitigation. However, anomaly detection for these seasonal KPIs, which vary in pattern and data quality, has been a great challenge, especially without labels. In this paper, we propose Donut, an unsupervised anomaly detection algorithm based on the variational auto-encoder (VAE). Thanks to a few key techniques, Donut greatly outperforms a state-of-the-art supervised ensemble approach and a baseline VAE approach, and its best F-scores range from 0.75 to 0.9 for the studied KPIs from a top global Internet company. We also present a novel kernel density estimation (KDE) interpretation of reconstruction for Donut, making it the first VAE-based anomaly detection algorithm with a solid theoretical explanation.