Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications

Authors:

Haowen Xu, Tsinghua University, Beijing, China
Wenxiao Chen, Tsinghua University, Beijing, China
Nengwen Zhao, Tsinghua University, Beijing, China
Zeyan Li, Tsinghua University, Beijing, China
Jiahao Bu, Tsinghua University, Beijing, China
Zhihan Li, Tsinghua University, Beijing, China
Ying Liu, Tsinghua University, Beijing, China
Youjian Zhao, Tsinghua University, Beijing, China
Dan Pei, Tsinghua University, Beijing, China
Yang Feng, Alibaba Group, Hangzhou, China
Jie Chen, Alibaba Group, Hangzhou, China
Zhaogang Wang, Alibaba Group, Hangzhou, China
Honglin Qiao, Alibaba Group, Hangzhou, China

WWW '18: Proceedings of the 2018 World Wide Web Conference, April 2018, Pages 187–196. https://doi.org/10.1145/3178876.3185996

Published: 23 April 2018

ABSTRACT

To ensure undisrupted business, large Internet companies need to closely monitor various KPIs (e.g., page views, number of online users, and number of orders) of their Web applications, to accurately detect anomalies and trigger timely troubleshooting and mitigation. However, anomaly detection for these seasonal KPIs with various patterns and data quality has been a great challenge, especially without labels. In this paper, we propose Donut, an unsupervised anomaly detection algorithm based on the variational auto-encoder (VAE). Thanks to a few key techniques, Donut greatly outperforms a state-of-the-art supervised ensemble approach and a baseline VAE approach, and its best F-scores range from 0.75 to 0.9 for the studied KPIs from a top global Internet company. We also present a novel kernel density estimation (KDE) interpretation of reconstruction for Donut, making it the first VAE-based anomaly detection algorithm with a solid theoretical explanation.

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Anomaly detection
2. Information systems
  1. World Wide Web
    1. Web mining
      1. Traffic analysis

Published in
WWW '18: Proceedings of the 2018 World Wide Web Conference
April 2018
2000 pages
ISBN: 9781450356398
General Chairs:
Pierre-Antoine Champin, Université Claude Bernard Lyon 1, France
Fabien Gandon, Inria, Université Côte d'Azur, CNRS, I3S, France
Lionel Médini, Université Claude Bernard Lyon 1, France
Program Chairs:
Mounia Lalmas, Spotify, UK
Panagiotis G. Ipeirotis, New York University, USA
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland
Author Tags
anomaly detection
seasonal KPI
variational auto-encoder
Qualifiers
- research-article
Acceptance Rates
WWW '18 Paper Acceptance Rate: 170 of 1,155 submissions, 15%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%