ABSTRACT
Detecting bursts in data streams is an important and challenging task. Because of its complexity, burst detection usually cannot be formulated using standard query operators. We therefore show how to integrate burst detection for stationary as well as non-stationary data into query formulation and processing, from the language level down to the operator level. We then present the fundamentals of threshold-based burst detection, focusing on the applicability of time series forecasting techniques for dynamically identifying suitable thresholds for stream data containing arbitrary trends and periods. The proposed approach is evaluated with respect to quality and performance on synthetic and real-world sensor data using a full-fledged DSMS.
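To make the idea of a forecast-driven threshold concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes Holt's double exponential smoothing (level plus trend, no seasonal component) as the forecaster and flags a value as a burst when it exceeds the one-step forecast by more than `k` times the root mean square of past forecast errors. The function name `detect_bursts` and the parameters `alpha`, `beta`, `k`, and `warmup` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def detect_bursts(values, alpha=0.5, beta=0.3, k=3.0, warmup=5):
    """Return indices whose value exceeds a forecast-based threshold.

    Assumption: Holt's double exponential smoothing stands in for the
    forecasting technique; the threshold is forecast + k * RMS of the
    past one-step forecast errors.
    """
    if len(values) < 3:
        return []

    # Initialise level and trend from the first two observations.
    level = values[1]
    trend = values[1] - values[0]

    sum_sq = 0.0  # running sum of squared forecast errors
    n = 0         # number of errors accumulated so far
    bursts = []

    for i in range(2, len(values)):
        x = values[i]
        forecast = level + trend          # one-step-ahead forecast
        residual = x - forecast
        rms = sqrt(sum_sq / n) if n else 0.0

        # Only test once enough errors have been seen to estimate spread.
        if n >= warmup and residual > k * rms:
            bursts.append(i)
            # Keep the burst out of the model: advance along the trend
            # instead of smoothing towards the outlying value.
            level = forecast
            continue

        # Regular Holt update with the new observation.
        new_level = alpha * x + (1 - alpha) * forecast
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        sum_sq += residual * residual
        n += 1

    return bursts
```

Because the threshold follows the forecast, a steadily rising series does not trigger false alarms the way a fixed threshold would. Calling `detect_bursts` on a noisy linear ramp with a single injected spike should report only the spike's index. Data with periodic behaviour would need a seasonal forecaster, which this sketch deliberately omits.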