Abstract
In the model of continuous distributed monitoring, a number of observers each see a stream of observations. Their goal is to work together to compute a function of the union of their observations. This can be as simple as counting the total number of observations, or more complex non-linear functions such as tracking the entropy of the induced distribution. Assuming that it is too costly to simply centralize all the observations, it becomes quite challenging to design solutions which provide a good approximation to the current answer, while bounding the communication cost of the observers, and their other resources such as their space usage. This survey introduces this model, and describe a selection results in this setting, from the simple counting problem to a variety of other functions that have been studied.
- C. Arackaparambil, S. Bratus, J. Brody, and A. Shubina. Distributed monitoring of conditional entropy for anomaly detection in streams. In IPDPS Workshops, 2010.Google ScholarCross Ref
- C. Arackaparambil, J. Brody, and A. Chakrabarti. Functional monitoring without monotonicity. In International Colloquium on Automata, Languages and Programming (ICALP), 2009. Google ScholarDigital Library
- B. Babcock and C. Olston. Distributed top-k monitoring. In ACM SIGMOD International Conference on Management of Data, 2003. Google ScholarDigital Library
- V. Braverman, R. Ostrovsky, and C. Zaniolo. Optimal sampling from sliding windows. In ACM Principles of Database Systems, 2009. Google ScholarDigital Library
- H.-L. Chan, T.-W. Lam, L.-K. Lee, and H.-F. Ting. Continuous monitoring of distributed data streams over a time-based sliding window. In Symposium on Theoretical Aspects of Computer Science (STACS), 2010.Google Scholar
- G. Cormode and M. Garofalakis. Efficient strategies for continuous distributed tracking tasks. IEEE Data Engineering Bulletin, 28(1):33--39, March 2005.Google Scholar
- G. Cormode and M. Garofalakis. Sketching streams through the net: Distributed approximate query tracking. In International Conference on Very Large Data Bases, 2005. Google ScholarDigital Library
- G. Cormode and M. Garofalakis. Streaming in a connected world: Querying and tracking distributed data streams. In ACM SIGMOD International Conference on Management of Data, 2007. Google ScholarDigital Library
- G. Cormode, M. Garofalakis, S. Muthukrishnan, and R. Rastogi. Holistic aggregates in a networked world: Distributed tracking of approximate quantiles. In ACM SIGMOD International Conference on Management of Data, 2005. Google ScholarDigital Library
- G. Cormode, S. Muthukrishnan, and K. Yi. Algorithms for distributed, functional monitoring. In ACM-SIAM Symposium on Discrete Algorithms, 2008. Google ScholarDigital Library
- G. Cormode, S. Muthukrishnan, K. Yi, and Q. Zhang. Continuous sampling from distributed streams. J. ACM, 59(2):25, 2012. Google ScholarDigital Library
- G. Cormode, S. Muthukrishnan, and W. Zhuang. What's different: Distributed, continuous monitoring of duplicate resilient aggregates on data streams. In IEEE International Conference on Data Engineering, 2006. Google ScholarDigital Library
- G. Cormode, S. Muthukrishnan, and W. Zhuang. Conquering the divide: Continuous clustering of distributed data streams. In IEEE International Conference on Data Engineering, 2007.Google ScholarCross Ref
- G. Cormode and K. Yi. Tracking distributed aggregates over time-based sliding windows. In ACM Conference on Principles of Distributed Computing (PODC), 2011. Google ScholarDigital Library
- A. Das, S. Ganguly, M. Garofalakis, and R. Rastogi. Distributed set-expression cardinality estimation. In International Conference on Very Large Data Bases, 2004. Google ScholarDigital Library
- M. Dilman and D. Raz. Efficient reactive monitoring. In IEEE INFOCOMM, 2001.Google ScholarCross Ref
- J. Feldman, S. Muthukrishnan, A. Sidiropoulos, C. Stein, and Z. Svitkina. On distributing symmetric streaming computations. In ACMSIAM Symposium on Discrete Algorithms, 2008. Google ScholarDigital Library
- S. Ganguly and B. Lakshminath. Estimating entropy over data streams. In European Symposium on Algorithms (ESA), 2006. Google ScholarDigital Library
- N. Giatrakos, A. Deligiannakis, M. N. Garofalakis, I. Sharfman, and A. Schuster. Prediction-based geometric monitoring over distributed data streams. In ACM SIGMOD International Conference on Management of Data, pages 265--276, 2012. Google ScholarDigital Library
- P. Gibbons and S. Tirthapura. Estimating simple functions on the union of data streams. In ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 281--290, 2001. Google ScholarDigital Library
- P. Gibbons and S. Tirthapura. Distributed streams algorithms for sliding windows. In ACM Symposium on Parallel Algorithms and Architectures (SPAA), 2002. Google ScholarDigital Library
- N. J. A. Harvey, J. Nelson, and K. Onak. Sketching and streaming entropy via approximation theory. In IEEE Conference on Foundations of Computer Science, 2008. Google ScholarDigital Library
- K. Heffner and G. Malecha. Design and implementation of generalized functional monitoring, 2009. http://www.people.fas.harvard.edu/¿gmalecha/proj/funkymon.pdf,Google Scholar
- L. Huang, M. N. Garofalakis, A. D. Joseph, and N. Taft. Communication-efficient tracking of distributed cumulative triggers. In ICDCS, 2007. Google ScholarDigital Library
- L. Huang, X. Nguyen, M. Garofalakis, J. Hellerstein, A. D. Joseph, M. Jordan, and N. Taft. Communication-efficient online detection of network-wide anomalies. In IEEE INFOCOMM, 2007.Google ScholarDigital Library
- A. Jain, J. Hellerstein, S. Ratnasamy, and D. Wetherall. A wakeup call for internet monitoring systems: The case for distributed triggers. In Proceedings of the 3rd Workshop on Hot Topics in Networks (Hotnets), 2004.Google Scholar
- N. Jain, M. Dahlin, Y. Zhang, D. Kit, P. Mahajan, and P. Yalagandula. STAR: Self-tuning aggregation for scalable monitoring. In International Conference on Very Large Data Bases, 2007. Google ScholarDigital Library
- R. Keralapura, G. Cormode, and J. Ramamirtham. Communication-efficient distributed monitoring of thresholded counts. In ACM SIGMOD International Conference on Management of Data, 2006. Google ScholarDigital Library
- E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University Press, 1997. Google ScholarDigital Library
- A. Lakhina, M. Crovella, and C. Diot. Mining anomalies using traffic feature distributions. In ACM SIGCOMM, 2005. Google ScholarDigital Library
- Z. Liu, B. Radunovic, and M. Vojnovic. Continuous distributed counting for non-monotonic streams. In ACM Principles of Database Systems, pages 307--318, 2012. Google ScholarDigital Library
- S. Muthukrishnan. Data streams: Algorithms and applications. In ACM-SIAM Symposium on Discrete Algorithms, 2003. Google ScholarDigital Library
- S. Muthukrishnan. Some algorithmic problems and results in compressed sensing. In Allerton Conference, 2006.Google Scholar
- C. Olston, J. Jiang, and J. Widom. Adaptive filters for continuous queries over distributed data streams. In ACM SIGMOD International Conference on Management of Data, 2003. Google ScholarDigital Library
- I. Sharfman, A. Schuster, and D. Keren. A geometric approach to monitoring threshold functions over distributed data streams. In ACM SIGMOD International Conference on Management of Data, 2006. Google ScholarDigital Library
- I. Sharfman, A. Schuster, and D. Keren. Shape sensitive geometric monitoring. In ACM Principles of Database Systems, 2008 Google ScholarDigital Library
- D. Slepian and J. Wolf. Noiseless coding of correlated information sources. IEEE Transactions on on Information Theory, 19:471--480, 1973. Google ScholarDigital Library
- S. Tirthapura and D. P. Woodruff. Optimal random sampling from distributed streams revisited. In DISC, 2011. Google ScholarDigital Library
- D. P. Woodruff and Q. Zhang. Tight bounds for distributed functional monitoring. In ACM Symposium on Theory of Computing, pages 941--960, 2012. Google ScholarDigital Library
- K. Yi and Q. Zhang. Optimal tracking of distributed heavy hitters and quantiles. In ACM Principles of Database Systems, 2009. Google ScholarDigital Library
Index Terms
- The continuous distributed monitoring model
Recommendations
Continuous distributed monitoring: a short survey
AlMoDEP '11: Proceedings of the First International Workshop on Algorithms and Models for Distributed Event ProcessingIn the model of continuous distributed monitoring, a number of observers each see a stream of observations. Their goal is to work together to compute a function of the union of their observations. This can be as simple as counting the total number of ...
Scalable, asynchronous, distributed eigen monitoring of astronomy data streams
In this paper, we develop a distributed algorithm for monitoring the principal components (PCs) for next generation of astronomy petascale data pipelines such as the Large Synoptic Survey Telescopes (LSST). This telescope will take repeated images of ...
Assessment of the orbital variations of GNSS GEO and IGSO satellites for monitoring ionospheric TEC
AbstractThe geostationary orbit (GEO) satellites provide a great opportunity to continuously monitor the earth system, which has shown a powerful application in ionospheric total electron content (TEC) studies. As the GEO satellites revolve the earth over ...
Comments