skip to main content
10.1145/1966357.1966363acmotherconferencesArticle/Chapter ViewAbstractPublication PageslidConference Proceedingsconference-collections
research-article

The logic of query languages for data streams

Published:25 March 2011Publication History

ABSTRACT

Data Stream Management Systems (DSMS) represent a vibrant research area that is rich in technical challenges, which many projects have approached by extending database query languages and models for continuous queries on data streams [1, 3, 4, 9, 5]. These database-inspired approaches have delivered remarkable systems and applications, but have yet to produce solid conceptual foundations for DSMS data models and query languages---particularly if we compare with the extraordinary ones of relational databases. A cornerstone of the success of relational databases was the tight coupling between their data model and their logic-based query languages. In this paper, we show that a similar approach can succeed for data streams and propose a tight-coupled design for DSMS data models and query languages. To express more naturally the behavior of a data stream and attain more powerful on-line queries, we abandon the set-of-tuples model of relational databases, and instead use sequences of tuples ordered by their time-stamps as our data stream model. This approach allows us to overcome the blocking problem that severely impairs the expressive power of data stream query languages. As elucidated in [1]: A blocking query operator is one that cannot produce the first tuple of the output until it has seen the entire input. Previous work had characterized blocking query operators by their non-monotonic behavior [7, 6, 8]. In this paper, we instead use the closed-world assumption [11, 10] to characterize blocking/nonblocking behaviors with respect to the incompleteness/completeness of the streaming database. From this, we infer simple syntactic conditions that make Datalog rules immune from blocking. A significant and surprising new result is that the use of negated goals in the bodies of rules does not imply a blocking behavior: in fact, many very useful nonblocking queries can be expressed using negation. The flip side of this exciting result is that additional conditions must then be imposed on the rules to ensure that (i) the results produced by Datalog programs are ordered according to their time-stamps, and (ii) possible time-skews between streams are also managed explicitly by the rules [2]. These problems, and their possible remedies, are captured and expressed quite naturally using Datalog, which thus emerges as a powerful framework for analyzing and expressing continuous queries. Related problems, including the treatment of data streams without time-stamps, the characterization of monotonic query operators [7, 6, 8], and the use of more general closed-world assumptions were also studied and answered in the course of this research [12].

References

  1. B. Babcock, S. Babu, M. Datar, R. Motawani, and J. Widom. Models and issues in data stream systems. In PODS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Y. Bai, H. Thakkar, H. Wang, and C. Zaniolo. Timestamp management and query execution models in data stream management systems. IEEE Internet Computing, 12(6):13--21, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Chandrasekaran and M. Franklin. Streaming queries over streaming data. In VLDB, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. C. Cranor, Y. Gao, T. Johnson, V. Shkapenyuk, and O. Spatscheck. Gigascope: High performance network monitoring with an SQL interface. In SIGMOD, page 623. ACM Press, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. L. Golab and M. T. Özsu. Issues in data stream management. ACM SIGMOD Record, 32(2):5--14, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Y. Gurevich, D. Leinders, and J. V. den Bussche. A theory of stream queries. In Database Programming Languages, 11th International Symposium, DBPL 2007, pages 153--168. Springer, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Y.-N. Law, H. Wang, and C. Zaniolo. Data models and query language for data streams. In VLDB, pages 492--503, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Y.-N. Law, H. Wang, and C. Zaniolo. Relational languages and data models for continuous queries on sequences and data streams. ACM Transactions on Database Systems-to appear, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. S. Madden, M. A. Shah, J. M. Hellerstein, and V. Raman. Continuously adaptive continuous queries over streams. In SIGMOD, pages 49--61, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. A. Rajasekar, J. Lobo, and J. Minker. Weak generalized closed world assumption. J. Autom. Reasoning, 5(3):293--307, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. R. Reiter. Deductive question-answering on relational data bases. In H. Gallaire and J. Minker, editors, Logic and Data Bases, Symposium on Logic and Data Bases, Toulouse, 1977, pages 149--177, 1977.Google ScholarGoogle Scholar
  12. C. Zaniolo. Bringing logic and order to data stream management systems. UCLA Technical Report, 2011.Google ScholarGoogle Scholar

Index Terms

  1. The logic of query languages for data streams

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in
            • Published in

              cover image ACM Other conferences
              LID '11: Proceedings of the 4th International Workshop on Logic in Databases
              March 2011
              63 pages
              ISBN:9781450306096
              DOI:10.1145/1966357

              Copyright © 2011 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 25 March 2011

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • research-article

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader