skip to main content
10.1145/2213836.2213866acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

High-performance complex event processing over XML streams

Authors Info & Claims
Published:20 May 2012Publication History

ABSTRACT

Much research attention has been given to delivering high-performance systems that are capable of complex event processing (CEP) in a wide range of applications. However, many current CEP systems focus on processing efficiently data having a simple structure, and are otherwise limited in their ability to support efficiently complex continuous queries on structured or semi-structured information. However, XML streams represent a very popular form of data exchange, comprising large portions of social network and RSS feeds, financial records, configuration files, and similar applications requiring advanced CEP queries. In this paper, we present the XSeq language and system that support CEP on XML streams, via an extension of XPath that is both powerful and amenable to an efficient implementation. Specifically, the XSeq language extends XPath with natural operators to express sequential and Kleene-* patterns over XML streams, while remaining highly amenable to efficient implementation. XSeq is designed to take full advantage of recent advances in the field of automata on Visibly Pushdown Automata (VPA), where higher expressive power can be achieved without compromising efficiency (whereas the amenability to efficient implementation was not demonstrated in XPath extensions previously proposed).

We illustrate XSeq's power for CEP applications through examples from different domains, and provide formal results on its expressiveness and complexity. Finally, we present several optimization techniques for XSeq queries. Our extensive experiments indicate that XSeq brings outstanding performance to CEP applications: two orders of magnitude improvement are obtained over the same queries executed in general-purpose XML engines.

References

  1. M. Alexander, J. Fawcett, and P. Runciman. Nursing practice: hospital and home: the adult. Churchill Livingstone; 2nd edition, 2000.Google ScholarGoogle Scholar
  2. R. Alur and P. Madhusudan. Visibly pushdown languages. In STOC, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. R. Alur and P. Madhusudan. Adding nesting structure to words. In Developments in Language Theory, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Bamford and et. al. Xquery reloaded. VLDB, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C. Barton and et. al. Streaming xpath processing with forward and backward axes. In ICDE, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  6. P. Boncz and et. al. Monetdb/xquery: a fast xquery processor powered by a relational engine. In SIGMOD, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. L. Brenna, J. Gehrke, M. Hong, and D. Johansen. Distributed event stream processing with non-deterministic finite automata. In DEBS, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. B. t. Cate and C. Lutz. The complexity of query containment in expressive fragments of xpath 2.0. J. ACM, 56(6), 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Y. Chen, S. B. Davidson, and Y. Zheng. An efficient xpath query processor for xml streams. In ICDE, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Y. Diao, M. Altinel, M. J. Franklin, H. Zhang, and P. Fischer. Path sharing and predicate evaluation for high-performance xml filtering. TODS, 28(4), 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. Florescu and et. al. The bea/xqrl streaming xquery processor. In VLDB, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. T. Furche, G. Gottlob, G. Grasso, C. Schallhart, and A. J. Sellers. Oxpath: A language for scalable, memory-efficient data extraction from web applications. PVLDB, 4(11), 2011.Google ScholarGoogle Scholar
  13. V. Josifovski, M. Fontoura, and A. Barta. Querying xml streams. VLDB Journal, 14(2), 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Kay. Ten reasons why saxon xquery is fast. IEEE Data Eng. Bull., 31(4), 2008.Google ScholarGoogle Scholar
  15. D. E. Knuth, J. H. M. Jr., and V. R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(2), 1977.Google ScholarGoogle Scholar
  16. C. Koch. Xml stream processing. In Encyclopedia of Database Systems. 2009.Google ScholarGoogle Scholar
  17. N. Laptev, H. Mousavi, A. Shkapsky, and C. Zaniolo. Optimizing Regular Expression Clustering for Massive Pattern Search. Technical report, UCLA, 2012.Google ScholarGoogle Scholar
  18. D. C. Luckham. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems. Addison-Wesley, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. P. Madhusudan and M. Viswanathan. Query automata for nested words. In MFCS, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. Marx. Conditional xpath. TODS, 30(4), 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. B. Mozafari, K. Zeng, and C. Zaniolo. From regular expressions to nested words: Unifying languages and query execution for relational and xml sequences. PVLDB, 3(1), 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. B. Mozafari, K. Zeng, and C. Zaniolo. K*sql: A unifying engine for sequence patterns and xml. In SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. B. Mozafari, K. Zeng, and C. Zaniolo. High-performance complex event processing over xml streams. Technical report, UCLA, 2012.Google ScholarGoogle Scholar
  24. D. Olteanu, T. Kiesling, and F. Bry. An evaluation of regular path expressions with qualifiers against xml streams. In ICDE, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  25. F. Peng and S. S. Chawathe. Xpath queries on streaming data. In SIGMOD, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. C. Pitcher. Visibly pushdown expression effects for xml stream processing. In PLAN-X, 2005.Google ScholarGoogle Scholar
  27. A. Schmidt and et. al. Xmark: a benchmark for xml data management. In VLDB, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. R. T. Snodgrass. Tsql2. In Encyclopedia of Database Systems. 2009.Google ScholarGoogle Scholar
  29. L. Ströäand S. Schmidt. An extension of xquery for graph analysis of biological pathways. In DBKDA, 2009.Google ScholarGoogle Scholar
  30. N. V. Tang. A tighter bound for the determinization of visibly pushdown automata. In INFINITY, 2009.Google ScholarGoogle Scholar
  31. B. ten Cate. The expressivity of xpath with transitive closure. In PODS, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. B. ten Cate and M. Marx. Axiomatizing the logical core of xpath 2.0. In ICDT, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. B. ten Cate and M. Marx. Navigational xpath: calculus and algebra. SIGMOD Record, 36(2), 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. B. ten Cate and L. Segoufin. Xpath, transitive closure logic, and nested tree walking automata. In PODS, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Z. Vagena, M. M. Moro, and V. J. Tsotras. Roxsum: Leveraging data aggregation and batch processing for xml routing. In ICDE, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  36. E. Wu, Y. Diao, and S. Rizvi. High-performance complex event processing over streams. In SIGMOD, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. High-performance complex event processing over XML streams

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '12: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
      May 2012
      886 pages
      ISBN:9781450312479
      DOI:10.1145/2213836

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 20 May 2012

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      SIGMOD '12 Paper Acceptance Rate48of289submissions,17%Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader