ABSTRACT
Much research attention has been given to delivering high-performance systems that are capable of complex event processing (CEP) in a wide range of applications. However, many current CEP systems focus on processing efficiently data having a simple structure, and are otherwise limited in their ability to support efficiently complex continuous queries on structured or semi-structured information. However, XML streams represent a very popular form of data exchange, comprising large portions of social network and RSS feeds, financial records, configuration files, and similar applications requiring advanced CEP queries. In this paper, we present the XSeq language and system that support CEP on XML streams, via an extension of XPath that is both powerful and amenable to an efficient implementation. Specifically, the XSeq language extends XPath with natural operators to express sequential and Kleene-* patterns over XML streams, while remaining highly amenable to efficient implementation. XSeq is designed to take full advantage of recent advances in the field of automata on Visibly Pushdown Automata (VPA), where higher expressive power can be achieved without compromising efficiency (whereas the amenability to efficient implementation was not demonstrated in XPath extensions previously proposed).
We illustrate XSeq's power for CEP applications through examples from different domains, and provide formal results on its expressiveness and complexity. Finally, we present several optimization techniques for XSeq queries. Our extensive experiments indicate that XSeq brings outstanding performance to CEP applications: two orders of magnitude improvement are obtained over the same queries executed in general-purpose XML engines.
- M. Alexander, J. Fawcett, and P. Runciman. Nursing practice: hospital and home: the adult. Churchill Livingstone; 2nd edition, 2000.Google Scholar
- R. Alur and P. Madhusudan. Visibly pushdown languages. In STOC, 2004. Google ScholarDigital Library
- R. Alur and P. Madhusudan. Adding nesting structure to words. In Developments in Language Theory, 2006. Google ScholarDigital Library
- R. Bamford and et. al. Xquery reloaded. VLDB, 2009. Google ScholarDigital Library
- C. Barton and et. al. Streaming xpath processing with forward and backward axes. In ICDE, 2003.Google ScholarCross Ref
- P. Boncz and et. al. Monetdb/xquery: a fast xquery processor powered by a relational engine. In SIGMOD, 2006. Google ScholarDigital Library
- L. Brenna, J. Gehrke, M. Hong, and D. Johansen. Distributed event stream processing with non-deterministic finite automata. In DEBS, 2009. Google ScholarDigital Library
- B. t. Cate and C. Lutz. The complexity of query containment in expressive fragments of xpath 2.0. J. ACM, 56(6), 2009. Google ScholarDigital Library
- Y. Chen, S. B. Davidson, and Y. Zheng. An efficient xpath query processor for xml streams. In ICDE, 2006. Google ScholarDigital Library
- Y. Diao, M. Altinel, M. J. Franklin, H. Zhang, and P. Fischer. Path sharing and predicate evaluation for high-performance xml filtering. TODS, 28(4), 2003. Google ScholarDigital Library
- D. Florescu and et. al. The bea/xqrl streaming xquery processor. In VLDB, 2003. Google ScholarDigital Library
- T. Furche, G. Gottlob, G. Grasso, C. Schallhart, and A. J. Sellers. Oxpath: A language for scalable, memory-efficient data extraction from web applications. PVLDB, 4(11), 2011.Google Scholar
- V. Josifovski, M. Fontoura, and A. Barta. Querying xml streams. VLDB Journal, 14(2), 2005. Google ScholarDigital Library
- M. Kay. Ten reasons why saxon xquery is fast. IEEE Data Eng. Bull., 31(4), 2008.Google Scholar
- D. E. Knuth, J. H. M. Jr., and V. R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(2), 1977.Google Scholar
- C. Koch. Xml stream processing. In Encyclopedia of Database Systems. 2009.Google Scholar
- N. Laptev, H. Mousavi, A. Shkapsky, and C. Zaniolo. Optimizing Regular Expression Clustering for Massive Pattern Search. Technical report, UCLA, 2012.Google Scholar
- D. C. Luckham. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems. Addison-Wesley, 2001. Google ScholarDigital Library
- P. Madhusudan and M. Viswanathan. Query automata for nested words. In MFCS, 2009. Google ScholarDigital Library
- M. Marx. Conditional xpath. TODS, 30(4), 2005. Google ScholarDigital Library
- B. Mozafari, K. Zeng, and C. Zaniolo. From regular expressions to nested words: Unifying languages and query execution for relational and xml sequences. PVLDB, 3(1), 2010. Google ScholarDigital Library
- B. Mozafari, K. Zeng, and C. Zaniolo. K*sql: A unifying engine for sequence patterns and xml. In SIGMOD, 2010. Google ScholarDigital Library
- B. Mozafari, K. Zeng, and C. Zaniolo. High-performance complex event processing over xml streams. Technical report, UCLA, 2012.Google Scholar
- D. Olteanu, T. Kiesling, and F. Bry. An evaluation of regular path expressions with qualifiers against xml streams. In ICDE, 2003.Google ScholarCross Ref
- F. Peng and S. S. Chawathe. Xpath queries on streaming data. In SIGMOD, 2003. Google ScholarDigital Library
- C. Pitcher. Visibly pushdown expression effects for xml stream processing. In PLAN-X, 2005.Google Scholar
- A. Schmidt and et. al. Xmark: a benchmark for xml data management. In VLDB, 2002. Google ScholarDigital Library
- R. T. Snodgrass. Tsql2. In Encyclopedia of Database Systems. 2009.Google Scholar
- L. Ströäand S. Schmidt. An extension of xquery for graph analysis of biological pathways. In DBKDA, 2009.Google Scholar
- N. V. Tang. A tighter bound for the determinization of visibly pushdown automata. In INFINITY, 2009.Google Scholar
- B. ten Cate. The expressivity of xpath with transitive closure. In PODS, 2006. Google ScholarDigital Library
- B. ten Cate and M. Marx. Axiomatizing the logical core of xpath 2.0. In ICDT, 2007. Google ScholarDigital Library
- B. ten Cate and M. Marx. Navigational xpath: calculus and algebra. SIGMOD Record, 36(2), 2007. Google ScholarDigital Library
- B. ten Cate and L. Segoufin. Xpath, transitive closure logic, and nested tree walking automata. In PODS, 2008. Google ScholarDigital Library
- Z. Vagena, M. M. Moro, and V. J. Tsotras. Roxsum: Leveraging data aggregation and batch processing for xml routing. In ICDE, 2007.Google ScholarCross Ref
- E. Wu, Y. Diao, and S. Rizvi. High-performance complex event processing over streams. In SIGMOD, 2006. Google ScholarDigital Library
Index Terms
- High-performance complex event processing over XML streams
Recommendations
High-performance complex event processing over hierarchical data
Invited papers issueWhile Complex Event Processing (CEP) constitutes a considerable portion of the so-called Big Data analytics, current CEP systems can only process data having a simple structure, and are otherwise limited in their ability to efficiently support complex ...
Method of Complex Event Processing over XML Streams
ExploreDB '15: Proceedings of the Second International Workshop on Exploratory Search in Databases and the WebThis paper describes a query processing engine for multiple continuous XML data streams with correlated data as a notification mechanism for navigating data exploration. Stream processing, including formal models for stream filtering, union, activation, ...
Comments