ABSTRACT
We study static analysis, in particular the containment problem, for analogs of conjunctive queries over XML documents. The problem has been studied for queries based on arbitrary patterns, not necessarily following the tree structure of documents. However, many applications force the syntactic shape of queries to be tree-like, as they are based on proper tree patterns. This renders previous results, crucially based on having non-tree-like features, inapplicable. Thus, we investigate static analysis of queries based on proper tree patterns. We go beyond simple navigational conjunctive queries in two ways: we look at unions and Boolean combinations of such queries as well and, crucially, all our queries handle data stored in documents, i.e., we deal with containment over data trees.
We start by giving a general Πp2 upper bound on the containment of conjunctive queries and Boolean combinations for patterns that involve all types of navigation through documents. We then show matching hardness for conjunctive queries with all navigation, or their Boolean combinations with the simplest form of navigation. After that we look at cases when containment can be witnessed by homomorphisms of analogs of tableaux. These include conjunctive queries and their unions over child and next-sibling axes; however, we show that not all cases of containment can be witnessed by homomorphisms. We look at extending tree patterns used in queries in three possible ways: with wildcard, with schema information, and with data value comparisons. The first one is relatively harmless, the second one tends to increase complexity by an exponential, and the last one quickly leads to undecidability.
- S. Abiteboul, B. Cautis, T. Milo. Reasoning about XML update constraints. In PODS'07, pages 195--204. Google ScholarDigital Library
- S. Abiteboul, L. Segoufin, and V. Vianu. Representing and querying XML with incomplete information. ACM TODS, 31(1):208--254, 2006. Google ScholarDigital Library
- N. Alon, T. Milo, F. Neven, D. Suciu, V. Vianu. XML with data values: typechecking revisited. JCSS 66(4): 688--727 (2003). Google ScholarDigital Library
- S. Amano, L. Libkin, F. Murlak. XML schema mappings. In PODS'09, pages 33--42. Google ScholarDigital Library
- S. Amer-Yahia, S. Cho, L. Lakshmanan, D. Srivastava. Tree pattern query minimization. VLDB J. 11(4): 315--331 (2002). Google ScholarDigital Library
- M. Arenas, P. Barceló, L. Libkin, F. Murlak. Relational and XML Data Exchange. Morgan & Claypool, 2010. Google ScholarDigital Library
- M. Arenas, W. Fan, L. Libkin. On the complexity of verifying consistency of XML specifications. SIAM J. Comput. 38(3): 841--880 (2008). Google ScholarDigital Library
- M. Arenas, L. Libkin. XML data exchange: consistency and query answering. J. ACM 55(2): (2008). Google ScholarDigital Library
- P. Barceló, L. Libkin, A. Poggi, C. Sirangelo. XML with incomplete information. J. ACM, 58:1 (2010). Google ScholarDigital Library
- H. Björklund, W. Martens, T. Schwentick. Optimizing conjunctive queries over trees using schema information. MFCS'08, pages 132--143. Google ScholarDigital Library
- H. Björklund, W. Martens, and T. Schwentick. Conjunctive query containment over trees. JCSS 77(3): 450--472 (2011). Google ScholarDigital Library
- H. Björklund, W. Martens, T. Schwentick. Validity of tree pattern queries with respect to schema information. Unpublished manuscript.Google Scholar
- M. Bojanczyk, C. David, A. Muscholl, T. Schwentick, L. Segoufin. Two-variable logic on words with data. In LICS'06, pages 7--16. Google ScholarDigital Library
- D. Calvanese, G. De Giacomo, M. Lenzerini, and M. Y. Vardi. Regular XPath: constraints, query containment and view-based answering for XML documents. In LID'08.Google Scholar
- A. Chandra and P. Merlin. Optimal implementation of conjunctive queries in relational data bases. In STOC 1977, pages 77--90. Google ScholarDigital Library
- C. David. Complexity of data tree patterns over XML documents. MFCS'08, pages 278--289. Google ScholarDigital Library
- W. Fan, L. Libkin. On XML integrity constraints in the presence of DTDs. J. ACM 49(3): 368--406 (2002). Google ScholarDigital Library
- D. Figueira. Satisfiability of downward XPath with data equality tests. PODS'09, 197--206. Google ScholarDigital Library
- P. Genevés and N. Layaida. A system for the static analysis of XPath. ACM TOIS 24 (2006), 475--502. Google ScholarDigital Library
- A. Gheerbrant, L. Libkin, and T. Tan. On the complexity of query answering over incomplete XML documents. ICDT 2012, 169--181. Google ScholarDigital Library
- G. Gottlob, C. Koch, K. Schulz. Conjunctive queries over trees. J. ACM 53 (2006), 238--272. Google ScholarDigital Library
- T. J. Green. Containment of conjunctive queries on annotated relations. Theory Comput. Syst. 49(2): 429--459 (2011). Google ScholarDigital Library
- A. Halevy. Answering queries using views: A survey. VLDB J. 10(4):270--294 (2001). Google ScholarDigital Library
- B. Kimelfeld, Y. Sagiv. Revisiting redundancy and minimization in an XPath fragment. EDBT'08, pages 61--72. Google ScholarDigital Library
- A. Klug. On conjunctive queries containing inequalities. J. ACM 35(1): 146--160 (1988). Google ScholarDigital Library
- P. Kolaitis, D. Martin, M. Thakur. On the complexity of the containment problem for conjunctive queries with built-in predicates. In PODS 1998, pages 197--204. Google ScholarDigital Library
- M. Lenzerini. Data integration: a theoretical perspective. In PODS'02, pages 233--246. Google ScholarDigital Library
- L. Libkin, C. Sirangelo. Reasoning about XML with temporal logics and automata. J. Applied Logic, 8:2, 210--232 (2010).Google ScholarCross Ref
- W. Martens, F. Neven, T. Schwentick. Simple off the shelf abstractions for XML schema. SIGMOD Record 36(3): 15--22 (2007). Google ScholarDigital Library
- S. Maneth, T. Perst, H. Seidl. Exact XML type checking in polynomial time. In ICDT 2007, pages 254--268. Google ScholarDigital Library
- G. Miklau and D. Suciu. Containment and equivalence for a fragment of XPath. J. ACM, 51(1): 2--45, 2004. Google ScholarDigital Library
- F. Neven. Automata, logic, and XML. In CSL 2002, pages 2--26. Google ScholarDigital Library
- F. Neven, T. Schwentick. On the complexity of XPath containment in the presence of disjunction, DTDs, and variables. LMCS, 2(3): (2006).Google Scholar
- Y. Sagiv, M. Yannakakis. Equivalences among relational expressions with the union and difference operators. J. ACM 27(4): 633--655 (1980). Google ScholarDigital Library
- Th. Schwentick. XPath query containment. SIGMOD Record 33(1): 101--109 (2004). Google ScholarDigital Library
- J. W. Thatcher. Characterizing derivation trees of context-free grammars through a generalization of finite automata theory. JCSS 1 (1967), 317--322. Google ScholarDigital Library
- R. van der Meyden. The complexity of querying indefinite data about linearly ordered domains. JCSS 54(1): 113--135 (1997). Google ScholarDigital Library
- V. Vianu. A web Odyssey: from Codd to XML. In PODS'01, pages 1--15. Google ScholarDigital Library
Index Terms
- Containment of pattern-based queries over data trees
Recommendations
Decidable containment of recursive queries
Database theoryOne of the most important reasoning tasks on queries is checking containment, i.e., verifying whether one query yields necessarily a subset of the result of another one. Query containment is crucial in several contexts, such as query optimization, query ...
Containment of nested XML queries
VLDB '04: Proceedings of the Thirtieth international conference on Very large data bases - Volume 30Query containment is the most fundamental relationship between a pair of database queries: a query Q is said to be contained in a query Q′ if the answer for Q is always a subset of the answer for Q′, independent of the current state of the database. ...
Decidable Containment of Recursive Queries
ICDT '03: Proceedings of the 9th International Conference on Database TheoryOne of the most important reasoning tasks on queries is checking containment, i.e., verifying whether one query yields necessarily a subset of the result of another one. Query containment, is crucial in several contexts, such as query optimization, ...
Comments