skip to main content
research-article

On the minimization of XPath queries

Published: 22 February 2008 Publication History

Abstract

XPath expressions define navigational queries on XML data and are issued on XML documents to select sets of element nodes. Due to the wide use of XPath, which is embedded into several languages for querying and manipulating XML data, the problem of efficiently answering XPath queries has received increasing attention from the research community. As the efficiency of computing the answer of an XPath query depends on its size, replacing XPath expressions with equivalent ones having the smallest size is a crucial issue in this direction. This article investigates the minimization problem for a wide fragment of XPath (namely X P[✶]), where the use of the most common operators (child, descendant, wildcard and branching) is allowed with some syntactic restrictions. The examined fragment consists of expressions which have not been specifically studied in the relational setting before: neither are they mere conjunctive queries (as the combination of “//” and “*” enables an implicit form of disjunction to be expressed) nor do they coincide with disjunctive ones (as the latter are more expressive). Three main contributions are provided. The “global minimality” property is shown to hold: the minimization of a given XPath expression can be accomplished by removing pieces of the expression, without having to re-formulate it (as for “general” disjunctive queries). Then, the complexity of the minimization problem is characterized, showing that it is the same as the containment problem. Finally, specific forms of XPath expressions are identified, which can be minimized in polynomial time.

References

[1]
Abiteboul, S. 1989. Boundedness is undecidable for datalog programs with a single recursive rule. Inf. Process. Lett. 32, 6, 281--287.
[2]
Amer-Yahia, S., Cho, S., Lakshmanan, L. K. S., and Srivastava, D. 2002. Tree pattern query minimization. VLDB J. 11, 4, 315--331.
[3]
Benedikt, M., Fan, W., and Kuper, G. M. 2003. Structural properties of XPath fragments. In Proceedings of the 9th International Conference on Database Theory (ICDT), Siena, Italy. 79--95.
[4]
Chan, C. Y., Fan, W., and Zeng, Y. 2004. Taming XPath queries by minimizing wildcard steps. In Proceedings of the 30th International Conference on Very Large Data Bases (VLDB), Toronto, Ont., Canada. ACM, New York, 156--167.
[5]
Chandra, A. K., and Merlin, P. M. 1977. Optimal implementation of conjunctive queries in relational databases. In Proceedings of the 9th ACM Symposium on Theory of Computing (STOC), (New York, NY). ACM, New York, 77--90.
[6]
Chekuri, C., and Rajaraman, A. 1997. Conjunctive query containment revisited. In Proceedings of the 6th International Conference on Database Theory (ICDT), Delphi, Greece. 56--70.
[7]
Deutsch, A., and Tannen, V. 2001. Containment and Integrity constraints for XPath fragments. In Proceedings of the 8th International Workshop on Knowledge Representation Meets Databases (KRDB), Rome, Italy.
[8]
Gottlob, G., Koch, C., and Pichler, R. 2002. Efficient algorithms for processing XPath queries. In Proceedings of the 28th International Conference on Very Large Databases (VLDB), Hong Kong, China. ACM, New York, 95--106.
[9]
Kimelfeld, B., and Sagiv, Y. 2008. Revisiting redundancy and minimization in an XPath fragment. In Proceedings of the 11th International Conference on Extending Database Technology (EDBT) (Nantes, France). To appear.
[10]
Kolaitis, P. G., and Vardi, M. 2000. Conjunctive-query containment and constraint satisfaction. J. Comput. Syst. Sci. 61, 302--332.
[11]
Miklau, G., and Suciu, D. 2004. Containment and Equivalence for an XPath Fragment. J. ACM 51, 1, 2--45.
[12]
Milo, T., and Suciu, D. 1999. Index structures for path expressions. In Proceedings of the 7th International Conference on Database Theory (ICDT) (Jerusalem, Israel). 277--295.
[13]
Neven, F., and Schwentick, T. 2003. XPath containment in the presence of disjunction, DTDs, and variables. In Proceedings of the 9th International Conference on Database Theory (ICDT) (Siena, Italy). 315--329.
[14]
Ramanan, P. 2002. Efficient algorithms for minimizing tree pattern queries. In Proceedings of the of ACM SIGMOD Conference on Management of Data (SIGMOD) (Madison, WI). ACM, New York, 299--309.
[15]
Sagiv, Y., and Yannakakis, M. 1980. Equivalence between relational expressions with the union and difference operators. J. ACM 27, 4, 633--655.
[16]
Schwentick, T. 2004. XPath query containment. ACM SIGMOD Record 33, 1, 101--109.
[17]
Shmueli, O. 1993. Equivalence of datalog queries is undecidable. J. Logic Program. 15, 3, 231--241.
[18]
Wood, P. T. 2000. On the equivalence of XML patterns. In Proceedings of the 1st International Conference on Computational Logic (CL) (London, UK), 1152--1166.
[19]
Wood, P. T. 2001. Minimizing simple XPath expressions. In Proceedings of the 4th International Workshop on the Web and Databases (WebDB) (Santa Barbara, CA). 13--18.
[20]
Wood, P. T. 2003. Containment for XPath fragments under DTD constraints. In Proceedings of the 9th International Conference on Database Theory (ICDT) (Siena, Italy), 300--314.
[21]
XLi. 2004. XML Linking Language (XLink) Version 1.0---W3C Recommendation. In http://www.w3.org/TR/2001/REC-xlink-20010627/.
[22]
XML. 2004. Extensible Markup Language (XML) 1.0 (Third Edition)---W3C Recommendation. In http://www.w3c.org/TR/2004/REC-xml-20040204/.
[23]
XPa. 1999. XML Path Language (XPath) Version 1.0---W3C Recommendation. In http://www.w3.org/TR/xpath.
[24]
XPo. 2002. XML Pointer Language (XPointer)---W3C Working Draft. In http://www.w3.org/TR/2002/WD-xptr-20020816/.
[25]
XQu. 2004. XQuery 1.0: An XML Query Language---W3C Working Draft. In http://www.w3.org/TR/xquery/.
[26]
XSc. 2001. XML Schema Part 0: Primer---W3C Recommendation. In http://www.w3.org/TR/xmlschema-0/.
[27]
XSLT. 1999. XSL Transformations (XSLT) Version 1.0---W3C Recommendation. In http://www. w3.org/TR/xslt.
[28]
Yannakakis, M. 1981. Algorithms for acyclic database schemes. In Proceedings of the 7th International Conference on Very Large Databases (VLDB) (Cannes, France), ACM, New York, 82--94.

Cited By

View all
  • (2023)Solving the SPARQL query containment problem with SpeCSWeb Semantics: Science, Services and Agents on the World Wide Web10.1016/j.websem.2022.10077076:COnline publication date: 1-Apr-2023
  • (2023)Sentimental Analysis of COVID-19 Vaccine Tweets Using BERT+NBSVMMachine Learning and Principles and Practice of Knowledge Discovery in Databases10.1007/978-3-031-23618-1_16(238-247)Online publication date: 31-Jan-2023
  • (2022)Applications of Majority Judgement for Winner Selection in Eurovision Song ContestProceedings of the 26th International Database Engineered Applications Symposium10.1145/3548785.3548791(113-119)Online publication date: 22-Aug-2022
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Journal of the ACM
Journal of the ACM  Volume 55, Issue 1
February 2008
158 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/1326554
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 February 2008
Accepted: 01 November 2007
Revised: 01 June 2006
Received: 01 October 2004
Published in JACM Volume 55, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Query containment
  2. XPath expressions
  3. query minimization
  4. tree pattern matching

Qualifiers

  • Research-article
  • Research
  • Pre-selected

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)12
  • Downloads (Last 6 weeks)0
Reflects downloads up to 18 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2023)Solving the SPARQL query containment problem with SpeCSWeb Semantics: Science, Services and Agents on the World Wide Web10.1016/j.websem.2022.10077076:COnline publication date: 1-Apr-2023
  • (2023)Sentimental Analysis of COVID-19 Vaccine Tweets Using BERT+NBSVMMachine Learning and Principles and Practice of Knowledge Discovery in Databases10.1007/978-3-031-23618-1_16(238-247)Online publication date: 31-Jan-2023
  • (2022)Applications of Majority Judgement for Winner Selection in Eurovision Song ContestProceedings of the 26th International Database Engineered Applications Symposium10.1145/3548785.3548791(113-119)Online publication date: 22-Aug-2022
  • (2022)Learning Disjunctive Multiplicity Expressions and Disjunctive Generalize Multiplicity Expressions From Both Positive and Negative ExamplesThe Computer Journal10.1093/comjnl/bxac03766:7(1733-1748)Online publication date: 18-Apr-2022
  • (2022)Human sentiments monitoring during COVID-19 using AI-based modelingProcedia Computer Science10.1016/j.procs.2022.07.112203:C(753-758)Online publication date: 1-Jan-2022
  • (2018)Minimization of Tree PatternsJournal of the ACM10.1145/318028165:4(1-46)Online publication date: 25-Jul-2018
  • (2018)An Efficient Approach for Monitoring and Analyzing Real-Time ARINC 661 Events2018 IEEE/AIAA 37th Digital Avionics Systems Conference (DASC)10.1109/DASC.2018.8569329(1-5)Online publication date: Sep-2018
  • (2018)XMinWorld Wide Web10.1007/s11280-010-0089-x13:3(343-371)Online publication date: 25-Dec-2018
  • (2017)Optimizing Tree Patterns for Querying Graph- and Tree-Structured DataACM SIGMOD Record10.1145/3093754.309375946:1(15-22)Online publication date: 12-May-2017
  • (2017)Technical PerspectiveACM SIGMOD Record10.1145/3093754.309375846:1(14-14)Online publication date: 12-May-2017
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media