skip to main content
10.1145/1097047.1097054acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
Article

Exploiting native XML indexing techniques for XML retrieval in relational database systems

Published:04 November 2005Publication History

ABSTRACT

In XML retrieval, two distinct approaches have been established and pursued without much cross-fertilization taking place so far. On the one hand, native XML databases tailored to the semistructured data model have received considerable attention, and a wealth of index structures, join algorithms, tree encodings and query rewriting techniques for XML have been proposed. On the other hand, the question how to make XML fit the relational data model has been studied in great detail, giving rise to a multitude of storage schemes for XML in relational database systems (RDBSs). In this paper we examine how native XML indexing techniques can boost the retrieval of XML stored in an RDBS. We present the Relational CADG (RCADG), an adaptation of several native indexing approaches to the relational model, and show how it supports the evaluation of a clean formal language of conjunctive XML queries. Unlike relational storage schemes for XML, the RCADG largely preserves the underlying tree structure of the data in the RDBS, thus addressing several open problems known from the literature. Experiments show that the RCADG accelerates retrieval by up to two or even three orders of magnitude compared to both native and relational approaches.

References

  1. P. Bohannon, J. Freire, P. Roy, and J. Siméon. From XML Schema to Relations: a Cost-based Approach to XML Storage. In Proc.,18th ICDE Conf., pages 64--75, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. J.-M. Bremer and M. Gertz. An Efficient XML Node Identification and Indexing Scheme. Technical Report CSE-2003-04, CS Dept., Univ. of Calif. at Davis, 2003.]]Google ScholarGoogle Scholar
  3. Y. Chen, S. Davidson, C. Hara, and Y. Zheng. RRXS: Redundancy Reducing XML Storage in Relations. In Proc. 29th VLDB Conf., 2003.]]Google ScholarGoogle Scholar
  4. Y. Chen, S. B. Davidson, and Y. Zheng. BLAS : An Efficient XPath Processing System. In Proc. 23rd SIGMOD Conf., pages 47--58, 2004.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Dig. Bibliogr. & Library Project. verb+dblp.uni-trier.de+.]]Google ScholarGoogle Scholar
  6. A. Deutsch, M. Fernández, and D. Suciu. Storing Semistructured Data with STORED. In Proc. 18th SIGMOD Conf., pages 431--442, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. D. Florescu and D. Kossmann. Storing and Querying XML Data using an RDMBS. IEEE Data Engineering Bulletin, 22(3):27--34, 1999.]]Google ScholarGoogle Scholar
  8. R. Goldman and J. Widom. DataGuides: Enabling Query Formulation and Optimization in Semi-structured Databases. In Proc. 23rd VLDB Conf., pages 436--445, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. G. Gottlob, C. Koch, and R. Pichler. Efficient Algorithms for Processing XPath Queries. In Proc. 28th VLDB Conf., pages 95--106, 2002.]]Google ScholarGoogle ScholarCross RefCross Ref
  10. G. Gottlob, C. Koch, and K. U. Schulz. Conjunctive Queries over Trees. In Proc. 23rd ACM Symposium on Principles of Database Systems, pages 547--556, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. T. Grust. Accelerating XPath location steps. In Proc. 21st SIGMOD Conf., pages 109--120, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. T. Grust, S. Sakr, and J. Teubner. XQuery on SQL Hosts. In Proc. 30th VLDB Conf., 2004.]]Google ScholarGoogle ScholarCross RefCross Ref
  13. P. J. Harding, Q. Li, and B. Moon. XISS/R: XML Indexing and Storage System Using RDBMS. In Proc. 29th VLDB Conf., 2003. Demo.]]Google ScholarGoogle Scholar
  14. IMDb. The Internet Movie Database. verb+www.imdb.org+.]]Google ScholarGoogle Scholar
  15. Initiative for the Evaluation of XML Retrieval 2004.]]Google ScholarGoogle Scholar
  16. H. Jiang, H. Lu, W. Wang, and B. C. Ooi. XR-Tree: Indexing XML Data for Efficient Structural Join. In Proc. 29th ICDE Conf., pages 253--263, 2003.]]Google ScholarGoogle Scholar
  17. H. Jiang, H. Lu, W. Wang, and J. X. Yu. Path Materialization Revisited: An Efficient Storage Model for XML Data. In Australasian Database Conf., 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. H. Kaplan, T. Milo, and R. Shabo. A Comparison of Labeling Schemes for Ancestor Queries. In Proc. 13th Symp. on Discrete Algorithms, pages 271--281, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. R. Krishnamurthy, R. Kaushik, and J. F. Naughton. XML-to-SQL Query Translation Literature: The State Art and Open Problems. In Proc. 1st Int. XML Database Symposium (XSym), pages 1--18, 2003.]]Google ScholarGoogle ScholarCross RefCross Ref
  20. Q. Li and B. Moon. Indexing and Querying XML Data for Regular Path Expressions. In Proc. 27th VLDB Conf., pages 361--370, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. D. Barbosa et al. ToX -- the Toronto XML Engine. In Workshop Inf. Integr. on the Web, pages 66--73, 2001.]]Google ScholarGoogle Scholar
  22. B. Cooper et al. A Fast Index for Semistructured Data. In Proc. 27th VLDB Conf., pages 341--350, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. R. Goldman et al. From Semistructured Data to XML: Migrating the Lore Data Model and Query Language. In Proc. 2nd Int. Workshop on the Web and Databases, pages 25--30, 1999.]]Google ScholarGoogle Scholar
  24. H. Meuss, K. U. Schulz, F. Weigel, S. Leonardi, and F. Bry. Visual Exploration and Retrieval of XML Document Collections with the Generic System $X^2$. Journal of Digital Libraries, 5(1):1--70, 2005.]]Google ScholarGoogle Scholar
  25. T. Milo and D. Suciu. Index Struct. for Path Expressions. In Proc. 7th ICDT Conf., pages 277--295, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. P. O'Neil, E. O'Neil, and S. Pal et al. ORDPATHs: Insert-Friendly XML Node Labels. In Proc. 23rd SIGMOD Conf., pages 903--908, 2004.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. A. Schmidt, M. Kersten, M. Windhouwer, and F. Waas. Efficient Relational Storage and Retrieval of XML Documents. Proc. 3th Int. Workshop on the Web and Databases, pages 47--52, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. J. Shanmugasundaram, K. Tufte, C. Zhang, G. He, D. J. DeWitt, and J. F. Naughton. Relational Databases for Querying XML Documents: Limitations and Opportunities. In Proc. 25th VLDB Conf., 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. I. Tatarinov, S. Viglas, K. S. Beyer, and J. S. sundaram et al. Storing and Querying Ordered XML Using a Relational Database System. In Proc. 21st SIGMOD Conf., pages 204--215, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. F. Weigel, H. Meuss, F. Bry, and K. U. Schulz. Content-Aware DataGuides: Interleaving IR and DB Indexing Techniques for Efficient Retrieval of Textual XML Data. In Proc. 26th Europ. Conf. on Information Retrieval, pages 378--393, 2004.]]Google ScholarGoogle ScholarCross RefCross Ref
  31. F. Weigel, K. U. Schulz, and H. Meuss. Exploiting Native XML Indexing Techniques for XML Retrieval in RDBSs. Technical report, 2005. +www.cis.uni-++muenchen.de/ weigel/Literatur/weigel05rcadgtech.pdf+.]]Google ScholarGoogle Scholar
  32. F. Weigel, K. U. Schulz, and H. Meuss. The BIRD Numbering Scheme for XML and Tree Databases -- Deciding and Reconstructing Tree Relations using Efficient Arithmetic Operations. In Proc. 3rd Int. XML Database Symposium (XSym), 2005.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. XMark. XML Benchmark. www.xml-benchmark.org.]]Google ScholarGoogle Scholar
  34. M. Yoshikawa and T. Amagasa et al. XRel: A Path-Based Approach to Storage and Retrieval of XML Documents Using Relational Databases. Transactions on Internet Technology, 1(1):110--141, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. C. Zhang, J. F. Naughton, and D. J. DeWitt et al. On Supporting Containment Queries in RDBMS. In Proc. 20th SIGMOD Conf., pages 425--436, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Exploiting native XML indexing techniques for XML retrieval in relational database systems

                            Recommendations

                            Comments

                            Login options

                            Check if you have access through your login credentials or your institution to get full access on this article.

                            Sign in

                            PDF Format

                            View or Download as a PDF file.

                            PDF

                            eReader

                            View online with eReader.

                            eReader