ACM Home Page
Please provide us with feedback. Feedback
Approximate XML query answers
Full text PdfPdf (261 KB)
Source International Conference on Management of Data archive
Proceedings of the 2004 ACM SIGMOD international conference on Management of data table of contents
Paris, France
SESSION: Research sessions: new styles of XML table of contents
Pages: 263 - 274  
Year of Publication: 2004
ISBN:1-58113-859-8
Authors
Neoklis Polyzotis  University of California Santa Cruz
Minos Garofalakis  Bell Labs, Lucent Technologies
Yannis Ioannidis  University of Athens, Hellas
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 71,   Citation Count: 15
Additional Information:

abstract   references   cited by   collaborative colleagues   peer to peer  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1007568.1007599
What is a DOI?

ABSTRACT

The rapid adoption of XML as the standard for data representation and exchange foreshadows a massive increase in the amounts of XML data collected, maintained, and queried over the Internet or in large corporate data-stores. Inevitably, this will result in the development of on-line decision support systems, where users and analysts interactively explore large XML data sets through a declarative query interface (e.g., XQuery or XSLT). Given the importance of remaining interactive, such on-line systems can employ approximate query answers as an effective mechanism for reducing response time and providing users with early feedback. This approach has been successfully used in relational systems and it becomes even more compelling in the XML world, where the evaluation of complex queries over massive tree-structured data is inherently more expensive.In this paper, we initiate a study of approximate query answering techniques for large XML databases. Our approach is based on a novel, conceptually simple, yet very effective XML-summarization mechanism: TREESKETCH synopses. We demonstrate that, unlike earlier techniques focusing solely on selectivity estimation, our TREESKETCH synopses are much more effective in capturing the complete tree structure of the underlying XML database. We propose novel construction algorithms for building TREESKETCH summaries of limited size, and describe schemes for processing general XML twig queries over a concise TREESKETCH in order to produce very fast, approximate tree-structured query answers. To quantify the quality of such approximate answers, we propose a novel, intuitive error metric that captures the quality of the approximation in terms of both the overall structure of the XML tree and the distribution of document edges. Experimental results on real-life and synthetic data sets verify the effectiveness of our TREESKETCH synopses in producing fast, accurate approximate answers and demonstrate their benefits over previously proposed techniques that focus solely on selectivity estimation. In particular, TREESKETCHes yield faster, more accurate approximate answers and selectivity estimates, and are more efficient to construct. To the best of our knowledge, ours is the first work to address the timely problem of producing fast, approximate tree-structured answers for complex XML queries.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
P. Buneman, M. Grohe, and C. Koch. "Path Queries on Compressed XML". In Proceedings of the 29th Intl. Conf. on Very Large Data Bases, 2003.
 
3
 
4
Don Chamberlin, James Clark, Daniela Florescu, Jonathan Robie, Jérôme Siméon, and Mugur Stefanescu. "XQuery 1.0: An XML Query Language". W3C Working Draft, 2001.
 
5
Zhimin Chen, H. V. Jagadish, Laks V. S. Laksmanan, and Stelios Paparizos. "From Tree Patterns to Generalized Tree Patterns: On Efficient Eavluation of XQuery". In Proceedings of the 29th Intl. Conf. on Very Large Data Bases, 2003.
 
6
 
7
James Clark. "XSL Transformations (XSLT), Version 1.0". W3C Recommendation, November 1999.
 
8
James Clark and Steve DeRose. "XML Path Language (XPath), Version 1.0". W3C Recommendation, November 1999.
9
 
10
 
11
 
12
L. Lim, M. Wang, S. Padmanabhan, J. S. Vitter, and R. Parr. XPathLearner: An On-Line Self-Tuning Markov Histogram for XML Path Selectivity Estimation. In Proceedings of the 28th Intl. Conf. on Very Large Data Bases, 2002.
 
13
 
14
15
 
16
N. Polyzotis and M. Garofalakis. "Structure and Value Synopses for XML Data Graphs". In Proceedings of the 28th Intl. Conf. on Very Large Data Bases, 2002.
 
17
Neoklis Polyzotis, Minos Garofalakis, and Yannis Ioannidis. Approximate XML Query Answers. 2004.
 
18
 
19
 
20
21
 
22
23

CITED BY  15
 
 
 
 
 
 
 
Collaborative Colleagues:
Neoklis Polyzotis: colleagues
Minos Garofalakis: colleagues
Yannis Ioannidis: colleagues

Peer to Peer - Readers of this Article have also read: