ACM Home Page
Please provide us with feedback. Feedback
Query efficiency in probabilistic XML models
Full text PdfPdf (1.04 MB)
Source
International Conference on Management of Data archive
Proceedings of the 2008 ACM SIGMOD international conference on Management of data table of contents
Vancouver, Canada
SESSION: Research Session 15: Probabilistic I table of contents
Pages 701-714  
Year of Publication: 2008
ISBN:978-1-60558-102-6
Authors
Benny Kimelfeld  The Hebrew University of Jerusalem, Jerusalem, Israel
Yuri Kosharovsky  The Hebrew University of Jerusalem, Jerusalem, Israel
Yehoshua Sagiv  The Hebrew University of Jerusalem, Jerusalem, Israel
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 39,   Downloads (12 Months): 114,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1376616.1376687
What is a DOI?

ABSTRACT

Various known models of probabilistic XML can be represented as instantiations of abstract p-documents. Such documents have, in addition to ordinary nodes, distributional nodes that specify the probabilistic process of generating a random document. Within this abstraction, families of pdocuments, which are natural extensions and combinations of previous models, are considered. The focus is on efficiency of applying twig queries (with projection) to p-documents. A closely related issue is the ability to (efficiently) translate a given document of one family into another family. Furthermore, both of these tasks have two variants that correspond to the value-based and object-based semantics.

The translation relationships among different families of p-documents are studied. An efficient algorithm for evaluating twig queries over one specific family is given. This algorithm generalizes a known algorithm and significantly improves its running time, both analytically and experimentally. It is shown that this family is the maximal, among the ones considered, for which query evaluation is tractable. For the rest, efficient approximate algorithms for query evaluation are presented.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
S. Abiteboul and P. Senellart. Querying and updating probabilistic information in XML. In EDBT, 2006.
 
2
N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: optimal XML pattern matching. In SIGMOD, 2002.
 
3
S. Cohen, B. Kimelfeld, and Y. Sagiv. Incorporating constraints in probabilistic XML. In PODS, 2008.
 
4
N. N. Dalvi and D. Suciu. Efficient query evaluation on probabilistic databases. In VLDB, 2004.
 
5
N. N. Dalvi and D. Suciu. The dichotomy of conjunctive queries on probabilistic structures. In PODS, 2007.
 
6
S. Flesca, F. Furfaro, and E. Masciari. On the minimization of Xpath queries. In VLDB, 2003.
 
7
E. Grädel, Y. Gurevich, and C. Hirsch. The complexity of query reliability. In PODS, 1998.
 
8
E. Hung, L. Getoor, and V. S. Subrahmanian. Probabilistic interval XML. In ICDT, 2003.
 
9
E. Hung, L. Getoor, and V. S. Subrahmanian. PXML: A probabilistic semistructured data model and algebra. In ICDE, 2003.
 
10
D. S. Johnson, M. Yannakakis, and C. H. Papadimitriou. On generating all maximal independent sets. Information Processing Letters, 27, March 1988.
 
11
R. M. Karp, M. Luby, and N. Madras. Monte-carlo approximation algorithms for enumeration problems. J. Algorithms, 10(3), 1989.
 
12
B. Kimelfeld and Y. Sagiv. Matching twigs in probabilistic XML. In VLDB, 2007.
 
13
B. Kimelfeld and Y. Sagiv. Maximally joining probabilistic data. In PODS, 2007.
 
14
B. Kimelfeld and Y. Sagiv. Revisiting redundancy and minimization in an XPath fragment. In EDBT, 2008.
 
15
G. Miklau and D. Suciu. Containment and equivalence for an XPath fragment. In PODS, Madison, Wisconsin, USA, 2002.
 
16
A. Nierman and H. V. Jagadish. ProTDB: Probabilistic data in XML. In VLDB, 2002.
 
17
J. S. Provan and M. O. Ball. The complexity of counting cuts and of computing the probability that a graph is connected. SIAM J. Comput., 12(4), 1983.
 
18
C. Re, N. N. Dalvi, and D. Suciu. Efficient top-k query evaluation on probabilistic data. In ICDE, 2007.
 
19
C. Re and D. Suciu. Efficient evaluation of HAVING queries on a probabilistic database. In DBPL, 2007.
 
20
C. Ré and D. Suciu. Efficient evaluation of HAVING queries on a probabilistic database. In DBPL, 2007.
 
21
P. Senellart. Comprendre le Web caché. Understanding the Hidden Web. PhD thesis, Université Paris-Sud 11, Orsay, France, Dec. 2007.
 
22
P. Senellart and S. Abiteboul. On the complexity of managing probabilistic XML data. In PODS, 2007.
 
23
S. Toda and M. Ogiwara. Counting classes are at least as hard as the polynomial-time hierarchy. SIAM J. Comput., 21(2), 1992.
 
24
L. G. Valiant. The complexity of computing the permanent. Theor. Comput. Sci., 8, 1979.
 
25
D. S. Warren. Memoing for logic programs. Commun. ACM, 35(3), 1992.
 
26
S. Zachos. Probabilistic quantifiers and games. J. Comput. Syst. Sci., 36(3), 1988.

Collaborative Colleagues:
Benny Kimelfeld: colleagues
Yuri Kosharovsky: colleagues
Yehoshua Sagiv: colleagues