ACM Home Page
Please provide us with feedback. Feedback
Annotated XML: queries and provenance
Full text pdf formatPdf (333 KB)
Source
Symposium on Principles of Database Systems archive
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Vancouver, Canada
SESSION: XML & hierarchical data table of contents
Pages 271-280  
Year of Publication: 2008
ISBN:978-1-60558-108-8
Authors
J. Nathan Foster  University of Pennsylvania, Philadelphia, PA, USA
Todd J. Green  University of Pennsylvania, Philadelphia, PA, USA
Val Tannen  University of Pennsylvania, Philadelphia, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 45,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1376916.1376954
What is a DOI?

ABSTRACT

We present a formal framework for capturing the provenance of data appearing in XQuery views of XML. Building on previous work on relations and their (positive) query languages, we decorate unordered XML with annotations from commutative semirings and show that these annotations suffice for a large positive fragment of XQuery applied to this data. In addition to tracking provenance metadata, the framework can be used to represent and process XML with repetitions, incomplete XML, and probabilistic XML, and provides a basis for enforcing access control policies in security applications.

Each of these applications builds on our semantics for XQuery, which we present in several steps: we generalize the semantics of the Nested Relational Calculus (NRC) to handle semiring-annotated complex values, we extend it with a recursive type and structural recursion operator for trees, and we define a semantics for XQuery on annotated XML by translation into this calculus.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995.
 
2
S. Abiteboul, L. Segoufin, and V. Vianu. Representing and querying xml with incomplete information. TODS, 31(1):208--254, 2006.
 
3
S. Abiteboul and P. Senellart. Querying and updating probabilistic information in XML. In EDBT, 2006.
 
4
P. Buneman, J. Cheney, W.-C. Tan, and S. Vansummeren. Curated databases. In PODS, 2008.
 
5
P. Buneman, J. Cheney, and S. Vansummeren. On the expressiveness of implicit provenance in query and update languages. In ICDT, 2007.
 
6
P. Buneman, M. F. Fernandez, and D. Suciu. UnQL: A query language and algebra for semistructured data based on structural recursion. VLDB J., 9(1):76--110, 2000.
 
7
P. Buneman, S. Khanna, and W. C. Tan. Why and where: A characterization of data provenance. In ICDT, 2001.
 
8
P. Buneman, S. A. Naqvi, V. Tannen, and L. Wong. Principles of programming with complex objects and collection types. TCS, 149(1):3--48, 1995.
 
9
P. Buneman and V. Tannen. A structural approach to query language design. In The Functional Approach to Data Management Modeling, Analyzing, and Integrating Heterogenous Data. Springer, 2004.
 
10
J. V. den Bussche, D. V. Gucht, and S. Vansummeren. Well-definedness and semantic type-checking in the nested relational calculus and XQuery. In ICDT, 2005.
 
11
D. Draper, P. Fankhauser, M. F. Fernández, A. Malhotra, K. Rose, M. Rys, J. Siméon, and P. Wadler. XQuery 1.0 and XPath 2.0 Formal Semantics. W3C, Jan. 2007.
 
12
D. Florescu and D. Kossmann. Storing and querying XML data using an RDMBS. IEEE Data Engineering Bulletin, 22(3), 1999.
 
13
J. N. Foster, T. J. Green, and V. Tannen. Annotated XML: Queries and provenance. Technical Report TR-CIS-08-06, University of Pennsylvania, 2008.
 
14
G. Grahne, A. Thomo, and W. W. Wadge. Preferentially annotated regular path queries. In ICDT, 2007.
 
15
T. J. Green, G. Karvounarakis, Z. G. Ives, and V. Tannen. Update exchange with mappings and provenance. In VLDB, 2007.
 
16
T. J. Green, G. Karvounarakis, and V. Tannen. Provenance semirings. In PODS, 2007.
 
17
J. Hidders, N. Kwasnikowska, J. Sroka, J. Tyszkiewicz, and J. V. den Bussche. A formal model of dataflow repositories. In DILS, 2007.
 
18
E. Hung, L. Getoor, and V. S. Subrahmanian. Probabilistic interval XML. ACM TOCL, 8(4), 2007.
 
19
T. Imieliński and W. Lipski. Incomplete information in relational databases. JACM, 31(4), 1984.
 
20
Y. Kanza, W. Nutt, and Y. Sagiv. Queries with incomplete answers over semistructured data. In PODS, 1999.
 
21
S. K. Lellahi and V. Tannen. A calculus for collections and aggregates. In Category Theory and Computer Science, 1997.
 
22
L. Libkin and L. Wong. Query languages for bags and aggregate functions. JCSS, 55(2):241--272, 1997.
 
23
A. Nierman and H. V. Jagadish. ProTDB: Probabilistic data in XML. In VLDB, 2002.
 
24
D. Olteanu, H. Meuss, T. Furche, and F. Bry. XPath: Looking forward. In EDBT Workshops, 2002.
 
25
C. Ré, J. Simèon, and M. Fernández. A complete and efficient algebraic compiler for xquery. In ICDE, page 14, 2006.
 
26
E. L. Robertson, L. V. Saxton, D. V. Gucht, and S. Vansummeren. Structural recursion on ordered trees and list-based complex objects. In ICDT, 2007.
 
27
P. Senellart and S. Abiteboul. On the complexity of managing probabilistic XML data. In PODS, 2007.
 
28
J. Shanmugasundaram, K. Tufte, C. Zhang, G. He, D. J. DeWitt, and J. F. Naughton. Relational databases for querying XML documents: Limitations and opportunities. In VLDB J., 1999.
 
29
M. van Keulen, A. de Keijzer, and W. Alink. A probabilistic XML approach to data integration. In ICDE, 2005.

Collaborative Colleagues:
J. Nathan Foster: colleagues
Todd J. Green: colleagues
Val Tannen: colleagues