ABSTRACT
We study the inference of Document Type Definitions (DTDs) for views of XML data, using an abstraction that focuses on document content structure. The views are defined by a query language that produces a list of documents selected from one or more input sources. The selection conditions involve vertical and horizontal navigation, and thus explicitly query the order present in the input documents. We point out several strong limitations in the descriptive ability of current DTDs and the need to extend them with (i) a subtyping mechanism and (ii) a specification mechanism more powerful than regular languages, such as context-free languages. With these extensions, we show that one can always infer tight DTDs that precisely characterize a selection view on sources satisfying given DTDs. We also show important special cases where one can infer a tight DTD without requiring extension (ii). Finally, we consider related problems such as verifying conformance of a view definition with a predefined DTD. Extensions to more powerful views that construct complex documents are also briefly discussed.