Parsing concurrent XML
Source: Workshop On Web Information And Data Management
Proceedings of the 6th annual ACM international workshop on Web information and data management
Washington DC, USA
SESSION: XML processing
Pages: 23 - 30  
Year of Publication: 2004
ISBN:1-58113-978-0
Authors
Ionut E. Iacob  University of Kentucky
Alex Dekhtyar  University of Kentucky
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1031453.1031460

ABSTRACT

Concurrent markup hierarchies appear often in document-centric XML documents, as a result of different XML elements having overlapping scopes. They require a significantly different approach to management and maintenance. Management of XML documents composed of concurrent markup has been studied mostly by the document processing community and has attracted the attention of computer scientists only recently. In this paper we discuss the architecture of an XML parser for concurrent XML. This parser uses a GODDAG data structure in place of the traditional DOM tree to store concurrent markup on top of the document content. It provides a DOM-like API, so developers of tools that work with concurrent XML documents can use it instead of parsing each individual component with a traditional DOM XML parser. The paper describes the architecture of the parser, the data structures and algorithms used, and the DOM-like API.
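
To make the idea of overlapping scopes concrete, the following is a minimal sketch of a GODDAG-like structure, assuming one set of shared text leaves with a separate element tree per hierarchy built on top of them. The names Leaf, Element, and build_example, and the sample text, are hypothetical illustrations. They are not the parser's data structures or its DOM-like API, which this abstract does not spell out.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Leaf:
    """A fragment of document text shared by every markup hierarchy."""
    text: str
    # Hypothetical bookkeeping: hierarchy name -> the element directly above this leaf.
    parents: dict = field(default_factory=dict)


@dataclass(eq=False)
class Element:
    """A markup element that belongs to exactly one hierarchy."""
    tag: str
    hierarchy: str
    children: list = field(default_factory=list)  # Elements or Leaves, in document order

    def append(self, child):
        """Attach a child; leaves also record this element as their parent."""
        self.children.append(child)
        if isinstance(child, Leaf):
            child.parents[self.hierarchy] = self
        return child

    def content(self):
        """Concatenate the text of all leaves below this element."""
        return "".join(
            c.text if isinstance(c, Leaf) else c.content() for c in self.children
        )


def build_example():
    """Two hierarchies over the same text whose elements overlap."""
    leaves = [Leaf("Sing a "), Leaf("song of six"), Leaf("pence.")]

    # Physical hierarchy: the line break falls inside the word "sixpence".
    physical = Element("page", "physical")
    line1 = physical.append(Element("line", "physical"))
    line2 = physical.append(Element("line", "physical"))
    line1.append(leaves[0])
    line1.append(leaves[1])
    line2.append(leaves[2])

    # Linguistic hierarchy: the second phrase spans the line break.
    linguistic = Element("text", "linguistic")
    phrase1 = linguistic.append(Element("phrase", "linguistic"))
    phrase2 = linguistic.append(Element("phrase", "linguistic"))
    phrase1.append(leaves[0])
    phrase2.append(leaves[1])
    phrase2.append(leaves[2])

    return leaves, physical, linguistic
```

In this sketch the middle leaf has one parent in each hierarchy (a line and a phrase), which is the overlap that a single DOM tree cannot represent. Both roots still yield the same document text.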


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Xerces2 Java Parser 2.6.2. http://xml.apache.org/xerces2-j/, 2004. Apache XML project.
 
2
Alfred. Boethius. British Library MS Cotton Otho A. vi. Manuscript, folio 36v.
 
3
T. Bray, J. Paoli, C. M. Sperberg-McQueen, and E. Maler (Eds.). Extensible Markup Language (XML) 1.0 (Second Edition). http://www.w3.org/TR/REC-xml, Oct 2000. W3C, REC-xml-20001006.
 
4
M. Champion, S. Byrne, G. Nicol, and L. Wood (Eds.). Document Object Model (DOM) Level 1 Specification. http://www.w3.org/TR/REC-DOM-Level-1/, Oct 1998. World Wide Web Consortium Recommendation, REC-DOM-Level-1-19981001.
 
5
A. Dekhtyar, I. Iacob, J. Jaromczyk, K. Kiernan, N. Moore, and D. Porter. Database support for image-based Electronic Editions. In Proc. Workshop on Multimedia Information Systems (MIS'04), pages 147--156, 2004.
 
6
A. Dekhtyar and I. E. Iacob. A Framework for Management of Concurrent XML Markup. In International Workshop on XML Schema and Data Management (XSDM'03), pages 311--322. LNCS, 2003.
 
7
 
8
P. Durusau and M. O'Donnell. Declaring Trees: The Future of the Evolution of Markup? In Proc. Conference on Extreme Markup Languages, 2002.
 
9
P. Durusau and M. B. O'Donnell. Concurrent Markup for XML Documents. In Proc. XML Europe, May 2002.
 
10
K. Hawley and K. Kiernan. An image-based electronic edition of Alfred the Great's Old English version of Boethius's Consolation of Philosophy. In Proc. Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities (ALLC/ACH), pages 91--96, 2003.
 
11
I. E. Iacob, A. Dekhtyar, and W. Zhao. XPath Extension for Querying Concurrent XML Markup. Technical Report TR 394-04, University of Kentucky, Department of Computer Science, February 2004. http://www.cs.uky.edu/~dekhtyar/publications/TR394-04.ps.
 
12
K. Kiernan, J. Jaromczyk, A. Dekhtyar, D. Porter, K. Hawley, S. Bodapati, and I. Iacob. The ARCHway project: Architecture for research in computing for humanities through research, teaching, and learning. Literary and Linguistic Computing, 2004. forthcoming.
 
13
A. Renear, E. Mylonas, and D. Durand. Refining our notion of what text really is: The problem of overlapping hierarchies. Research in Humanities Computing, 1993. N. Ide and S. Hockey, (Eds.).
 
14
C. M. Sperberg-McQueen and L. Burnard (Eds.). Guidelines for Text Encoding and Interchange (P4). http://www.tei-c.org/P4X/index.html, 2001. The TEI Consortium.
 
15
C. M. Sperberg-McQueen and C. Huitfeldt. GODDAG: A Data Structure for Overlapping Hierarchies, Sept. 2000. Early draft presented at the ACH-ALLC Conference in Charlottesville, June 1999.

