ACM Home Page
Please provide us with feedback. Feedback
Engineering succinct DOM
Full text PdfPdf (694 KB)
Source ACM International Conference Proceeding Series; Vol. 261 archive
Proceedings of the 11th international conference on Extending database technology: Advances in database technology table of contents
Nantes, France
SESSION: Research sessions: XML table of contents
Pages 49-60  
Year of Publication: 2008
ISBN:978-1-59593-926-5
Authors
O'Neil Delpratt  University of Leicester, Leicester, UK
Rajeev Raman  University of Leicester, Leicester, UK
Naila Rahman  University of Leicester, Leicester, UK
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 65,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1353343.1353354
What is a DOI?

ABSTRACT

We describe the engineering of Succinct DOM (SDOM), a DOM implementation, written in C++, which is suitable for in-memory representation of large static XML documents. SDOM avoids the use of pointers, and is based upon succinct data structures, which use an information-theoretically minimum amount of space to represent an object.

SDOM gives a space-efficient in-memory representation, with stable and predictable memory usage. The space used by SDOM is an order of magnitude less than that used by a standard C++ DOM representation such as Xerces, but SDOM is extremely fast: navigation is in some cases faster than for a pointer-based representation such as Xerces (even for moderate-sized documents which can comfortably be loaded into main memory by Xerces).

A variant, SDOM-CT, applies bzip-based compression to textual and attribute data, and its space usage is comparable with "queryable" XML compressors. Some of these compressors support navigation and/or querying (e.g. subpath queries) of the compressed file. SDOM-CT does not support querying directly, but remains extremely fast: it is several orders of magnitude faster for navigation than queryable XML compressors that support navigation (and only a few times slower than say Xerces).


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
BZip2. http://www.bzip.org.
 
4
Busatto, G., Lohrey, M., and Maneth, S. 2005. Efficient memory representation of XML documents. In Proc. Database Programming Languages, 10th International Symposium, DBPL 2005. Springer LNCS 3774, pp. 199--216. doi:10.1007/11601524_13
 
5
Cheng, J., Ng, W., 2004. XQzip: Querying compressed XML using structural indexing. In Proc. 9th International Conference on Extending Database Technology, EDBT '04. Springer LNCS 2992, pp. 219--236. doi:10.1007/b95855.
 
6
 
7
Delpratt, D., Rahman, N., and Raman, R. 2007. Compressed prefix sums. In Proc. Theory and Practice of Computer Science, SOFSEM '07. Springer LNCS 4362, pp. 235--247. doi:10.1007/978-3-540-69507-3_19.
 
8
Delpratt, D., Rahman, N., and Raman, R. 2006. Engineering the LOUDS succinct tree representation. In Proc.5th Workshop on Experimental Algorithms, WEA '06. Springer LNCS 4007, pp. 134--145. doi:10.1007/11764298_12.
 
9
10
 
11
 
12
Jacobson, G. 1989. Space-efficient static trees and graphs. In Proc. 30th Symp. Foundations of Computer Science, FOCS '89, pp. 549--554. doi:10.1109/SFCS.1989.63533
 
13
Kim, D. K., Na, J. C., Kim, J. E., and Park, K. 2005. Efficient implementation of rank and select functions for succinct representation. In Proc. 4th Workshop on Experimental Algorithms, WEA 2005. Springer LNCS 3503, pp. 315--327. doi:10.1007/11427186_28.
14
15
 
16
 
17
 
18
 
19
Raman, R. and Rao, S. S. 2003. Succinct dynamic dictionaries and trees. In Proc. ICALP 2003, Springer LNCS 2719, pp. 357--368.
 
20
Saxon. http://saxon.source.forge.net/
 
21
 
22
Xerces C++ Parser. http://xerces.apache.org/xerces-c/
 
23
Xerces Java 2 Parser. http://xerces.apache.org/xerces2-j/
 
24
UW XML Repository. http://www.cs.washingon.edu/research/xmldatasets/www/repository.html
 
25
VOTable Documentation. http://www.us-vo.org/VOTable/
 
26
W3C DOM API documentation. 2004. http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/
 
27
W3C DOM Traversal, 2000. http://www.w3.org/TR/DOM-Level-2-Traversal-Range/traversal.html
 
28
Collaborative Colleagues:
O'Neil Delpratt: colleagues
Rajeev Raman: colleagues
Naila Rahman: colleagues