| XML screamer: an integrated approach to high performance XML parsing, validation and deserialization |
| Source
|
International World Wide Web Conference
Proceedings of the 15th international conference on World Wide Web
Edinburgh, Scotland
Pages: 93-102
Year of Publication: 2006
ISBN: 1-59593-323-9
|
|
Authors
Margaret G. Kostoulas | IBM Corporation, Cambridge, MA
Morris Matsa | IBM Corporation, Cambridge, MA
Noah Mendelsohn | IBM Corporation, Cambridge, MA
Eric Perkins | IBM Corporation, Cambridge, MA
Abraham Heifets | IBM Corporation, Cambridge, MA
Martha Mercaldi | University of Washington
ABSTRACT
This paper describes an experimental system in which customized high performance XML parsers are prepared using parser generation and compilation techniques. Parsing is integrated with Schema-based validation and deserialization, and the resulting validating processors are shown to be as fast as, or in many cases significantly faster than, traditional nonvalidating parsers. High performance is achieved by integration across layers of software that are traditionally separate, by avoiding unnecessary data copying and transformation, and by careful attention to detail in the generated code. The effect of API design on XML performance is also briefly discussed.
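To make the idea of integrating parsing, validation and deserialization concrete, the following is a minimal hand-written sketch and not the code the system in the paper generates. It assumes a hypothetical schema in which a `point` element contains exactly one integer `x` followed by one integer `y`. The names `Point`, `ValidationError` and `parse_point` are invented for illustration, and the sketch ignores attributes, namespaces, comments, entity references and integer range limits. The scanner walks the input once, checks element order as it goes, and converts digits to a number directly from the buffer instead of first building a generic event stream or tree.

```python
from dataclasses import dataclass
from typing import Tuple

WHITESPACE = " \t\r\n"


class ValidationError(ValueError):
    """Raised when the input does not match the assumed schema."""


@dataclass
class Point:
    """Deserialized form of the hypothetical <point> element."""
    x: int
    y: int


def _skip_ws(buf: str, pos: int) -> int:
    """Advance the cursor past XML whitespace."""
    while pos < len(buf) and buf[pos] in WHITESPACE:
        pos += 1
    return pos


def _expect(buf: str, pos: int, literal: str) -> int:
    """Require `literal` (after optional whitespace); return the new cursor."""
    pos = _skip_ws(buf, pos)
    if not buf.startswith(literal, pos):
        raise ValidationError(f"expected {literal!r} at offset {pos}")
    return pos + len(literal)


def _read_int(buf: str, pos: int) -> Tuple[int, int]:
    """Validate and convert an integer in the same scan, with no substring copy."""
    pos = _skip_ws(buf, pos)
    sign = 1
    if pos < len(buf) and buf[pos] in "+-":
        sign = -1 if buf[pos] == "-" else 1
        pos += 1
    start = pos
    value = 0
    while pos < len(buf) and "0" <= buf[pos] <= "9":
        value = value * 10 + (ord(buf[pos]) - ord("0"))
        pos += 1
    if pos == start:
        raise ValidationError(f"expected a digit at offset {pos}")
    return sign * value, pos


def parse_point(buf: str) -> Point:
    """Parse, validate and deserialize a <point> document in a single pass."""
    pos = _expect(buf, 0, "<point>")
    pos = _expect(buf, pos, "<x>")
    x, pos = _read_int(buf, pos)
    pos = _expect(buf, pos, "</x>")
    pos = _expect(buf, pos, "<y>")
    y, pos = _read_int(buf, pos)
    pos = _expect(buf, pos, "</y>")
    pos = _expect(buf, pos, "</point>")
    if _skip_ws(buf, pos) != len(buf):
        raise ValidationError(f"unexpected content at offset {pos}")
    return Point(x, y)
```

Under these assumptions, `parse_point("<point><x>3</x><y>-14</y></point>")` returns `Point(x=3, y=-14)`, while a document with `y` before `x` raises `ValidationError`. Because the expected element sequence is hard-coded, validity checking costs little beyond the string comparisons a nonvalidating parser would already perform on tag names.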
CITED BY 6
Morris Matsa, Eric Perkins, Abraham Heifets, Margaret Gaitatzes Kostoulas, Daniel Silva, Noah Mendelsohn, Michelle Leger, A high-performance interpretive approach to schema-directed parsing, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
Jaakko Kangasharju, Sasu Tarkoma, Benefits of alternate XML serialization formats in scientific computing, Proceedings of the 2007 workshop on Service-oriented computing performance: aspects, issues, and approaches, p.23-30, June 25-25, 2007, Monterey, California, USA
INDEX TERMS
Primary Classification:
D. Software
D.3 PROGRAMMING LANGUAGES
D.3.4 Processors
Subjects: Code generation
Additional Classification:
D. Software
D.2 SOFTWARE ENGINEERING
D.2.8 Metrics
Subjects: Performance measures
D.3 PROGRAMMING LANGUAGES
D.3.4 Processors
Subjects: Compilers; Parsing; Optimization; Retargetable compilers
I. Computing Methodologies
I.7 DOCUMENT AND TEXT PROCESSING
I.7.2 Document Preparation
General Terms: Experimentation, Languages, Performance, Standardization
Keywords: JAX-RPC, SAX, XML, XML schema, parsing, performance, schema compilation, validation