ABSTRACT
Often, independent organizations define and advocate different XML formats for a similar purpose and, as a result, application programs need to mutually convert between such formats. Existng XML transformation languages, such as XSLT and XDuce, are unsatisfactory for this purpose since we would have to write, e.g., two programs for the forward and the backward transformations in case of two formats, incur high developing and maintenance costs.This paper proposes the bidirectional XML transformation language biXid, allowing us to write only one program for both directions of conversion. Our language adopts a common paradigm programming-by-relation, where a program defines a relation over documents and transforms a document to another in a way satisfying this relation. Our contributions here are specific language features for facilitating realistic conversions whose target formats are loosely in parallel but have many discrepancies in details. Concretely, we (1) adopt XDuce-style regular expression patterns for describing and analyzing XML structures, (2) fully permit ambiguity for treating formats that do not have equivalent expressivenesses, and (3) allow non-linear pattern variables for expressing non-trivial transformations that cannot be written only with linear patterns, such as conversion between unordered and ordered data.We further develop an efficient evaluation algorithm for biXid, consisting of the "parsing" phase that transforms the input document to an intermediate "parse tree" structure and the "unparsing" phase that transforms it to an output document. Both phases use a variant of finite tree automata for performing a one-pass scan on the input or the parse tree by using a standard technique that "maintains the set of all transitable states." However, the construction of the "unparsing" phase is challenging since ambiguity causes different ways of consuming the parse tree and thus results in multiple possible outputs that may have different structures.We have implemented a prototype system of biXid and confirmed that it has enough expressiveness and a linear-time performance from experiments with several realistic bidirectional transformations including one between vCard-XML and ContactXML.
- V. Benzaken, G. Castagna, and A. Frisch. CDuce: An XML-centric general-purpose language. In Proceedings of the International Conference on Functional Programming (ICFP), pages 51--63, 2003. Google ScholarDigital Library
- R. Bourret. XML data binding resources, 2001. http://www.rpbourret.com/xml/XMLDataBinding.htm.Google Scholar
- C. Brabrand, A. Møller, and M.I. Schwartzbach. Dual syntax for XML languages. In Proc. 10th International Workshop on Database Programming Languages, DBPL '05, volume 3774 of LNCS, pages 27--41. Springer-Verlag, August 2005. Google ScholarDigital Library
- J. Clark. XSL Transformations (XSLT), 1999. http://www.w3.org/TR/xslt.Google Scholar
- J. Clark and M. Murata. RELAX NG, 2001. http://www.relaxng.org.Google Scholar
- J. Coelho and M. Florido. Type-based XML processing in logic programming. In PADL, pages 273--285, 2003. Google ScholarDigital Library
- ContactXML Users Group. ContactXML, 2000. http://www.contactxml.org/.Google Scholar
- R. Elmasri and S. B. Navathe. Fundamentals of Database Systems, 2nd Edition. Benjamin/Cummings, 1994. Google ScholarDigital Library
- D. C. Fallside. XML Schema Part 0: Primer, W3C Recommendation, 2001. http://www.w3.org/TR/xmlschema-0/.Google Scholar
- J.N. Foster, M.B. Greenwald, J.T. Moore, B.C. Pierce, and A. Schmitt. Combinators for bi-directional tree transformations: A linguistic approach to the view update problem. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 233--246, 2005. Google ScholarDigital Library
- V.B. Gundersen and Z.W. Hendrikse. BibTeX as XML markup, 2005. http://bibtexml.sourceforge.net/index.html.Google Scholar
- H. Hosoya. Regular expression pattern matching - a simpler design. Technical Report 1397, RIMS, Kyoto University, 2003.Google Scholar
- H. Hosoya and M. Murata. Boolean operations and inclusion test for attribute-element constraints. In Eighth International Conference on Implementation and Application of Automata, volume 2759 of Lecture Notes in Computer Science, pages 201--212. Springer-Verlag, 2003. Google ScholarDigital Library
- H. Hosoya and B.C. Pierce. Regular expression pattern matching for XML. In The 25th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 67--80, Jan. 2001. Google ScholarDigital Library
- H. Hosoya and B.C. Pierce. XDuce: A typed XML processing language. ACM Transactions on Internet Technology, 3(2):117--148, 2003. Short version appeared in Proceedings of Third International Workshop on the Web and Databases (WebDB2000), volume 1997 of Lecture Notes in Computer Science, pp. 226--244, Springer-Verlag. Google ScholarDigital Library
- H. Hosoya, J. Vouillon, and B.C. Pierce. Regular expression types for XML. ACM Transactions on Programming Languages and Systems, 27(1):46--90, 2004. Short version appeared in Proceedings of the International Conference on Functional Programming (ICFP), pp. 11--22, 2000. Google ScholarDigital Library
- R. Iannella. Representing vCard objects in RDF/XML, 2001. http://www.w3.org/TR/vcard-rdf.Google Scholar
- S.-C. Mu, Z. Hu, and M. Takeichi. An algebraic approach to bi-directional updating. In APLAS, pages 2--20, 2004.Google ScholarCross Ref
- M. Murata, D. Lee, and M. Mani. Taxonomy of XML schema languages using formal language theory. In Extreme Markup Languages, 2001.Google Scholar
- Python XML Special Interest Group. The XML bookmark exchange language, 1998. http://pyxml.sourceforge. net/topics/xbel/.Google Scholar
- L. Sterling and E. Y. Shapiro. The Art of Prolog-Advanced Programming Techniques. MIT Press, 1986. Google ScholarDigital Library
- E. Wilde. BibTeXML (BibTeX markup language), 2004. http://dret. net/projects/bibtexml/.Google Scholar
Index Terms
- biXid: a bidirectional transformation language for XML
Recommendations
biXid: a bidirectional transformation language for XML
Proceedings of the 2006 ICFP conferenceOften, independent organizations define and advocate different XML formats for a similar purpose and, as a result, application programs need to mutually convert between such formats. Existng XML transformation languages, such as XSLT and XDuce, are ...
Mapping of bibliographical standards into XML
The most popular bibliographical standards, which prescribe the exchange of bibliographical data in machine readable form, are MARC (Machine Readable Cataloguing) and UNIMARC (Universal Machine Readable Cataloguing). This paper presents two schemas, ...
Mapping DTDs to relational schemas with semantic constraints
XML is becoming a prevalent format and standard for data exchange in many applications. With the increase of XML data, there is an urgent need to research some efficient methods to store and manage XML data. As relational databases are the primary ...
Comments