Abstract
A lens is a bidirectional program. When read from left toright, it denotes an ordinary function that maps inputs to outputs. When read from right to left, it denotes an ''update translator'' that takes an input together with an updated output and produces a new input that reflects the update. Many variants of this idea have been explored in the literature, but none deal fully with ordered data. If, for example, an update changes the order of a list in theoutput, the items in the output list and the chunks of the input that generated them can be misaligned, leading to lost or corrupted data.
We attack this problem in the context of bidirectional transformations over strings, the primordial ordered data type. We first propose a collection of bidirectional string lens combinators, based on familiar operations on regular transducers (union, concatenation, Kleene-star) and with a type system based on regular expressions. We then design anew semantic space of dictionary lenses, enriching the lenses of Foster et al. (2007) with support for two additional combinators for marking ''reorderable chunks'' andtheir keys. To demonstrate the effectiveness of these primitives, we describe the design and implementation of Boomerang, a full-blown bidirectional programming language with dictionary lenses at its core. We have used Boomerang to build transformers for complex real-world data format sincluding the SwissProt genomic database.
We formalize the essential property of resourcefulness-the correct use of keys to associate chunks in the input and output-by defining a refined semantic space of quasi-oblivious lenses. Several previously studied properties of lenses turn out to have compact characterizations in this space.
- Artem Alimarine, Sjaak Smetsers, Arjen van Weelden, Marko van Eekelen, and Rinus Plasmeijer. There and back again: Arrows for invertible programming. In ACM SIGPLAN Workshop on Haskell, pages 86--97, 2005. Google ScholarDigital Library
- François Bancilhon and Nicolas Spyratos. Update semantics of relational views. ACM Transactions on Database Systems, 6(4):557--575, December 1981. Google ScholarDigital Library
- Nick Benton. Embedded interpreters. Journal of Functional Programming, 15(4):503--542, 2005. Google ScholarDigital Library
- Jean Berstel, Dominique Perrin, and Christophe Reutenauer. Codes and Automata. 2005. Manuscript available from http://www-igm.univ-mlv.fr/~berstel/LivreCodes/.Google Scholar
- Aaron Bohannon, Jeffrey A. Vaughan, and Benjamin C. Pierce. Relational lenses: A language for updateable views. In Principles of Database Systems (PODS), 2006. Extended version available as University of Pennsylvania technical report MS-CIS-05-27. Google ScholarDigital Library
- Aaron Bohannon, J. Nathan Foster, Benjamin C. Pierce, Alexandre Pilkiewicz, and Alan Schmitt. Boomerang: Resourceful lenses for string data. Technical Report MS-CIS-07-15, Dept. of CIS University of Pennsylvania, November 2007.Google Scholar
- Claus Brabrand, Anders Møller, and Michael I. Schwartzbach. Dual syntax for XML languages. Information Systems, 2007. To appear. Extended abstract in Database Programming Languages (DBPL) 2005. Google ScholarDigital Library
- Janusz A. Brzozowski. Derivatives of regular expressions. Journal of the ACM, 11(4):481--494, 1964. Google ScholarDigital Library
- Peter Buneman, Sanjeev Khanna, and Wang Chiew Tan. Why and where: A characterization of data provenance. In International Conference on Database Theory (ICDT), London, UK, volume 1973 of Lecture Notes in Computer Science, pages 316--330. Springer, 2001. Google ScholarDigital Library
- Yingwei Cui and Jennifer Widom. Lineage tracing for general data warehouse transformations. VLDB Journal, 12 (1):41--58, 2003. Google ScholarDigital Library
- Robert Ennals and David Gay. Multi-language synchronization. In European Symposium on Programming (ESOP), Braga, Portugal, volume 4421 of Lecture Notes in Computer Science, pages 475--489. Springer-Verlag, 2007. Google ScholarDigital Library
- Kathleen Fisher and Robert Gruber. PADS: a domain-specific language for processing ad hoc data. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Chicago, IL, pages 295--304, 2005. Google ScholarDigital Library
- J. Nathan Foster, Michael B. Greenwald, Christian Kirkegaard, Benjamin C. Pierce, and Alan Schmitt. Exploiting schemas in data synchronization. Journal of Computer and System Sciences, 73(4):669--689, June 2007a. Extended abstract in Database Programming Languages (DBPL) 2005. Google ScholarDigital Library
- J. Nathan Foster, Michael B. Greenwald, Jonathan T. Moore, Benjamin C. Pierce, and Alan Schmitt. Combinators for bi-directional tree transformations: A linguistic approach to the view update problem. ACM Transactions on Programming Languages and Systems, 29(3):17, May 2007b. Extended abstract in Principles of Programming Languages (POPL), 2005. Google ScholarDigital Library
- Zhenjiang Hu, Shin-Cheng Mu, and Masato Takeichi. A programmable editor for developing structured documents based on bi-directional transformations. In Partial Evaluation and Program Manipulation (PEPM), pages 178--189, 2004. Google ScholarDigital Library
- Shinya Kawanaka and Haruo Hosoya. bixid: a bidirectional transformation language for XML. In ACM SIGPLAN International Conference on Functional Programming (ICFP), Portland, Oregon, pages 201--214, 2006. Google ScholarDigital Library
- Andrew J. Kennedy. Functional pearl: Pickler combinators. Journal of Functional Programming, 14(6):727--739, 2004. Google ScholarDigital Library
- Lambert Meertens. Designing constraint maintainers for user interaction, 1998. Manuscript.Google Scholar
- Shin-Cheng Mu, Zhenjiang Hu, and Masato Takeichi. An algebraic approach to bi-directional updating. In ASIAN Symposium on Programming Languages and Systems (APLAS), pages 2--20, November 2004.Google ScholarCross Ref
- Benjamin C. Pierce et al. Harmony: A synchronization framework for heterogeneous tree-structured data, 2006. http://www.seas.upenn.edu/~harmony/.Google Scholar
- M. O. Rabin and D. Scott. Finite automata and their decision problems. IBM Journal of Research and Development, 3(2):114--125, 1959.Google ScholarDigital Library
- Norman Ramsey. Embedding an interpreted language using higher-order functions and types. In ACM SIGPLAN Workshop on Interpreters, Virtual Machines and Emulators (IVME), San Diego, CA, pages 6--14, 2003. Google ScholarDigital Library
- Emmanuel Roche and Yves Schabes, editors. Finite-State Language Processing. MIT Press, 1996. Google ScholarDigital Library
- Perdita Stevens. Bidirectional model transformations in QVT: Semantic issues and open questions. In International Conference on Model Driven Engineering Languages and Systems (MoDELS), Nashville, TN, volume 4735 of Lecture Notes in Computer Science, pages 1--15. Springer-Verlag, 2007. Google ScholarDigital Library
- Naoshi Tabuchi, Eijiro Sumii, and Akinori Yonezawa. Regular expression types for strings in a text processing language. In Workshop on Types in Programming (TIP), Dagstuhl, Germany, volume 75 of Electronic Notes in Theoretical Computer Science, pages 95--113, 2002.Google Scholar
Index Terms
- Boomerang: resourceful lenses for string data
Recommendations
Boomerang: resourceful lenses for string data
POPL '08: Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languagesA lens is a bidirectional program. When read from left toright, it denotes an ordinary function that maps inputs to outputs. When read from right to left, it denotes an ''update translator'' that takes an input together with an updated output and ...
Combinators for bidirectional tree transformations: A linguistic approach to the view-update problem
Special issue on POPL 2005We propose a novel approach to the view-update problem for tree-structured data: a domain-specific programming language in which all expressions denote bidirectional transformations on trees. In one direction, these transformations---dubbed lenses---map ...
Quotient lenses
ICFP '08There are now a number of BIDIRECTIONAL PROGRAMMING LANGUAGES, where every program can be read both as a forward transformation mapping one data structure to another and as a reverse transformation mapping an edited output back to a correspondingly ...
Comments