skip to main content
research-article

Boomerang: resourceful lenses for string data

Published:07 January 2008Publication History
Skip Abstract Section

Abstract

A lens is a bidirectional program. When read from left toright, it denotes an ordinary function that maps inputs to outputs. When read from right to left, it denotes an ''update translator'' that takes an input together with an updated output and produces a new input that reflects the update. Many variants of this idea have been explored in the literature, but none deal fully with ordered data. If, for example, an update changes the order of a list in theoutput, the items in the output list and the chunks of the input that generated them can be misaligned, leading to lost or corrupted data.

We attack this problem in the context of bidirectional transformations over strings, the primordial ordered data type. We first propose a collection of bidirectional string lens combinators, based on familiar operations on regular transducers (union, concatenation, Kleene-star) and with a type system based on regular expressions. We then design anew semantic space of dictionary lenses, enriching the lenses of Foster et al. (2007) with support for two additional combinators for marking ''reorderable chunks'' andtheir keys. To demonstrate the effectiveness of these primitives, we describe the design and implementation of Boomerang, a full-blown bidirectional programming language with dictionary lenses at its core. We have used Boomerang to build transformers for complex real-world data format sincluding the SwissProt genomic database.

We formalize the essential property of resourcefulness-the correct use of keys to associate chunks in the input and output-by defining a refined semantic space of quasi-oblivious lenses. Several previously studied properties of lenses turn out to have compact characterizations in this space.

References

  1. Artem Alimarine, Sjaak Smetsers, Arjen van Weelden, Marko van Eekelen, and Rinus Plasmeijer. There and back again: Arrows for invertible programming. In ACM SIGPLAN Workshop on Haskell, pages 86--97, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. François Bancilhon and Nicolas Spyratos. Update semantics of relational views. ACM Transactions on Database Systems, 6(4):557--575, December 1981. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Nick Benton. Embedded interpreters. Journal of Functional Programming, 15(4):503--542, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Jean Berstel, Dominique Perrin, and Christophe Reutenauer. Codes and Automata. 2005. Manuscript available from http://www-igm.univ-mlv.fr/~berstel/LivreCodes/.Google ScholarGoogle Scholar
  5. Aaron Bohannon, Jeffrey A. Vaughan, and Benjamin C. Pierce. Relational lenses: A language for updateable views. In Principles of Database Systems (PODS), 2006. Extended version available as University of Pennsylvania technical report MS-CIS-05-27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Aaron Bohannon, J. Nathan Foster, Benjamin C. Pierce, Alexandre Pilkiewicz, and Alan Schmitt. Boomerang: Resourceful lenses for string data. Technical Report MS-CIS-07-15, Dept. of CIS University of Pennsylvania, November 2007.Google ScholarGoogle Scholar
  7. Claus Brabrand, Anders Møller, and Michael I. Schwartzbach. Dual syntax for XML languages. Information Systems, 2007. To appear. Extended abstract in Database Programming Languages (DBPL) 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Janusz A. Brzozowski. Derivatives of regular expressions. Journal of the ACM, 11(4):481--494, 1964. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Peter Buneman, Sanjeev Khanna, and Wang Chiew Tan. Why and where: A characterization of data provenance. In International Conference on Database Theory (ICDT), London, UK, volume 1973 of Lecture Notes in Computer Science, pages 316--330. Springer, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Yingwei Cui and Jennifer Widom. Lineage tracing for general data warehouse transformations. VLDB Journal, 12 (1):41--58, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Robert Ennals and David Gay. Multi-language synchronization. In European Symposium on Programming (ESOP), Braga, Portugal, volume 4421 of Lecture Notes in Computer Science, pages 475--489. Springer-Verlag, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Kathleen Fisher and Robert Gruber. PADS: a domain-specific language for processing ad hoc data. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Chicago, IL, pages 295--304, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. J. Nathan Foster, Michael B. Greenwald, Christian Kirkegaard, Benjamin C. Pierce, and Alan Schmitt. Exploiting schemas in data synchronization. Journal of Computer and System Sciences, 73(4):669--689, June 2007a. Extended abstract in Database Programming Languages (DBPL) 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Nathan Foster, Michael B. Greenwald, Jonathan T. Moore, Benjamin C. Pierce, and Alan Schmitt. Combinators for bi-directional tree transformations: A linguistic approach to the view update problem. ACM Transactions on Programming Languages and Systems, 29(3):17, May 2007b. Extended abstract in Principles of Programming Languages (POPL), 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Zhenjiang Hu, Shin-Cheng Mu, and Masato Takeichi. A programmable editor for developing structured documents based on bi-directional transformations. In Partial Evaluation and Program Manipulation (PEPM), pages 178--189, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Shinya Kawanaka and Haruo Hosoya. bixid: a bidirectional transformation language for XML. In ACM SIGPLAN International Conference on Functional Programming (ICFP), Portland, Oregon, pages 201--214, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Andrew J. Kennedy. Functional pearl: Pickler combinators. Journal of Functional Programming, 14(6):727--739, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Lambert Meertens. Designing constraint maintainers for user interaction, 1998. Manuscript.Google ScholarGoogle Scholar
  19. Shin-Cheng Mu, Zhenjiang Hu, and Masato Takeichi. An algebraic approach to bi-directional updating. In ASIAN Symposium on Programming Languages and Systems (APLAS), pages 2--20, November 2004.Google ScholarGoogle ScholarCross RefCross Ref
  20. Benjamin C. Pierce et al. Harmony: A synchronization framework for heterogeneous tree-structured data, 2006. http://www.seas.upenn.edu/~harmony/.Google ScholarGoogle Scholar
  21. M. O. Rabin and D. Scott. Finite automata and their decision problems. IBM Journal of Research and Development, 3(2):114--125, 1959.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Norman Ramsey. Embedding an interpreted language using higher-order functions and types. In ACM SIGPLAN Workshop on Interpreters, Virtual Machines and Emulators (IVME), San Diego, CA, pages 6--14, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Emmanuel Roche and Yves Schabes, editors. Finite-State Language Processing. MIT Press, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Perdita Stevens. Bidirectional model transformations in QVT: Semantic issues and open questions. In International Conference on Model Driven Engineering Languages and Systems (MoDELS), Nashville, TN, volume 4735 of Lecture Notes in Computer Science, pages 1--15. Springer-Verlag, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Naoshi Tabuchi, Eijiro Sumii, and Akinori Yonezawa. Regular expression types for strings in a text processing language. In Workshop on Types in Programming (TIP), Dagstuhl, Germany, volume 75 of Electronic Notes in Theoretical Computer Science, pages 95--113, 2002.Google ScholarGoogle Scholar

Index Terms

  1. Boomerang: resourceful lenses for string data

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM SIGPLAN Notices
      ACM SIGPLAN Notices  Volume 43, Issue 1
      POPL '08
      January 2008
      420 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/1328897
      Issue’s Table of Contents
      • cover image ACM Conferences
        POPL '08: Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
        January 2008
        448 pages
        ISBN:9781595936899
        DOI:10.1145/1328438

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 7 January 2008

      Check for updates

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader