skip to main content
10.1145/1863543.1863585acmconferencesArticle/Chapter ViewAbstractPublication PagesicfpConference Proceedingsconference-collections
research-article

Total parser combinators

Published:27 September 2010Publication History

ABSTRACT

A monadic parser combinator library which guarantees termination of parsing, while still allowing many forms of left recursion, is described. The library's interface is similar to those of many other parser combinator libraries, with two important differences: one is that the interface clearly specifies which parts of the constructed parsers may be infinite, and which parts have to be finite, using dependent types and a combination of induction and coinduction; and the other is that the parser type is unusually informative.

The library comes with a formal semantics, using which it is proved that the parser combinators are as expressive as possible. The implementation is supported by a machine-checked correctness proof.

Skip Supplemental Material Section

Supplemental Material

icfp-weds-1215-danielsson.mov

mov

113.8 MB

References

  1. }}The Agda Team. The Agda Wiki. Available at http://wiki.portal.chalmers.se/agda/, 2010.Google ScholarGoogle Scholar
  2. }}Thorsten Altenkirch and Nils Anders Danielsson. Termination checking in the presence of nested inductive and coinductive types. Note supporting presentation given at the Workshop on Partiality and Recursion in Interactive Theorem Provers, Edinburgh, UK, 2010.Google ScholarGoogle Scholar
  3. }}Arthur Baars, S. Doaitse Swierstra, and Marcos Viera. Typed transformations of typed grammars: The left corner transform. In Preliminary Proceedings of the Ninth Workshop on Language Descriptions Tools and Applications, LDTA 2009, pages 18--33, 2009.Google ScholarGoogle Scholar
  4. }}Marcello Bonsangue, Jan Rutten, and Alexandra Silva. A Kleene theorem for polynomial coalgebras. In Foundations of Software Science and Computational Structures, 12th International Conference, FOSSACS 2009, volume 5504 of LNCS, pages 122--136, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. }}Kasper Brink, Stefan Holdermans, and Andres Löh. Dependently typed grammars. In Mathematics of Program Construction, Tenth International Conference, MPC 2010, volume 6120 of LNCS, pages 58--79, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. }}Janusz A. Brzozowski. Derivatives of regular expressions. Journal of the ACM, 11 (4): 481--494, 1964. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. }}William H. Burge. Recursive Programming Techniques. Addison-Wesley, 1975.Google ScholarGoogle Scholar
  8. }}Magnus Carlsson and Thomas Hallgren. Fudgets - Purely Functional Processes with applications to Graphical User Interfaces. PhD thesis, Chalmers University of Technology and Göteborg University, 1998.Google ScholarGoogle Scholar
  9. }}Koen Claessen. Embedded Languages for Describing and Verifying Hardware. PhD thesis, Chalmers University of Technology, 2001.Google ScholarGoogle Scholar
  10. }}Koen Claessen. Parallel parsing processes. Journal of Functional Programming, 14: 741--757, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. }}Koen Claessen and David Sands. Observable sharing for functional circuit description. In Advances in Computing Science - ASIAN'99, volume 1742 of LNCS, pages 62--73, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. }}Thierry Coquand. Infinite objects in type theory. In Types for Proofs and Programs, International Workshop TYPES '93, volume 806 of LNCS, pages 62--78, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. }}Nils Anders Danielsson. Beating the productivity checker using embedded languages. In Workshop on Partiality and Recursion in Interactive Theorem Provers, Edinburgh, UK, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  14. }}Nils Anders Danielsson and Thorsten Altenkirch. Subtyping, de­clar­a­tive­ly: An exercise in mixed induction and coinduction. In Mathematics of Program Construction, Tenth International Conference, MPC 2010, volume 6120 of LNCS, pages 100--118, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. }}Nils Anders Danielsson and Ulf Norell. Structurally recursive descent parsing. Unpublished note, 2008.Google ScholarGoogle Scholar
  16. }}Nils Anders Danielsson and Ulf Norell. Parsing mixfix operators. To appear in the proceedings of the 20th International Symposium on the Implementation and Application of Functional Languages (IFL 2008), 2009.Google ScholarGoogle Scholar
  17. }}Jon Fairbairn. Making form follow function: An exercise in functional programming style. Software: Practice and Experience, 17 (6): 379--386, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. }}Jeroen Fokker. Functional parsers. In Advanced Functional Programming, volume 925 of LNCS, pages 1--23, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. }}Richard A. Frost, Rahmatullah Hafiz, and Paul Callaghan. Parser combinators for ambiguous left-recursive grammars. In PADL 2008: Practical Aspects of Declarative Languages, volume 4902 of LNCS, pages 167--181, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. }}Tatsuya Hagino. A Categorical Programming Language. PhD thesis, University of Edinburgh, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. }}Peter Hancock, Dirk Pattinson, and Neil Ghani. Representations of stream processors using nested fixed points. Logical Methods in Computer Science, 5 (3:9), 2009.Google ScholarGoogle ScholarCross RefCross Ref
  22. }}R. John M. Hughes and S. Doaitse Swierstra. Polish parsers, step by step. In ICFP '03: Proceedings of the eighth ACM SIGPLAN international conference on Functional programming, pages 239--248, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. }}Graham Hutton. Higher-order functions for parsing. Journal of Functional Programming, 2: 323--343, 1992.Google ScholarGoogle ScholarCross RefCross Ref
  24. }}Mark Johnson. Memoization in top-down parsing. Computational Linguistics, 21 (3): 405--417, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. }}Oleg Kiselyov. Parsec-like parser combinator that handles left recursion? Message to the Haskell-Cafe mailing list, December 2009.Google ScholarGoogle Scholar
  26. }}Pieter Koopman and Rinus Plasmeijer. Efficient combinator parsers. In IFL'98: Implementation of Functional Languages, volume 1595 of LNCS, pages 120--136, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. }}Adam Koprowski and Henri Binsztok. TRX: A formally verified parser interpreter. In Programming Languages and Systems, 19th European Symposium on Programming, ESOP 2010, volume 6012 of LNCS, pages 345--365, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. }}Dexter Kozen. On Kleene algebras and closed semirings. In Mathematical Foundations of Computer Science 1990, volume 452 of LNCS, pages 26--47, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. }}Daan Leijen and Erik Meijer. Parsec: Direct style monadic parser combinators for the real world. Technical Report UU-CS-2001-35, Department of Information and Computing Sciences, Utrecht University, 2001.Google ScholarGoogle Scholar
  30. }}Paul Lickman. Parsing with fixed points. Master's thesis, University of Cambridge, 1995.Google ScholarGoogle Scholar
  31. }}Peter Ljunglöf. Pure functional parsing; an advanced tutorial. Licentiate thesis, Department of Computing Science, Chalmers University of Technology and Göteborg University, 2002.Google ScholarGoogle Scholar
  32. }}Antoni W. Mazurkiewicz. A note on enumerable grammars. Information and Control, 14 (6): 555--558, 1969.Google ScholarGoogle ScholarCross RefCross Ref
  33. }}Conor McBride and James McKinna. Seeing and doing. Presentation (given by McBride) at the Workshop on Termination and Type Theory, Hindås, Sweden, 2002.Google ScholarGoogle Scholar
  34. }}Conor McBride and Ross Paterson. Applicative programming with effects. Journal of Functional Programming, 18: 1--13, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. }}Erik Meijer. Calculating Compilers. PhD thesis, Nijmegen University, 1992.Google ScholarGoogle Scholar
  36. }}Paul Francis Mendler. Inductive Definition in Type Theory. PhD thesis, Cornell University, 1988.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. }}Muad‘Dib. Strongly specified parser combinators. Post to the Muad‘Dib blog, 2009.Google ScholarGoogle Scholar
  38. }}Ulf Norell. Towards a practical programming language based on dependent type theory. PhD thesis, Chalmers University of Technology and Göteborg University, 2007.Google ScholarGoogle Scholar
  39. }}David Park. On the semantics of fair parallelism. In Abstract Software Specifications, volume 86 of LNCS, pages 504--526, 1980. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. }}Frank Pfenning. Unification and anti-unification in the calculus of constructions. In Proceedings of the Sixth Annual IEEE Symposium on Logic in Computer Science, pages 74--85, 1991.Google ScholarGoogle ScholarCross RefCross Ref
  41. }}J. J. M. M. Rutten. Automata and coinduction (an exercise in coalgebra). In CONCUR'98, Concurrency Theory, 9th International Conference, volume 1466 of LNCS, pages 547--554, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. }}Niklas Röjemo. Garbage collection, and memory efficiency, in lazy functional languages. PhD thesis, Chalmers University of Technology and University of Göteborg, 1995.Google ScholarGoogle Scholar
  43. }}Marvin Solomon. Theoretical Issues in the Implementation of Programming Languages. PhD thesis, Cornell University, 1977. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. }}S. Doaitse Swierstra and Luc Duponcheel. Deterministic, error-correcting combinator parsers. In Advanced Functional Programming, volume 1129 of LNCS, pages 184--207, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. }}Wouter Swierstra. A Hoare logic for the state monad. In Theorem Proving in Higher Order Logics, 22nd International Conference, TPHOLs 2009, volume 5674 of LNCS, pages 440--451, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. }}Philip Wadler. How to replace failure by a list of successes; a method for exception handling, backtracking, and pattern matching in lazy functional languages. In Functional Programming Languages and Computer Architecture, volume 201 of LNCS, pages 113--128, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. }}Philip Wadler, Walid Taha, and David MacQueen. How to add laziness to a strict language, without even being odd. In Proceedings of the 1998 ACM SIGPLAN Workshop on ML, 1998.Google ScholarGoogle Scholar
  48. }}Malcolm Wallace. Partial parsing: Combining choice with commitment. In IFL 2007: Implementation and Application of Functional Languages, volume 5083 of LNCS, pages 93--110, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Total parser combinators

                    Recommendations

                    Reviews

                    Simon John Thompson

                    Parser combinators are one of the best-known examples of the advantages of functional programming, as they make essential use of functions as arguments and results of functions. They are also one of the longest standing examples, going back to Burge's groundbreaking 1975 work [1]. In this model, parsers are represented as functions, which partially consume input streams and return parse trees. Parser combinators directly represent grammar operations like alternation, repetition, and sequencing; complex parsers are built by applying these combinators to simpler ones, generally in a recursive definition. With recursion comes the possibility"?indeed probability"?of nontermination. Dependent types have come into functional programming from constructive mathematics, particularly from the work of Per Martin-L?f, which also began in the 1970s. Dependent types are types that depend on values, and this simple mechanism establishes a correspondence between types and logical predicates, and between the values those types contain and proofs of the predicates. Dependent types thus have the power to express constraints far more fine grained than traditional type systems. For the programming-logic correspondence to work properly, the logic needs to be consistent: that is, not everything should be provable. To do this, general recursion needs to be banned, and only terminating forms of recursion and co-recursion can be permitted; this is precisely what the Agda language, in which the work of this paper is implemented, supports. Terminating recursion is straightforward: progress towards termination can be measured by the arguments shrinking. Co-recursion is less familiar, and is dual to recursion: instead of consuming arguments, co-recursion generates results, and for co-recursion to be acceptable, it needs to be "productive"?; that is, co-recursively defined objects, like infinite lists, need to be unfoldable indefinitely. There are several contributions of this paper. First, it gives an implementation of parser combinators in a total language, in a way that supports left recursion. For this to work, it is essential that certain parsers must not accept the empty string, and this constraint can be expressed in dependent types (it's a simple predicate, after all). More generally, it gives a tutorial example in combining recursion"?for grammatical structure"?and co-recursion"?for controlled recursive grammar definitions. The author concludes the paper with a persuasive argument that this approach gives a framework for understanding the behavior of lazy programming in languages like Haskell, where recursion and co-recursion overlap. This well-written paper contains a particularly useful survey of related work right at the start, helping to contextualize the contribution. It should be essential reading for anyone interested in parsing and advances in dependently typed functional programming. Online Computing Reviews Service

                    Access critical reviews of Computing literature here

                    Become a reviewer for Computing Reviews.

                    Comments

                    Login options

                    Check if you have access through your login credentials or your institution to get full access on this article.

                    Sign in
                    • Published in

                      cover image ACM Conferences
                      ICFP '10: Proceedings of the 15th ACM SIGPLAN international conference on Functional programming
                      September 2010
                      398 pages
                      ISBN:9781605587943
                      DOI:10.1145/1863543
                      • cover image ACM SIGPLAN Notices
                        ACM SIGPLAN Notices  Volume 45, Issue 9
                        ICFP '10
                        September 2010
                        382 pages
                        ISSN:0362-1340
                        EISSN:1558-1160
                        DOI:10.1145/1932681
                        Issue’s Table of Contents

                      Copyright © 2010 ACM

                      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                      Publisher

                      Association for Computing Machinery

                      New York, NY, United States

                      Publication History

                      • Published: 27 September 2010

                      Permissions

                      Request permissions about this article.

                      Request Permissions

                      Check for updates

                      Qualifiers

                      • research-article

                      Acceptance Rates

                      Overall Acceptance Rate333of1,064submissions,31%

                      Upcoming Conference

                      ICFP '24

                    PDF Format

                    View or Download as a PDF file.

                    PDF

                    eReader

                    View online with eReader.

                    eReader

                    ePub

                    View this article in ePub.

                    View ePub