ABSTRACT
A monadic parser combinator library which guarantees termination of parsing, while still allowing many forms of left recursion, is described. The library's interface is similar to those of many other parser combinator libraries, with two important differences: one is that the interface clearly specifies which parts of the constructed parsers may be infinite, and which parts have to be finite, using dependent types and a combination of induction and coinduction; and the other is that the parser type is unusually informative.
The library comes with a formal semantics, using which it is proved that the parser combinators are as expressive as possible. The implementation is supported by a machine-checked correctness proof.
Supplemental Material
- }}The Agda Team. The Agda Wiki. Available at http://wiki.portal.chalmers.se/agda/, 2010.Google Scholar
- }}Thorsten Altenkirch and Nils Anders Danielsson. Termination checking in the presence of nested inductive and coinductive types. Note supporting presentation given at the Workshop on Partiality and Recursion in Interactive Theorem Provers, Edinburgh, UK, 2010.Google Scholar
- }}Arthur Baars, S. Doaitse Swierstra, and Marcos Viera. Typed transformations of typed grammars: The left corner transform. In Preliminary Proceedings of the Ninth Workshop on Language Descriptions Tools and Applications, LDTA 2009, pages 18--33, 2009.Google Scholar
- }}Marcello Bonsangue, Jan Rutten, and Alexandra Silva. A Kleene theorem for polynomial coalgebras. In Foundations of Software Science and Computational Structures, 12th International Conference, FOSSACS 2009, volume 5504 of LNCS, pages 122--136, 2009. Google ScholarDigital Library
- }}Kasper Brink, Stefan Holdermans, and Andres Löh. Dependently typed grammars. In Mathematics of Program Construction, Tenth International Conference, MPC 2010, volume 6120 of LNCS, pages 58--79, 2010. Google ScholarDigital Library
- }}Janusz A. Brzozowski. Derivatives of regular expressions. Journal of the ACM, 11 (4): 481--494, 1964. Google ScholarDigital Library
- }}William H. Burge. Recursive Programming Techniques. Addison-Wesley, 1975.Google Scholar
- }}Magnus Carlsson and Thomas Hallgren. Fudgets - Purely Functional Processes with applications to Graphical User Interfaces. PhD thesis, Chalmers University of Technology and Göteborg University, 1998.Google Scholar
- }}Koen Claessen. Embedded Languages for Describing and Verifying Hardware. PhD thesis, Chalmers University of Technology, 2001.Google Scholar
- }}Koen Claessen. Parallel parsing processes. Journal of Functional Programming, 14: 741--757, 2004. Google ScholarDigital Library
- }}Koen Claessen and David Sands. Observable sharing for functional circuit description. In Advances in Computing Science - ASIAN'99, volume 1742 of LNCS, pages 62--73, 1999. Google ScholarDigital Library
- }}Thierry Coquand. Infinite objects in type theory. In Types for Proofs and Programs, International Workshop TYPES '93, volume 806 of LNCS, pages 62--78, 1994. Google ScholarDigital Library
- }}Nils Anders Danielsson. Beating the productivity checker using embedded languages. In Workshop on Partiality and Recursion in Interactive Theorem Provers, Edinburgh, UK, 2010.Google ScholarCross Ref
- }}Nils Anders Danielsson and Thorsten Altenkirch. Subtyping, declaratively: An exercise in mixed induction and coinduction. In Mathematics of Program Construction, Tenth International Conference, MPC 2010, volume 6120 of LNCS, pages 100--118, 2010. Google ScholarDigital Library
- }}Nils Anders Danielsson and Ulf Norell. Structurally recursive descent parsing. Unpublished note, 2008.Google Scholar
- }}Nils Anders Danielsson and Ulf Norell. Parsing mixfix operators. To appear in the proceedings of the 20th International Symposium on the Implementation and Application of Functional Languages (IFL 2008), 2009.Google Scholar
- }}Jon Fairbairn. Making form follow function: An exercise in functional programming style. Software: Practice and Experience, 17 (6): 379--386, 1987. Google ScholarDigital Library
- }}Jeroen Fokker. Functional parsers. In Advanced Functional Programming, volume 925 of LNCS, pages 1--23, 1995. Google ScholarDigital Library
- }}Richard A. Frost, Rahmatullah Hafiz, and Paul Callaghan. Parser combinators for ambiguous left-recursive grammars. In PADL 2008: Practical Aspects of Declarative Languages, volume 4902 of LNCS, pages 167--181, 2008. Google ScholarDigital Library
- }}Tatsuya Hagino. A Categorical Programming Language. PhD thesis, University of Edinburgh, 1987. Google ScholarDigital Library
- }}Peter Hancock, Dirk Pattinson, and Neil Ghani. Representations of stream processors using nested fixed points. Logical Methods in Computer Science, 5 (3:9), 2009.Google ScholarCross Ref
- }}R. John M. Hughes and S. Doaitse Swierstra. Polish parsers, step by step. In ICFP '03: Proceedings of the eighth ACM SIGPLAN international conference on Functional programming, pages 239--248, 2003. Google ScholarDigital Library
- }}Graham Hutton. Higher-order functions for parsing. Journal of Functional Programming, 2: 323--343, 1992.Google ScholarCross Ref
- }}Mark Johnson. Memoization in top-down parsing. Computational Linguistics, 21 (3): 405--417, 1995. Google ScholarDigital Library
- }}Oleg Kiselyov. Parsec-like parser combinator that handles left recursion? Message to the Haskell-Cafe mailing list, December 2009.Google Scholar
- }}Pieter Koopman and Rinus Plasmeijer. Efficient combinator parsers. In IFL'98: Implementation of Functional Languages, volume 1595 of LNCS, pages 120--136, 1999. Google ScholarDigital Library
- }}Adam Koprowski and Henri Binsztok. TRX: A formally verified parser interpreter. In Programming Languages and Systems, 19th European Symposium on Programming, ESOP 2010, volume 6012 of LNCS, pages 345--365, 2010. Google ScholarDigital Library
- }}Dexter Kozen. On Kleene algebras and closed semirings. In Mathematical Foundations of Computer Science 1990, volume 452 of LNCS, pages 26--47, 1990. Google ScholarDigital Library
- }}Daan Leijen and Erik Meijer. Parsec: Direct style monadic parser combinators for the real world. Technical Report UU-CS-2001-35, Department of Information and Computing Sciences, Utrecht University, 2001.Google Scholar
- }}Paul Lickman. Parsing with fixed points. Master's thesis, University of Cambridge, 1995.Google Scholar
- }}Peter Ljunglöf. Pure functional parsing; an advanced tutorial. Licentiate thesis, Department of Computing Science, Chalmers University of Technology and Göteborg University, 2002.Google Scholar
- }}Antoni W. Mazurkiewicz. A note on enumerable grammars. Information and Control, 14 (6): 555--558, 1969.Google ScholarCross Ref
- }}Conor McBride and James McKinna. Seeing and doing. Presentation (given by McBride) at the Workshop on Termination and Type Theory, Hindås, Sweden, 2002.Google Scholar
- }}Conor McBride and Ross Paterson. Applicative programming with effects. Journal of Functional Programming, 18: 1--13, 2008. Google ScholarDigital Library
- }}Erik Meijer. Calculating Compilers. PhD thesis, Nijmegen University, 1992.Google Scholar
- }}Paul Francis Mendler. Inductive Definition in Type Theory. PhD thesis, Cornell University, 1988.Google ScholarDigital Library
- }}Muad‘Dib. Strongly specified parser combinators. Post to the Muad‘Dib blog, 2009.Google Scholar
- }}Ulf Norell. Towards a practical programming language based on dependent type theory. PhD thesis, Chalmers University of Technology and Göteborg University, 2007.Google Scholar
- }}David Park. On the semantics of fair parallelism. In Abstract Software Specifications, volume 86 of LNCS, pages 504--526, 1980. Google ScholarDigital Library
- }}Frank Pfenning. Unification and anti-unification in the calculus of constructions. In Proceedings of the Sixth Annual IEEE Symposium on Logic in Computer Science, pages 74--85, 1991.Google ScholarCross Ref
- }}J. J. M. M. Rutten. Automata and coinduction (an exercise in coalgebra). In CONCUR'98, Concurrency Theory, 9th International Conference, volume 1466 of LNCS, pages 547--554, 1998. Google ScholarDigital Library
- }}Niklas Röjemo. Garbage collection, and memory efficiency, in lazy functional languages. PhD thesis, Chalmers University of Technology and University of Göteborg, 1995.Google Scholar
- }}Marvin Solomon. Theoretical Issues in the Implementation of Programming Languages. PhD thesis, Cornell University, 1977. Google ScholarDigital Library
- }}S. Doaitse Swierstra and Luc Duponcheel. Deterministic, error-correcting combinator parsers. In Advanced Functional Programming, volume 1129 of LNCS, pages 184--207, 1996. Google ScholarDigital Library
- }}Wouter Swierstra. A Hoare logic for the state monad. In Theorem Proving in Higher Order Logics, 22nd International Conference, TPHOLs 2009, volume 5674 of LNCS, pages 440--451, 2009. Google ScholarDigital Library
- }}Philip Wadler. How to replace failure by a list of successes; a method for exception handling, backtracking, and pattern matching in lazy functional languages. In Functional Programming Languages and Computer Architecture, volume 201 of LNCS, pages 113--128, 1985. Google ScholarDigital Library
- }}Philip Wadler, Walid Taha, and David MacQueen. How to add laziness to a strict language, without even being odd. In Proceedings of the 1998 ACM SIGPLAN Workshop on ML, 1998.Google Scholar
- }}Malcolm Wallace. Partial parsing: Combining choice with commitment. In IFL 2007: Implementation and Application of Functional Languages, volume 5083 of LNCS, pages 93--110, 2008. Google ScholarDigital Library
Index Terms
- Total parser combinators
Recommendations
Practical, general parser combinators
PEPM '16: Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program ManipulationParser combinators are a popular approach to parsing where context-free grammars are represented as executable code. However, conventional parser combinators do not support left recursion, and can have worst-case exponential runtime. These limitations ...
Total parser combinators
ICFP '10A monadic parser combinator library which guarantees termination of parsing, while still allowing many forms of left recursion, is described. The library's interface is similar to those of many other parser combinator libraries, with two important ...
A hierarchy of mendler style recursion combinators: taming inductive datatypes with negative occurrences
ICFP '11: Proceedings of the 16th ACM SIGPLAN international conference on Functional programmingThe Mendler style catamorphism (which corresponds to weak induction) always terminates even for negative inductive datatypes. The Mendler style histomorphism (which corresponds to strong induction) is known to terminate for positive inductive datatypes. ...
Comments