Functional morphology

Published: 19 September 2004

Abstract

This paper presents a methodology for implementing natural language morphology in the functional language Haskell. The main idea is simple: instead of working with untyped regular expressions, which are the state of the art of morphology in computational linguistics, we use finite functions over hereditarily finite algebraic datatypes. The definitions of these datatypes and functions are the language-dependent part of the morphology. The language-independent part consists of an untyped dictionary format, which is used for synthesis of word forms, and a decorated trie, which is used for analysis.

Functional Morphology builds on ideas introduced by Huet in his computational linguistics toolkit Zen, which he has used to implement the morphology of Sanskrit. The goal has been to make it easy for linguists who are not trained as functional programmers to apply the ideas to new languages. As evidence of the productivity of the method, morphologies for Swedish, Italian, Russian, Spanish, and Latin have already been implemented using the library. The Latin morphology is used as a running example in this article.
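The approach described in the abstract can be illustrated with a minimal Haskell sketch. It is not the Functional Morphology library's actual API: the names `Case`, `Number`, `NounForm`, `decl1`, `nounTable`, `Trie`, `buildTrie`, and `analyse` are all hypothetical, chosen only for this example. The sketch assumes the Latin first declension (the noun "rosa"), ignores vowel length, and uses a plain trie whose nodes carry a list of analyses as a stand-in for the decorated trie mentioned above.

```haskell
import qualified Data.Map as Map

-- Inflection parameters as finite algebraic datatypes (hypothetical names).
data Case = Nominative | Vocative | Accusative | Genitive | Dative | Ablative
  deriving (Show, Eq, Ord, Enum, Bounded)

data Number = Singular | Plural
  deriving (Show, Eq, Ord, Enum, Bounded)

data NounForm = NounForm Number Case
  deriving (Show, Eq, Ord)

-- Every noun form, enumerated from the finite parameter types.
nounForms :: [NounForm]
nounForms = [NounForm n c | n <- [minBound .. maxBound], c <- [minBound .. maxBound]]

-- A noun is a finite function from forms to strings.
type Noun = NounForm -> String

-- First-declension paradigm; the argument is the nominative singular,
-- assumed non-empty (e.g. "rosa"). Vowel length is not marked.
decl1 :: String -> Noun
decl1 rosa (NounForm n c) = case n of
  Singular -> case c of
    Accusative -> rosa ++ "m"
    Genitive   -> rosae
    Dative     -> rosae
    _          -> rosa          -- nominative, vocative, ablative
  Plural -> case c of
    Nominative -> rosae
    Vocative   -> rosae
    Accusative -> rosa ++ "s"
    Genitive   -> rosa ++ "rum"
    _          -> ros ++ "is"   -- dative, ablative
  where
    rosae = rosa ++ "e"
    ros   = init rosa

-- Synthesis: tabulate the finite function into a full inflection table.
nounTable :: Noun -> [(NounForm, String)]
nounTable noun = [(f, noun f) | f <- nounForms]

-- A simple trie whose nodes are decorated with a list of values.
data Trie a = Trie [a] (Map.Map Char (Trie a))

emptyTrie :: Trie a
emptyTrie = Trie [] Map.empty

insertTrie :: String -> a -> Trie a -> Trie a
insertTrie []       x (Trie xs m) = Trie (x : xs) m
insertTrie (c : cs) x (Trie xs m) =
  Trie xs (Map.insert c (insertTrie cs x (Map.findWithDefault emptyTrie c m)) m)

lookupTrie :: String -> Trie a -> [a]
lookupTrie []       (Trie xs _) = xs
lookupTrie (c : cs) (Trie _ m)  = maybe [] (lookupTrie cs) (Map.lookup c m)

-- An analysis pairs a lemma with the form that produced the word.
type Analysis = (String, NounForm)

-- Analysis: index every generated word form by its spelling.
buildTrie :: [(String, Noun)] -> Trie Analysis
buildTrie lexicon = foldr add emptyTrie
  [(noun f, (lemma, f)) | (lemma, noun) <- lexicon, f <- nounForms]
  where
    add (w, a) = insertTrie w a

analyse :: Trie Analysis -> String -> [Analysis]
analyse t w = lookupTrie w t
```

Loaded in GHCi, `analyse (buildTrie [("rosa", decl1 "rosa")]) "rosam"` returns the single analysis `[("rosa",NounForm Singular Accusative)]`, while the ambiguous form "rosae" yields four analyses. The point of the design is that the paradigm is written once as a total function over a finite type, and both the synthesis table and the analysis index are derived from it mechanically.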

Published In

ACM SIGPLAN Notices, Volume 39, Issue 9
ICFP '04
September 2004
254 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1016848

ICFP '04: Proceedings of the ninth ACM SIGPLAN international conference on Functional programming
September 2004
264 pages
ISBN: 1581139055
DOI: 10.1145/1016850

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 September 2004
Published in SIGPLAN Volume 39, Issue 9

Author Tags

  1. embedded languages
  2. finite functions
  3. functional programming
  4. linguistics
  5. morphological description

Qualifiers

  • Article

Cited By

  • (2019) Improving Neural Machine Translation Using Rule-Based Machine Translation. 2019 7th International Conference on Smart Computing & Communications (ICSCC) (1-5). DOI: 10.1109/ICSCC.2019.8843685. Online publication date: Jun-2019
  • (2014) Morphological Processing of Semitic Languages. Natural Language Processing of Semitic Languages (43-66). DOI: 10.1007/978-3-642-45358-8_2. Online publication date: 25-Mar-2014
  • (2013) Issues in Arabic Computational Linguistics. The Oxford Handbook of Arabic Linguistics (213-240). DOI: 10.1093/oxfordhb/9780199764136.013.010_update_001. Online publication date: 16-Dec-2013
  • (2012) Natural Language Processing for Historical Texts. Synthesis Lectures on Human Language Technologies 5:2 (1-157). DOI: 10.2200/S00436ED1V01Y201207HLT017. Online publication date: 24-Sep-2012
  • (2020) Abstract Syntax as Interlingua. Computational Linguistics 46:2 (425-486). DOI: 10.1162/coli_a_00378. Online publication date: 1-Jun-2020
  • (2020) Abstract Syntax as Interlingua: Scaling Up the Grammatical Framework from Controlled Languages to Robust Pipelines. Computational Linguistics (1-92). DOI: 10.1162/COLI_a_00378. Online publication date: 23-Mar-2020
  • (2011) A Diachronic Computational Lexical Resource for 800 Years of Swedish. Language Technology for Cultural Heritage (41-61). DOI: 10.1007/978-3-642-20227-8_3. Online publication date: 26-Apr-2011
  • (2010) Introduction to Arabic Natural Language Processing. Synthesis Lectures on Human Language Technologies 3:1 (1-187). DOI: 10.2200/S00277ED1V01Y201008HLT010. Online publication date: Jan-2010
  • (2010) Investigating the Relationship Between Linguistic Representation and Computation through an Unsupervised Model of Human Morphology Learning. Research on Language and Computation 8:2-3 (209-238). DOI: 10.1007/s11168-011-9077-2. Online publication date: 1-Sep-2010
