ABSTRACT
The words a user is likely to write make up the user's active vocabulary. This vocabulary is considerably smaller than the passive vocabulary of words the user reads. We explore an interactive adaptive lexicon method that separates a large lexicon into active and passive sets, and gradually expands and adapts the active set to reflect the user's active vocabulary. The adaptation is achieved through lightweight interaction as a by-product of actual use. The effectiveness of the technique is demonstrated through a computational experiment and a user study.
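The active/passive split described above can be illustrated with a minimal sketch. The abstract does not specify the data structures or the promotion rule, so everything here is an assumption for illustration: the class name `AdaptiveLexicon`, the methods `candidates` and `confirm`, and the rule that a single user confirmation moves a word from the passive set to the active set are all hypothetical.

```python
class AdaptiveLexicon:
    """Hypothetical sketch: a large lexicon split into active and passive sets."""

    def __init__(self, full_lexicon, initial_active=()):
        # Words the user is assumed likely to write.
        self.active = set(initial_active)
        # Every other known word stays passive until the user adopts it.
        self.passive = set(full_lexicon) - self.active

    def candidates(self, matches):
        """Split recognizer matches into active hits and passive hits,
        preserving the recognizer's original ordering within each group."""
        active_hits = [w for w in matches if w in self.active]
        passive_hits = [w for w in matches if w in self.passive]
        return active_hits, passive_hits

    def confirm(self, word):
        """Record that the user chose `word`. Assumed rule: one confirmation
        promotes a passive word to the active set. Returns True if promoted."""
        if word in self.passive:
            self.passive.remove(word)
            self.active.add(word)
            return True
        return False


lexicon = AdaptiveLexicon(
    ["the", "them", "theme", "thermal"],
    initial_active=["the", "them"],
)
before = lexicon.candidates(["them", "theme"])
promoted = lexicon.confirm("theme")
after = lexicon.candidates(["them", "theme"])
```

In this sketch the active set grows only when the user actually selects a passive word, which mirrors the idea of adaptation as a by-product of use. A real recognizer would likely also weight or rank candidates rather than only partition them.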