ABSTRACT
Reduplicated Multiword Expressions (RMWEs) are identified using a Support Vector Machine (SVM), and the identified RMWEs are then used as a feature for SVM-based POS tagging of Manipuri, a highly agglutinative Indian Scheduled Language. A common feature set is tried for both RMWE identification and POS tagging. RMWE identification using SVM achieves a Recall of 86.11%, Precision of 92.08% and F-measure of 88.99%. With the identified RMWEs used as a feature, the SVM-based POS tagger achieves a Recall of 71.15%, Precision of 83.15% and F-measure of 76.68%.
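The abstract does not say which F-measure is reported. As a minimal sketch, assuming it is the balanced F-measure (the harmonic mean of Precision and Recall), the reported figures can be checked as follows; the function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (harmonic mean) of precision and recall.

    Assumption: the paper reports the balanced F-measure; the abstract
    does not state the weighting explicitly.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, in percent.
rmwe_f = f_measure(precision=92.08, recall=86.11)  # RMWE identification
pos_f = f_measure(precision=83.15, recall=71.15)   # POS tagging with the RMWE feature

print(f"RMWE identification F-measure: {rmwe_f:.2f}%")
print(f"POS tagging F-measure: {pos_f:.2f}%")
```

Under this assumption both computed values agree with the reported 88.99% and 76.68% to two decimal places.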
Index Terms
- SVM based Manipuri POS tagging using SVM based identified reduplicated MWE (RMWE)