DOI: 10.1145/2701336.2701636
research-article

Overview of the FIRE 2013 Track on Transliterated Search

Published: 04 December 2013

ABSTRACT

In this paper, we provide an overview of the FIRE 2013 track on transliterated search and describe the datasets released as part of the track. This was the first year the track was organized. We proposed two subtasks as part of the challenge. In the first subtask, offered for Hindi, Bangla, and Gujarati, participants had to devise an algorithm to label the language of each word in a sentence. Additionally, whenever a non-English word was identified, the algorithm had to provide its transliteration in the native script. The second subtask was retrieval-based: mixed-script documents had to be retrieved and ranked by relevance in response to ad hoc queries. The queries in our dataset were Bollywood Hindi song lyrics in Roman script. We received a total of 25 run submissions from five teams (three from India and two from abroad). Conducting this track helped us raise awareness of the importance of transliteration in the context of Indian languages. Results show that there is considerable scope for improving transliteration accuracy for the studied languages.


              Published in

              FIRE '12 & '13: Proceedings of the 4th and 5th Annual Meetings of the Forum for Information Retrieval Evaluation
              December 2013
              105 pages
              ISBN: 9781450328302
              DOI: 10.1145/2701336
              Editors: Prasenjit Majumder, Mandar Mitra, Madhulika Agrawal, Parth Mehta

              Copyright © 2013 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Qualifiers

              • research-article
              • Research
              • Refereed limited

              Acceptance Rates

              Overall Acceptance Rate: 19 of 64 submissions, 30%
