DOI: 10.1145/1363686.1364054
research-article

Gcon: a graph-based technique for resolving ambiguity in query translation candidates

Published: 16 March 2008

Abstract

In the field of cross-language information retrieval (CLIR), the resolution of lexical ambiguity is a key challenge. Common mechanisms for translating query terms from one language to another typically produce a set of possible translation candidates rather than a single authoritative result. Correctly reducing a list of candidates to a single translation is an enduring problem. Thus far, solutions have concentrated on the use of term co-occurrence information to guide the resolution of translation-based ambiguity. In this paper we introduce a new disambiguation strategy which employs a graph-based analysis of generated co-occurrence data to determine the most appropriate translation for a given term.
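
The abstract does not say which graph analysis is used, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes translation candidates are graph nodes, co-occurrence scores are edge weights between candidates of different query terms, and weighted degree stands in for whatever centrality measure the paper applies. The function name `pick_translations`, the example terms, and all scores are hypothetical.

```python
from itertools import combinations


def pick_translations(candidates, cooccurrence):
    """Pick one translation per source term by weighted degree.

    candidates:   dict mapping each source term to its candidate translations.
    cooccurrence: dict mapping frozenset({a, b}) to a non-negative score.
    Both structures are assumptions made for this illustration.
    """
    # Every candidate is a node; its strength is the sum of its edge weights.
    strength = {c: 0.0 for cands in candidates.values() for c in cands}

    # Only connect candidates that belong to different source terms,
    # since candidates of the same term compete with each other.
    for (_, cands_a), (_, cands_b) in combinations(candidates.items(), 2):
        for a in cands_a:
            for b in cands_b:
                weight = cooccurrence.get(frozenset((a, b)), 0.0)
                strength[a] += weight
                strength[b] += weight

    # For each source term keep the candidate with the highest strength
    # (ties resolve to the candidate listed first).
    return {
        term: max(cands, key=lambda c: strength[c])
        for term, cands in candidates.items()
    }


# Made-up example data: two Spanish query terms and invented scores.
candidates = {
    "banco": ["bank", "bench"],
    "interes": ["interest", "concern"],
}
cooccurrence = {
    frozenset(("bank", "interest")): 0.9,
    frozenset(("bank", "concern")): 0.2,
    frozenset(("bench", "interest")): 0.1,
    frozenset(("bench", "concern")): 0.05,
}

chosen = pick_translations(candidates, cooccurrence)
```

With these invented scores, "bank" and "interest" reinforce each other and are selected. A link-analysis measure such as PageRank could replace the weighted-degree step without changing the surrounding structure.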

Cited By

  • (2015) A parallel cross-language retrieval system for patent documents. 2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS), 672-676. DOI: 10.1109/ICSESS.2015.7339147. Online publication date: Sep-2015
  • (2012) Translation techniques in cross-language information retrieval. ACM Computing Surveys 45:1, 1-44. DOI: 10.1145/2379776.2379777. Online publication date: 7-Dec-2012

Published In

SAC '08: Proceedings of the 2008 ACM symposium on Applied computing
March 2008
2586 pages
ISBN:9781595937537
DOI:10.1145/1363686
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. co-occurrence measure
  2. cross language information retrieval
  3. disambiguation
  4. graph analysis

Conference

SAC '08
SAC '08: The 2008 ACM Symposium on Applied Computing
March 16 - 20, 2008
Fortaleza, Ceara, Brazil

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
