DOI: 10.1145/1096601.1096643

Managing syntactic variation in text retrieval

Published: 02 November 2005

Abstract

Information Retrieval systems are limited by the linguistic variation of language. The use of Natural Language Processing techniques to manage this problem has been studied for a long time, but mainly focusing on English. In this paper we deal with European languages, taking Spanish as a case in point. Two different sources of syntactic information, queries and documents, are studied in order to increase the performance of Information Retrieval systems.

Cited By

  • (2008) Towards an Enhanced Vector Model to Encode Textual Relations: Experiments Retrieving Information. Artificial Intelligence in Theory and Practice II, pages 383-392. DOI: 10.1007/978-0-387-09695-7_37. Online publication date: 2008
  • (2007) A document engineering environment for clinical guidelines. Proceedings of the 2007 ACM symposium on Document engineering, pages 69-78. DOI: 10.1145/1284420.1284440. Online publication date: 28-Aug-2007

Published In

DocEng '05: Proceedings of the 2005 ACM symposium on Document engineering
November 2005
252 pages
ISBN:1595932402
DOI:10.1145/1096601

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. information retrieval
  2. natural language processing
  3. shallow parsing

Qualifiers

  • Article

Conference

DocEng05: ACM Symposium on Document Engineering
November 2 - 4, 2005
Bristol, United Kingdom

Acceptance Rates

Overall Acceptance Rate 194 of 564 submissions, 34%
