Different indexing strategies for multilingual web retrieval: experiments with the EuroGOV corpus
Source: Conference on Hypertext and Hypermedia
Proceedings of the seventeenth conference on Hypertext and hypermedia
Odense, Denmark
Poster session
Pages: 169 - 170  
Year of Publication: 2006
ISBN: 1-59593-417-0
Authors
Niels Jensen  University of Hildesheim, Hildesheim, Germany
Thomas Mandl  University of Hildesheim, Hildesheim, Germany
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1149941.1149974

ABSTRACT

Experiments with a multi-lingual web collection are presented. The EuroGOV corpus is the first multi-lingual web corpus for retrieval evaluation. We show how indexes based on words and n-grams are developed for different document parts. Different indexes were based on the full document content, partial content, and the title. The best results were achieved for a title-only index based on words.
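
The following Python sketch illustrates the general idea of building separate word-based and character n-gram indexes for different document parts. It is not the authors' implementation: the function names, the sample documents, the field names "title" and "content", and the n-gram length of 4 are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def word_terms(text):
    """Return lowercased word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def ngram_terms(text, n=4):
    """Return overlapping character n-grams of each word.

    Words no longer than n characters are kept whole.
    The default n=4 is an assumption, not a value from the abstract.
    """
    terms = []
    for word in word_terms(text):
        if len(word) <= n:
            terms.append(word)
        else:
            terms.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return terms


def build_index(docs, field, tokenizer):
    """Build an inverted index: term -> set of ids of documents
    whose `field` contains that term."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in tokenizer(doc.get(field, "")):
            index[term].add(doc_id)
    return index


# Hypothetical two-document collection in two languages.
docs = {
    "d1": {"title": "Ministry of Finance",
           "content": "Budget reports and tax information"},
    "d2": {"title": "Finanzministerium",
           "content": "Informationen zu Steuern und Haushalt"},
}

# One index per combination of document part and term type.
title_words = build_index(docs, "title", word_terms)
title_ngrams = build_index(docs, "title", ngram_terms)
content_words = build_index(docs, "content", word_terms)
```

In this toy collection, the word index on titles keeps "finance" and "finanzministerium" as distinct terms, while the n-gram index lets both titles share fragments such as "fina" and "mini". Which variant retrieves better is an empirical question; the abstract reports that a title-only word index performed best in the authors' experiments.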


