ABSTRACT
This paper addresses the use of ontologies for Information Retrieval. The proposed approach identifies important concepts in documents using two criteria, co-occurrence and semantic relatedness, and then disambiguates them via an external general-purpose ontology, namely WordNet. Matching the ontology against a document yields a set of scored concept-senses (nodes) connected by weighted links. This representation, called the semantic core of a document, best reveals the document's semantic content. We regard our approach, whose first evaluation results are encouraging, as a short but strong step toward the long-term goal of Intelligent Indexing and Semantic Retrieval.