ABSTRACT
In higher-level tasks such as clustering web search results or word sense disambiguation, it is useful to know all the distinct concepts that an ambiguous word can express; in clustering web search results, for instance, this knowledge helps determine the number of clusters. We propose an algorithm that generates a ranked list of the distinct concepts associated with an ambiguous word. Concepts that are more popular in terms of usage are ranked higher.
We evaluate the coverage of the concepts inferred by our algorithm on the results retrieved by querying a major search engine with the ambiguous word, and show a coverage of 85% for the top 30 documents, averaged over all keywords.
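The abstract does not define how coverage is computed. As a rough illustration only, the sketch below assumes coverage means the fraction of the top-k retrieved documents that match at least one inferred concept, averaged over keywords. The function names (`coverage_at_k`, `mean_coverage`, `substring_match`) and the substring matching rule are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical matcher type: decides whether a document expresses a concept.
Matcher = Callable[[str, str], bool]


def substring_match(document: str, concept: str) -> bool:
    """Assumed matching rule: case-insensitive substring test."""
    return concept.lower() in document.lower()


def coverage_at_k(documents: List[str], concepts: List[str],
                  matches: Matcher, k: int = 30) -> float:
    """Fraction of the top-k documents matched by at least one concept."""
    top = documents[:k]
    if not top:
        return 0.0
    covered = sum(1 for doc in top if any(matches(doc, c) for c in concepts))
    return covered / len(top)


def mean_coverage(results_by_keyword: Dict[str, List[str]],
                  concepts_by_keyword: Dict[str, List[str]],
                  matches: Matcher, k: int = 30) -> float:
    """Average coverage_at_k over all keywords (assumed aggregation)."""
    scores = [
        coverage_at_k(docs, concepts_by_keyword.get(keyword, []), matches, k)
        for keyword, docs in results_by_keyword.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

With k = 30 this mirrors the "top 30 documents averaged over all keywords" setting; a real evaluation would likely need a stronger matcher than substring comparison.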
Which "Apple" are you talking about?