ABSTRACT
Searches over large hypertext document collections often return too many results for ambiguous queries. Query refinement is an interactive process of query modification that can be used to narrow the scope of search results. We propose a new method for automatically generating refinements, or related terms, for queries by mining the anchor text of a large hypertext document collection. We show that using anchor text as a basis for query refinement produces high-quality refinement suggestions that are perceived as significantly more useful than refinements derived from document content. Furthermore, our study suggests that anchor text refinements can also augment traditional query refinement algorithms based on query logs, since the two approaches typically differ in coverage and produce different refinements. Our results are based on experiments on an anchor text collection from a large corporate intranet.
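The abstract does not spell out the mining algorithm, so the following is only a minimal sketch of the general idea, under simple assumptions: an anchor text is treated as a candidate refinement when it contains every query term plus at least one additional term, and candidates are ranked by how often they occur in the collection. The function name `suggest_refinements`, the whitespace tokenization, and the frequency ranking are illustrative choices, not the paper's method.

```python
from collections import Counter


def suggest_refinements(query, anchor_texts, top_k=5):
    """Return up to top_k anchor phrases that narrow the query.

    Hypothetical helper: a phrase qualifies if it contains all query
    terms and at least one extra term (an assumption, not the paper's rule).
    """
    query_terms = set(query.lower().split())
    counts = Counter()
    for anchor in anchor_texts:
        terms = anchor.lower().split()
        term_set = set(terms)
        # Keep anchors that cover the query and add at least one new term.
        if query_terms <= term_set and len(term_set) > len(query_terms):
            counts[" ".join(terms)] += 1
    # Rank by frequency, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phrase for phrase, _ in ranked[:top_k]]


if __name__ == "__main__":
    anchors = [
        "Java tutorial", "java tutorial", "Java download",
        "java", "Python tutorial", "Sun Java download",
    ]
    print(suggest_refinements("java", anchors))
```

A real system would likely also normalize phrases, filter navigational anchors such as "click here", and weight candidates by more than raw frequency; those steps are left out here.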