ABSTRACT
Query ambiguity is a useful metric that helps search engines understand users' intents. Existing methods quantify query ambiguity by computing the entropy of clicks. These methods represent each click as a one-hot vector over a set of mutually exclusive groups. Consequently, they cannot incorporate non-obvious structure such as similarity among documents. In this paper, we propose a new approach that quantifies query ambiguity using topic distributions. We show that it is a natural extension of an existing entropy-based method. Further, we use our approach to derive topic-based extensions of the major existing entropy-based methods. In an evaluation using e-commerce search logs combined with human judgments, our approach successfully extended existing entropy-based methods and improved the quality of query ambiguity measurements.
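To make the contrast concrete, the following is a minimal sketch of one way the idea could be realized; it is not the paper's formulation. It assumes that a query's topic-based ambiguity is the entropy of the click-averaged topic distribution of its clicked documents. The function names (`click_entropy`, `topic_entropy`) and the toy topic vectors are hypothetical and chosen only for illustration. Under this assumption, giving every document its own one-hot topic vector recovers ordinary click entropy, while documents that share a topic contribute less ambiguity.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy in bits of a probability vector; zero entries are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def click_entropy(clicked_docs):
    """Entropy of the empirical click distribution over documents for one query."""
    counts = Counter(clicked_docs)
    total = sum(counts.values())
    return entropy([c / total for c in counts.values()])


def topic_entropy(clicked_docs, doc_topics):
    """Entropy of the click-averaged topic distribution (illustrative assumption).

    doc_topics maps each document id to a topic distribution that sums to 1.
    """
    n_topics = len(next(iter(doc_topics.values())))
    mixture = [0.0] * n_topics
    for doc in clicked_docs:
        for k, weight in enumerate(doc_topics[doc]):
            mixture[k] += weight / len(clicked_docs)
    return entropy(mixture)


# Toy click log for a single query: document "a" is clicked twice.
clicks = ["a", "a", "b", "c"]

# One-hot "topics": every document forms its own mutually exclusive group.
one_hot = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.0, 0.0, 1.0]}

# Hypothetical topic vectors in which "a" and "b" are about the same topic.
similar = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}

print(click_entropy(clicks))
print(topic_entropy(clicks, one_hot))
print(topic_entropy(clicks, similar))
```

In this toy setting the first two values coincide, and the third is smaller because clicks on "a" and "b" no longer count as distinct intents. In practice the topic vectors would come from a topic model fitted to the document collection rather than being written by hand.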