ABSTRACT
User interactions with search engines reveal three main underlying intents: navigational, informational, and transactional. Search engines can perform considerably better by tailoring their results to these query intents, so query classification has been an active research topic in recent years. However, while query topic classification has been given a dedicated bakeoff, no evaluation campaign has been devoted to automatic query intent detection. This paper reviews some of the available query intent detection techniques, proposes an evaluation framework, and uses it to compare those methods in order to shed light on their relative performance and drawbacks. As will be shown, manually prepared gold-standard files are much needed, and traditional pooling is not the most feasible evaluation method. Finally, future lines of work in both query intent detection and its evaluation are proposed.
Index Terms
- Survey and evaluation of query intent detection methods