ABSTRACT
For a given query raised by a specific user, the Query Suggestion technique aims to recommend relevant queries which potentially suit the information needs of that user. Due to the complexity of the Web structure and the ambiguity of users' inputs, most of the suggestion algorithms suffer from the problem of poor recommendation accuracy. In this paper, aiming at providing semantically relevant queries for users, we develop a novel, effective and efficient two-level query suggestion model by mining clickthrough data, in the form of two bipartite graphs (user-query and query-URL bipartite graphs) extracted from the clickthrough data. Based on this, we first propose a joint matrix factorization method which utilizes two bipartite graphs to learn the low-rank query latent feature space, and then build a query similarity graph based on the features. After that, we design an online ranking algorithm to propagate similarities on the query similarity graph, and finally recommend latent semantically relevant queries to users. Experimental analysis on the clickthrough data of a commercial search engine shows the effectiveness and the efficiency of our method.
- E. Agichtein, E. Brill, and S. Dumais. Improving web search ranking by incorporating user behavior information. In SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 19--26, New York, NY, USA, 2006. ACM. Google ScholarDigital Library
- R. Baeza-Yates and A. Tiberi. Extracting semantic relations from query logs. In KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 76--85, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- R. A. Baeza-Yates, C. A. Hurtado, and M. Mendoza. Query recommendation using query logs in search engines. In EDBT Workshops, pages 588--596, 2004. Google ScholarDigital Library
- D. Beeferman and A. Berger. Agglomerative clustering of a search engine query log. In KDD '00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 407--416, New York, NY, USA, 2000. ACM. Google ScholarDigital Library
- M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373--1396, 2003. Google ScholarDigital Library
- P. A. Chirita, C. S. Firan, and W. Nejdl. Personalized query expansion for the web. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 7--14, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- H. Cui, J.-R. Wen, J.-Y. Nie, and W.-Y. Ma. Query expansion by mining user logs. IEEE Trans. Knowl. Data Eng., 15(4):829--839, 2003. Google ScholarDigital Library
- G. Dupret and M. Mendoza. Automatic query recommendation using click-through data. In IFIP PPAI, pages 303--312, 2006.Google ScholarCross Ref
- N. Eiron, K. S. McCurley, and J. A. Tomlin. Ranking the web frontier. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 309--318, New York, NY, USA, 2004. ACM. Google ScholarDigital Library
- W. Gao, C. Niu, J.-Y. Nie, M. Zhou, J. Hu, K.-F. Wong, and H.-W.Hon. Cross-lingual query suggestion using query logs of different languages. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 463--470, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- D. Gleich and L. Zhukov. Svd subspace projections for term suggestion ranking and clustering. In Technical Report of Yahoo! Research Labs, 2004.Google Scholar
- B. J. Jansen, A. Spink, J. Bateman, and T. Saracevic. Real life information retrieval: A study of user queries on the web. SIGIR Forum, 32(1):5--17, 1998. Google ScholarDigital Library
- G. Jeh and J. Widom. SimRank: a measure of structural-context similarity. In KDD '02: Proceedings of the 8th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 538--543, New York, NY, USA, 2002. ACM. Google ScholarDigital Library
- J. Jeon, W. B. Croft, and J. H. Lee. Finding similar questions in large question and answer archives. In CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management, pages 84--90, New York, NY, USA, 2005. ACM. Google ScholarDigital Library
- T. Joachims. Optimizing search engines using clickthrough data. In KDD '02: Proceedings of the 8th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 133--142, New York, NY, USA, 2002. ACM. Google ScholarDigital Library
- T. Joachims and F. Radlinski. Search engines that learn from implicit feedback. Computer, 40(8):34--40, 2007. Google ScholarDigital Library
- R. Jones, B. Rey, O. Madani, and W. Greiner. Generating query substitutions. In WWW '06: Proceedings of the 15th international conference on World Wide Web, pages 387--396, New York, NY, USA, 2006. ACM. Google ScholarDigital Library
- R. I. Kondor and J. D. Lafferty. Diffusion kernels on graphs and other discrete input spaces. In ICML '02: Proceedings of the 19th International Conference on Machine Learning, pages 315--322, 2002. Google ScholarDigital Library
- R. Kraft and J. Zien. Mining anchor text for query refinement. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 666--674, New York, NY, USA, 2004. ACM. Google ScholarDigital Library
- J. D. Lafferty and G. Lebanon. Diffusion kernels on statistical manifolds. Journal of Machine Learning Research, 6:129--163, 2005. Google ScholarDigital Library
- L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. In Technical Report Paper SIDL-WP-1999-0120 (version of 11/11/1999), 1999.Google Scholar
- M. Pasca and B. V. Durme. What you seek is what you get: Extraction of class attributes from query logs. In IJCAI '07: Proceedings of International Joint Conferences on Artificial Intelligence, pages 2832--2837, 2007. Google ScholarDigital Library
- G. Pass, A. Chowdhury, and C. Torgeson. A picture of search. In The First International Conference on Scalable Information Systems, Hong Kong, June, 2006. Google ScholarDigital Library
- D. Shen, M. Qin, W. Chen, Q. Yang, and Z. Chen. Mining web query hierarchies from clickthrough data. In AAAI '07: Proceedings of the 22th Conference on Artificial Intelligence, pages 341--346, 2007. Google ScholarDigital Library
- C. Silverstein, M. R. Henzinger, H. Marais, and M. Moricz. Analysis of a very large web search engine query log. SIGIR Forum, 33(1):6--12, 1999. Google ScholarDigital Library
- J.-T. Sun, D. Shen, H.-J. Zeng, Q. Yang, Y. Lu, and Z. Chen. Web-page summarization using clickthrough data. In SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 194--201, New York, NY, USA, 2005. ACM. Google ScholarDigital Library
- M. Theobald, R. Schenkel, and G. Weikum. Efficient and self-tuning incremental query expansion for top-k query processing. In SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 242--249, New York, NY, USA, 2005. ACM. Google ScholarDigital Library
- B. Vélez, R. Weiss, M. A. Sheldon, and D. K. Gifford. Fast and effective query refinement. SIGIR Forum, 31(SI):6--15. Google ScholarDigital Library
- X. Wang and C. Zhai. Learn from web search logs to organize search results. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 87--94, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- J.-R. Wen, J.-Y. Nie, and H. Zhang. Query clustering using user logs. ACM Trans. Inf. Syst., 20(1):59--81, 2002. Google ScholarDigital Library
- J. Xu and W. B. Croft. Query expansion using local and global document analysis. In SIGIR '96: Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval, pages 4--11, New York, NY, USA, 1996. ACM. Google ScholarDigital Library
- H. Yang, I. King, and M. R. Lyu. DiffusionRank: a possible penicillin for web spamming. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 431--438, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- D. Zhou, S. Zhu, K. Yu, X. Song, B. L. Tseng, H. Zha, and C. L.Giles. Learning multiple graphs for document recommendations. In WWW '08: Proceedings of the 17th international conference on World Wide Web, pages 141--150, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
- S. Zhu, K. Yu, Y. Chi, and Y. Gong. Combining content and link for classification using matrix factorization. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 487--494, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
Index Terms
- Learning latent semantic relations from clickthrough data for query suggestion
Recommendations
Visual query suggestion
MM '09: Proceedings of the 17th ACM international conference on MultimediaQuery suggestion is an effective approach to improve the usability of image search. Most existing search engines are able to automatically suggest a list of textual query terms based on users' current query input, which can be called Textual Query ...
Mining Web search engines for query suggestion
Queries to Web search engines are usually short and ambiguous, which provides insufficient information needs of users for effectively retrieving relevant Web pages. To address this problem, query suggestion is implemented by most search engines. However,...
When do people use query suggestion? A query suggestion log analysis
AbstractQuery suggestion, which enables the user to revise a query with a single click, has become one of the most fundamental features of Web search engines. However, it has not been clear what circumstances cause the user to turn to query suggestion. In ...
Comments