skip to main content
10.1145/1458082.1458177acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

Learning latent semantic relations from clickthrough data for query suggestion

Authors Info & Claims
Published:26 October 2008Publication History

ABSTRACT

For a given query raised by a specific user, the Query Suggestion technique aims to recommend relevant queries which potentially suit the information needs of that user. Due to the complexity of the Web structure and the ambiguity of users' inputs, most of the suggestion algorithms suffer from the problem of poor recommendation accuracy. In this paper, aiming at providing semantically relevant queries for users, we develop a novel, effective and efficient two-level query suggestion model by mining clickthrough data, in the form of two bipartite graphs (user-query and query-URL bipartite graphs) extracted from the clickthrough data. Based on this, we first propose a joint matrix factorization method which utilizes two bipartite graphs to learn the low-rank query latent feature space, and then build a query similarity graph based on the features. After that, we design an online ranking algorithm to propagate similarities on the query similarity graph, and finally recommend latent semantically relevant queries to users. Experimental analysis on the clickthrough data of a commercial search engine shows the effectiveness and the efficiency of our method.

References

  1. E. Agichtein, E. Brill, and S. Dumais. Improving web search ranking by incorporating user behavior information. In SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 19--26, New York, NY, USA, 2006. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Baeza-Yates and A. Tiberi. Extracting semantic relations from query logs. In KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 76--85, New York, NY, USA, 2007. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. R. A. Baeza-Yates, C. A. Hurtado, and M. Mendoza. Query recommendation using query logs in search engines. In EDBT Workshops, pages 588--596, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. D. Beeferman and A. Berger. Agglomerative clustering of a search engine query log. In KDD '00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 407--416, New York, NY, USA, 2000. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373--1396, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. P. A. Chirita, C. S. Firan, and W. Nejdl. Personalized query expansion for the web. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 7--14, New York, NY, USA, 2007. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. H. Cui, J.-R. Wen, J.-Y. Nie, and W.-Y. Ma. Query expansion by mining user logs. IEEE Trans. Knowl. Data Eng., 15(4):829--839, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. G. Dupret and M. Mendoza. Automatic query recommendation using click-through data. In IFIP PPAI, pages 303--312, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  9. N. Eiron, K. S. McCurley, and J. A. Tomlin. Ranking the web frontier. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 309--318, New York, NY, USA, 2004. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. W. Gao, C. Niu, J.-Y. Nie, M. Zhou, J. Hu, K.-F. Wong, and H.-W.Hon. Cross-lingual query suggestion using query logs of different languages. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 463--470, New York, NY, USA, 2007. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. Gleich and L. Zhukov. Svd subspace projections for term suggestion ranking and clustering. In Technical Report of Yahoo! Research Labs, 2004.Google ScholarGoogle Scholar
  12. B. J. Jansen, A. Spink, J. Bateman, and T. Saracevic. Real life information retrieval: A study of user queries on the web. SIGIR Forum, 32(1):5--17, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. G. Jeh and J. Widom. SimRank: a measure of structural-context similarity. In KDD '02: Proceedings of the 8th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 538--543, New York, NY, USA, 2002. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Jeon, W. B. Croft, and J. H. Lee. Finding similar questions in large question and answer archives. In CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management, pages 84--90, New York, NY, USA, 2005. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. T. Joachims. Optimizing search engines using clickthrough data. In KDD '02: Proceedings of the 8th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 133--142, New York, NY, USA, 2002. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. T. Joachims and F. Radlinski. Search engines that learn from implicit feedback. Computer, 40(8):34--40, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. R. Jones, B. Rey, O. Madani, and W. Greiner. Generating query substitutions. In WWW '06: Proceedings of the 15th international conference on World Wide Web, pages 387--396, New York, NY, USA, 2006. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. R. I. Kondor and J. D. Lafferty. Diffusion kernels on graphs and other discrete input spaces. In ICML '02: Proceedings of the 19th International Conference on Machine Learning, pages 315--322, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. R. Kraft and J. Zien. Mining anchor text for query refinement. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 666--674, New York, NY, USA, 2004. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. D. Lafferty and G. Lebanon. Diffusion kernels on statistical manifolds. Journal of Machine Learning Research, 6:129--163, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. In Technical Report Paper SIDL-WP-1999-0120 (version of 11/11/1999), 1999.Google ScholarGoogle Scholar
  22. M. Pasca and B. V. Durme. What you seek is what you get: Extraction of class attributes from query logs. In IJCAI '07: Proceedings of International Joint Conferences on Artificial Intelligence, pages 2832--2837, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. G. Pass, A. Chowdhury, and C. Torgeson. A picture of search. In The First International Conference on Scalable Information Systems, Hong Kong, June, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. D. Shen, M. Qin, W. Chen, Q. Yang, and Z. Chen. Mining web query hierarchies from clickthrough data. In AAAI '07: Proceedings of the 22th Conference on Artificial Intelligence, pages 341--346, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. C. Silverstein, M. R. Henzinger, H. Marais, and M. Moricz. Analysis of a very large web search engine query log. SIGIR Forum, 33(1):6--12, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. J.-T. Sun, D. Shen, H.-J. Zeng, Q. Yang, Y. Lu, and Z. Chen. Web-page summarization using clickthrough data. In SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 194--201, New York, NY, USA, 2005. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. Theobald, R. Schenkel, and G. Weikum. Efficient and self-tuning incremental query expansion for top-k query processing. In SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 242--249, New York, NY, USA, 2005. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. B. Vélez, R. Weiss, M. A. Sheldon, and D. K. Gifford. Fast and effective query refinement. SIGIR Forum, 31(SI):6--15. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. X. Wang and C. Zhai. Learn from web search logs to organize search results. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 87--94, New York, NY, USA, 2007. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. J.-R. Wen, J.-Y. Nie, and H. Zhang. Query clustering using user logs. ACM Trans. Inf. Syst., 20(1):59--81, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. Xu and W. B. Croft. Query expansion using local and global document analysis. In SIGIR '96: Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval, pages 4--11, New York, NY, USA, 1996. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. H. Yang, I. King, and M. R. Lyu. DiffusionRank: a possible penicillin for web spamming. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 431--438, New York, NY, USA, 2007. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. D. Zhou, S. Zhu, K. Yu, X. Song, B. L. Tseng, H. Zha, and C. L.Giles. Learning multiple graphs for document recommendations. In WWW '08: Proceedings of the 17th international conference on World Wide Web, pages 141--150, New York, NY, USA, 2008. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. S. Zhu, K. Yu, Y. Chi, and Y. Gong. Combining content and link for classification using matrix factorization. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 487--494, New York, NY, USA, 2007. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Learning latent semantic relations from clickthrough data for query suggestion

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management
      October 2008
      1562 pages
      ISBN:9781595939913
      DOI:10.1145/1458082

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 26 October 2008

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate1,861of8,427submissions,22%

      Upcoming Conference

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader