Ranking specialization for web search: a divide-and-conquer approach by using topical RankSVM

Authors:

Hongyuan Zha

WWW '10: Proceedings of the 19th international conference on World wide web

Pages 131 - 140

https://doi.org/10.1145/1772690.1772705

Published: 26 April 2010

Abstract

Many ranking algorithms that apply machine learning techniques have been proposed in information retrieval and Web search. However, most existing approaches do not explicitly take into account the fact that queries vary significantly in terms of ranking and call for different treatment by the ranking models. In this paper, we apply a divide-and-conquer framework for ranking specialization, i.e., learning multiple ranking models that address query differences. We first generate query representations by aggregating ranking features through pseudo feedback, and employ unsupervised clustering methods to identify a set of ranking-sensitive query topics from the training queries. To learn multiple ranking models for the respective ranking-sensitive query topics, we define a global loss function that combines the ranking risks of all query topics, and we propose a unified SVM-based learning process to minimize the global loss. Moreover, we employ an ensemble approach to generate the ranking result for each test query by applying the ranking models of the most appropriate query topics. We conduct experiments using a benchmark dataset for learning ranking functions as well as a dataset from a commercial search engine. Experimental results show that our proposed approach can significantly improve ranking performance over existing single-model approaches as well as straightforward local ranking approaches, and that the automatically identified ranking-sensitive topics are more useful for enhancing ranking performance than pre-defined query categorization.
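The test-time part of the pipeline described in the abstract (query representation from pseudo feedback, topic membership, ensemble of topical models) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the aggregation function, the clustering method, or the ensemble weighting, so the mean over the top-k documents, the softmax over negative distance to centroids, and all function and parameter names below are assumptions made for illustration. Training of the topical models under the global loss is not covered, so the centroids and linear weights are taken as given.

```python
from math import dist, exp
from statistics import fmean


def query_representation(doc_features, reference_scores, top_k=2):
    """Average the feature vectors of the top_k documents under a reference
    ranking. Using the mean as the aggregate is an assumption."""
    order = sorted(range(len(doc_features)),
                   key=lambda i: reference_scores[i], reverse=True)
    top = [doc_features[i] for i in order[:top_k]]
    return [fmean(column) for column in zip(*top)]


def topic_weights(query_vec, centroids):
    """Soft topic membership: softmax over negative Euclidean distance
    from the query vector to each topic centroid (an assumed weighting)."""
    raw = [exp(-dist(query_vec, c)) for c in centroids]
    total = sum(raw)
    return [r / total for r in raw]


def ensemble_scores(doc_features, weights, topic_models):
    """Score each document as the membership-weighted sum of the scores
    given by one linear model per topic."""
    scores = []
    for x in doc_features:
        per_topic = [sum(w_j * x_j for w_j, x_j in zip(model, x))
                     for model in topic_models]
        scores.append(sum(w * s for w, s in zip(weights, per_topic)))
    return scores


def rank(doc_features, reference_scores, centroids, topic_models, top_k=2):
    """Return document indices ordered from best to worst ensemble score."""
    q = query_representation(doc_features, reference_scores, top_k)
    weights = topic_weights(q, centroids)
    scores = ensemble_scores(doc_features, weights, topic_models)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy data (hypothetical): three documents with two ranking features each,
# two topics, and one linear model per topic.
docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
reference = [0.9, 0.1, 0.8]            # e.g. scores from a baseline ranker
centroids = [[1.0, 0.0], [0.0, 1.0]]   # assumed to come from clustering
models = [[1.0, 0.0], [0.0, 1.0]]      # assumed to come from training
print(rank(docs, reference, centroids, models))
```

In this toy setup the query representation lies close to the first centroid, so the first topic's model dominates the ensemble score. A hard assignment to the single nearest topic would be a simpler alternative to the soft weighting assumed here.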

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
  2. Information systems applications

Published In

WWW '10: Proceedings of the 19th international conference on World wide web

April 2010

1407 pages

ISBN:9781605587998

DOI:10.1145/1772690

General Chairs: Michael Rappa (North Carolina State University, USA), Paul Jones (University of North Carolina at Chapel Hill, USA)

Program Chairs: Juliana Freire (University of Utah, USA), Soumen Chakrabarti (Indian Institute of Technology, India)

Copyright © 2010 International World Wide Web Conference Committee (IW3C2).

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WWW '10

WWW '10: The 19th International World Wide Web Conference

April 26 - 30, 2010

Raleigh, North Carolina, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
