ABSTRACT
We introduce the notion of query substitution, that is, generating a new query to replace a user's original search query. Our technique uses modifications based on typical substitutions web searchers make to their queries. In this way the new query is strongly related to the original query, containing terms closely related to all of the original terms. This contrasts with query expansion through pseudo-relevance feedback, which is costly and can lead to query drift. This also contrasts with query relaxation through Boolean or TF-IDF retrieval, which reduces the specificity of the query. We define a scale for evaluating query substitution, and show that our method performs well at generating new queries related to the original queries. We build a model for selecting among candidates, using a number of features of the query-candidate pair and fitting the model to human judgments of the relevance of query suggestions. This further improves the quality of the candidates generated. Experiments show that our techniques significantly increase coverage and effectiveness in the setting of sponsored search.
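As a rough illustration of what "typical substitutions web searchers make" could look like in practice, the sketch below counts how often one query is immediately followed by another within a session and proposes the most frequent rewrites as substitution candidates. It is a toy under stated assumptions, not the method evaluated in the paper: the session data, the function names `mine_substitutions` and `suggest`, and the use of raw counts as the ranking signal are all hypothetical, and the learned candidate-selection model is not represented here.

```python
from collections import Counter, defaultdict


def mine_substitutions(sessions):
    """Count (query -> next query) rewrites over consecutive pairs in each session.

    `sessions` is a hypothetical input: a list of sessions, each a list of
    query strings in the order the user issued them.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for before, after in zip(session, session[1:]):
            if before != after:  # ignore exact repeats
                counts[before][after] += 1
    return counts


def suggest(query, counts, k=2):
    """Return up to k rewrites of `query`, most frequently observed first."""
    observed = counts.get(query, Counter())
    return [candidate for candidate, _ in observed.most_common(k)]


# Hypothetical toy log, invented for illustration only.
sessions = [
    ["cheap flights", "discount airfare"],
    ["cheap flights", "low cost flights"],
    ["cheap flights", "discount airfare"],
    ["used cars", "pre-owned cars"],
]

counts = mine_substitutions(sessions)
print(suggest("cheap flights", counts))  # ['discount airfare', 'low cost flights']
```

Ranking by raw frequency is the simplest possible choice; a system along the lines the abstract describes would instead score each query-candidate pair with a model fitted to human relevance judgments.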