ABSTRACT
Inspired by the PageRank and HITS (hubs and authorities) algorithms for Web search, we propose a structural re-ranking approach to ad hoc information retrieval: we reorder the documents in an initially retrieved set by exploiting asymmetric relationships between them. Specifically, we consider generation links, which indicate that the language model induced from one document assigns high probability to the text of another; in doing so, we take care to prevent bias against long documents. We study a number of re-ranking criteria based on measures of centrality in the graphs formed by generation links, and show that integrating centrality into standard language-model-based retrieval is quite effective at improving precision at top ranks.
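As a rough illustration of the idea described above, the following Python sketch builds a small graph of generation links and computes a PageRank-style centrality score over it. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`unigram_lm`, `generation_graph`, `pagerank`, `rerank`), the Dirichlet smoothing parameter `mu`, the `top_k` cutoff, the damping factor, and the use of a per-token average log-likelihood to reduce length bias are all illustrative choices that the abstract does not specify. Multiplying centrality by the initial retrieval score is likewise only one assumed way of integrating the two.

```python
import math
from collections import Counter


def unigram_lm(tokens, coll_counts, coll_len, mu=1000.0):
    """Dirichlet-smoothed unigram language model for one document (assumed smoothing)."""
    counts = Counter(tokens)
    n = len(tokens)

    def prob(term):
        p_coll = coll_counts.get(term, 0) / coll_len
        return (counts.get(term, 0) + mu * p_coll) / (n + mu)

    return prob


def generation_graph(docs, top_k=2, mu=1000.0):
    """weights[o][g] > 0 means document o links to g, one of its top_k generators.

    Assumption: the link weight is the geometric mean of per-token probabilities
    (exp of the average log-likelihood), so that long documents are not
    penalized merely for containing more tokens. Documents must be non-empty.
    """
    coll = Counter(t for d in docs for t in d)
    coll_len = sum(coll.values())
    lms = [unigram_lm(d, coll, coll_len, mu) for d in docs]
    n = len(docs)
    weights = [[0.0] * n for _ in range(n)]
    for o, text in enumerate(docs):
        scored = []
        for g in range(n):
            if g == o:
                continue  # no self-links
            avg_ll = sum(math.log(lms[g](t)) for t in text) / len(text)
            scored.append((math.exp(avg_ll), g))
        scored.sort(reverse=True)
        for w, g in scored[:top_k]:
            weights[o][g] = w
    return weights


def pagerank(weights, damping=0.85, iters=100):
    """Power iteration for a PageRank-style centrality on a weighted graph."""
    n = len(weights)
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for o in range(n):
            out = sum(weights[o])
            for g in range(n):
                # Nodes without outgoing links spread their mass uniformly.
                share = weights[o][g] / out if out > 0 else 1.0 / n
                new[g] += damping * scores[o] * share
        scores = new
    return scores


def rerank(initial_scores, centrality):
    """Order document indices by centrality times initial score (assumed combination)."""
    combined = [c * s for c, s in zip(centrality, initial_scores)]
    return sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
```

In this sketch, `docs` is a list of tokenized documents from the initially retrieved set, and `initial_scores` holds whatever scores the first-pass retrieval produced. A document that few others link to as a likely generator receives only the baseline share of centrality and tends to drop in the re-ranked list.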