PageRank without hyperlinks: structural re-ranking using links induced by language models
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Salvador, Brazil
SESSION: Theory 2
Pages: 306-313
Year of Publication: 2005
ISBN: 1-59593-034-5
Authors
Oren Kurland  Cornell University, Ithaca, NY and Carnegie Mellon University, Pittsburgh, PA
Lillian Lee  Carnegie Mellon University, Pittsburgh, PA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1076034.1076087

ABSTRACT

Inspired by the PageRank and HITS (hubs and authorities) algorithms for Web search, we propose a structural re-ranking approach to ad hoc information retrieval: we reorder the documents in an initially retrieved set by exploiting asymmetric relationships between them. Specifically, we consider generation links, which indicate that the language model induced from one document assigns high probability to the text of another; in doing so, we take care to prevent bias against long documents. We study a number of re-ranking criteria based on measures of centrality in the graphs formed by generation links, and show that integrating centrality into standard language-model-based retrieval is quite effective at improving precision at top ranks.
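
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the estimation details, so every concrete choice here is an assumption made for illustration. Those assumptions are Dirichlet-smoothed unigram language models, a per-token average log-probability as the generation score (one simple way to avoid penalising long documents), links from each document to its `k` best "generators", and PageRank with a damping factor of 0.85 as the centrality measure. The function names (`smoothed_lm`, `generation_score`, `generation_graph`, `pagerank`) and parameters (`mu`, `k`, `damping`, `iters`) are hypothetical. Documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter


def smoothed_lm(doc, coll_counts, coll_len, mu=1000.0):
    """Return w -> p(w | doc) with Dirichlet smoothing (an assumed choice)."""
    counts = Counter(doc)

    def prob(w):
        background = coll_counts[w] / coll_len
        return (counts[w] + mu * background) / (len(doc) + mu)

    return prob


def generation_score(text, lm):
    """Average per-token log-probability of `text` under `lm`.

    Averaging instead of summing keeps long texts from scoring
    lower simply because they contain more tokens.
    """
    return sum(math.log(lm(w)) for w in text) / len(text)


def generation_graph(docs, k=2, mu=1000.0):
    """Build edges[o] = {g: weight}: o links to its k best generators g."""
    coll_counts = Counter(w for d in docs for w in d)
    coll_len = sum(coll_counts.values())
    lms = [smoothed_lm(d, coll_counts, coll_len, mu) for d in docs]
    edges = []
    for o, text in enumerate(docs):
        scored = sorted(
            ((generation_score(text, lms[g]), g)
             for g in range(len(docs)) if g != o),
            reverse=True,
        )
        # Edge weight: exponentiated score, so all weights are positive.
        edges.append({g: math.exp(s) for s, g in scored[:k]})
    return edges


def pagerank(edges, damping=0.85, iters=100):
    """Power iteration over the weighted, directed generation graph."""
    n = len(edges)
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for o, out in enumerate(edges):
            total = sum(out.values())
            for g, w in out.items():
                new[g] += damping * rank[o] * w / total
        rank = new
    return rank
```

Calling `pagerank(generation_graph(docs))` on the tokenised documents of an initially retrieved list yields one centrality value per document. To follow the abstract's suggestion of integrating centrality into language-model-based retrieval, these values could be combined with the initial retrieval scores, for example by multiplication; that combination rule is likewise an assumption rather than something stated above.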


