ABSTRACT
Unlike traditional information retrieval, Web information retrieval depends on both content and structure. In recent years, many relevance propagation techniques have been proposed that propagate content information between web pages through the web structure in order to improve web search performance. In this paper, we first propose a generic relevance propagation framework, and then present a comparative study of the effectiveness and efficiency of representative propagation models that can be derived from this framework. We draw several conclusions that are useful for selecting a propagation model in real-world search applications, including: 1) sitemap-based propagation models outperform hyperlink-based models in terms of both effectiveness and efficiency, and 2) sitemap-based term propagation is easier to integrate into real-world search engines because of its parallel offline implementation and acceptable complexity. More detailed results are also reported in the paper.
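The abstract does not state the framework's formulas, so the following is only an illustrative sketch of what one step of score propagation over a web structure might look like. It assumes each page mixes its own relevance score with the mean score of its structural neighbors (for example, its children in a sitemap). The function name `propagate_scores`, the mixing weight `alpha`, the averaging rule, and the sample scores are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List


def propagate_scores(
    scores: Dict[str, float],
    neighbors: Dict[str, List[str]],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Run one hypothetical propagation step.

    scores:    initial content-based relevance score per page.
    neighbors: pages whose scores flow into each page (assumed here to be
               sitemap children; hyperlink neighbors would work the same way).
    alpha:     assumed weight kept on the page's own score.
    """
    updated: Dict[str, float] = {}
    for page, own in scores.items():
        # Only neighbors that actually have a score can contribute.
        linked = [q for q in neighbors.get(page, []) if q in scores]
        if not linked:
            # Design choice in this sketch: isolated pages keep their score.
            updated[page] = own
            continue
        received = sum(scores[q] for q in linked) / len(linked)
        updated[page] = alpha * own + (1 - alpha) * received
    return updated


# Illustrative data only: a site root with two child pages.
initial = {"site/": 0.2, "site/a": 0.8, "site/b": 0.4}
sitemap_children = {"site/": ["site/a", "site/b"]}
result = propagate_scores(initial, sitemap_children, alpha=0.5)
```

In this toy example the site root receives part of its children's relevance while the leaf pages are left unchanged. Because each page's update reads only the initial scores, a step like this can be computed offline and in parallel across pages.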
Index Terms
- A study of relevance propagation for web search