HITS on the web: how does it compare?
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Amsterdam, The Netherlands
SESSION: Link analysis
Pages: 471 - 478
Year of Publication: 2007
ISBN: 978-1-59593-597-7
ABSTRACT
This paper describes a large-scale evaluation of the
effectiveness of HITS in comparison with other link-based ranking
algorithms, when used in combination with a state-of-the-art text
retrieval algorithm exploiting anchor text. We quantified their
effectiveness using three common performance measures: the mean
reciprocal rank, the mean average precision, and the normalized
discounted cumulative gain measurements. The evaluation is based on
two large data sets: a breadth-first search crawl of 463 million
web pages containing 17.6 billion hyperlinks and referencing 2.9
billion distinct URLs; and a set of 28,043 queries sampled from a
query log, each query having on average 2,383 results, about 17 of
which were labeled by judges. We found that HITS outperforms
PageRank, but is about as effective as web-page in-degree. The same
holds true when any of the link-based features are combined with
the text retrieval algorithm. Finally, we studied the relationship
between query specificity and the effectiveness of selected
features, and found that link-based features perform better for
general queries, whereas BM25F performs better for specific
queries.
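
The three performance measures named above can be illustrated with a minimal sketch. The abstract does not give the exact gain function, discount, or rank cutoff used in the evaluation, so the choices below (a gain of 2^label - 1, a log2 discount, a cutoff of k = 10, and treating unlabeled results as non-relevant) are assumptions made only for illustration, and all function names are hypothetical.

```python
import math


def reciprocal_rank(labels):
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant.

    `labels` holds graded relevance labels in ranked order; a label > 0
    counts as relevant (assumption: unlabeled results are given label 0).
    """
    for rank, label in enumerate(labels, start=1):
        if label > 0:
            return 1.0 / rank
    return 0.0


def average_precision(labels):
    """Average the precision values observed at each relevant result.

    Assumption: the ranked list contains every relevant document, so the
    sum is normalized by the number of relevant results found in the list.
    """
    relevant_seen = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label > 0:
            relevant_seen += 1
            precision_sum += relevant_seen / rank
    return precision_sum / relevant_seen if relevant_seen else 0.0


def ndcg(labels, k=10):
    """Normalized discounted cumulative gain at cutoff k (assumed gain and discount)."""

    def dcg(ranked):
        return sum(
            (2 ** label - 1) / math.log2(rank + 1)
            for rank, label in enumerate(ranked[:k], start=1)
        )

    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0


def mean(values):
    """Mean over per-query scores; yields MRR, MAP, or mean NDCG."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Toy example: two queries, each with a short ranked list of judged labels.
runs = [[0, 2, 0, 1], [3, 0, 0, 0]]
mrr = mean(reciprocal_rank(r) for r in runs)
map_score = mean(average_precision(r) for r in runs)
mean_ndcg = mean(ndcg(r) for r in runs)
```

Each function scores a single query's ranked list; averaging over all queries gives the mean reciprocal rank, mean average precision, and mean NDCG. With only about 17 judged results per query out of roughly 2,383, how unjudged results are handled strongly affects the scores, which is why that choice is flagged as an assumption here.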