DOI: 10.1145/1571941.1571982
research-article

Using anchor texts with their hyperlink structure for web search

Published: 19 July 2009

Abstract

Anchor texts are a good complement to page content; they have been used extensively in commercial search engines and have proven useful there. However, anchor texts have been assumed to be independent, whether or not they come from the same Web site. Intuitively, an anchor text from unrelated Web sites should be considered stronger evidence than one from the same site. This paper proposes two new methods that take into account the possible relationships between anchor texts. We consider two relationships: links from the same site and links from related sites. In both situations, the importance assigned to the anchor texts is discounted. Experimental results show that the two new models outperform the baseline model, which assumes independence between hyperlinks.
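
The discounting idea in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the abstract does not give the formulas, so the harmonic discount (the k-th link from one site counts 1/k), the use of the URL host name as the "site", and the names `site_of`, `discounted_anchor_weights`, and `related_group` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def site_of(url):
    """Approximate a Web site by the URL's host name (an assumption)."""
    return urlparse(url).netloc.lower()


def discounted_anchor_weights(inlinks, related_group=None):
    """Aggregate anchor-text term weights for one target page.

    inlinks: iterable of (source_url, anchor_text) pairs pointing at the page.
    related_group: optional dict mapping a site to a group label, so that
        related sites share one discount counter (hypothetical input).

    An independence baseline would add 1.0 per link. Here the k-th link
    from the same site (or related group) adds only 1/k, a hypothetical
    discount chosen for illustration, not taken from the paper.
    """
    related_group = related_group or {}
    links_seen = Counter()          # links counted so far per site or group
    weights = defaultdict(float)    # term -> accumulated evidence
    for source_url, anchor_text in inlinks:
        site = site_of(source_url)
        group = related_group.get(site, site)
        links_seen[group] += 1
        link_weight = 1.0 / links_seen[group]
        for term in anchor_text.lower().split():
            weights[term] += link_weight
    return dict(weights)


inlinks = [
    ("http://a.example/p1", "acm portal"),
    ("http://a.example/p2", "acm portal"),
    ("http://b.example/x", "acm"),
]
weights = discounted_anchor_weights(inlinks)
```

With these three links, two of them from the same host, "acm" accumulates 2.5 rather than the 3.0 an independence baseline would assign, and "portal" accumulates 1.5 rather than 2.0. Passing a `related_group` that maps both hosts to one label discounts the third link as well.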

    Published In

    SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
    July 2009
    896 pages
    ISBN:9781605584836
    DOI:10.1145/1571941
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. anchor text
    2. hyperlink structure
    3. web site relationship

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Cited By

    • (2023) Unsupervised Dense Retrieval Training with Web Anchors. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2476-2480. DOI: 10.1145/3539618.3592080. Online publication date: 19-Jul-2023
    • (2022) The Power of Anchor Text in the Neural Retrieval Era. Advances in Information Retrieval, pages 567-583. DOI: 10.1007/978-3-030-99736-6_38. Online publication date: 5-Apr-2022
    • (2021) Pre-training for Ad-hoc Retrieval. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 1212-1221. DOI: 10.1145/3459637.3482286. Online publication date: 26-Oct-2021
    • (2021) Extended User Preference Based Weighted Page Ranking Algorithm. 2021 National Computing Colleges Conference (NCCC), pages 1-6. DOI: 10.1109/NCCC49330.2021.9428844. Online publication date: 27-Mar-2021
    • (2020) Selective Weak Supervision for Neural Information Retrieval. Proceedings of The Web Conference 2020, pages 474-485. DOI: 10.1145/3366423.3380131. Online publication date: 20-Apr-2020
    • (2019) LinkLive. World Wide Web 22:4, pages 1699-1725. DOI: 10.1007/s11280-018-0621-y. Online publication date: 1-Jul-2019
    • (2018) Quantifying retrieval bias in Web archive search. International Journal on Digital Libraries 19:1, pages 57-75. DOI: 10.1007/s00799-017-0215-9. Online publication date: 1-Mar-2018
    • (2016) Comparing Topic Coverage in Breadth-First and Depth-First Crawls Using Anchor Texts. Research and Advanced Technology for Digital Libraries, pages 133-146. DOI: 10.1007/978-3-319-43997-6_11. Online publication date: 10-Aug-2016
    • (2015) Ranking algorithm for book reviews with user tendency and collective intelligence. Multimedia Tools and Applications 74:16, pages 6209-6227. DOI: 10.1007/s11042-014-2101-4. Online publication date: 1-Aug-2015
    • (2015) Lost but not forgotten: finding pages on the unarchived web. International Journal on Digital Libraries 16:3-4, pages 247-265. DOI: 10.1007/s00799-015-0153-3. Online publication date: 3-Jun-2015
