DOI: 10.1145/1526709.1526886
poster

Detecting soft errors by redirection classification

Published: 20 April 2009

Abstract

A soft error redirection is a URL redirection to a page that returns the HTTP status code 200 (OK) but actually has no content relevant to the client's request. Because such redirections degrade the performance of web search engines in many ways, it is highly desirable to remove as many of them as possible. We propose a novel approach that detects soft error redirections by analyzing redirection logs collected during crawling. Experimental results on large-scale crawl data show that our measure classifies soft error redirections effectively.
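
The abstract does not spell out the classification measure, so the following is only a minimal sketch of the general idea of mining a crawler's redirection log, not the authors' method. It assumes a hypothetical log record (`Redirect`) and a stand-in heuristic: a target page that answers 200 (OK) and receives redirections from many distinct source URLs is flagged as a soft error candidate. The record fields, the function name `soft_error_candidates`, and the threshold `min_sources` are all illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class Redirect(NamedTuple):
    """One hypothetical entry of a crawler's redirection log."""
    source: str  # URL the crawler originally requested
    target: str  # URL the crawler ended up at after redirection
    status: int  # HTTP status code returned by the target page


def soft_error_candidates(log: Iterable[Redirect], min_sources: int = 3) -> Dict[str, int]:
    """Map each suspicious target URL to its number of distinct redirect sources.

    Assumed heuristic (not taken from the paper): a target that returns
    200 (OK) and absorbs redirections from at least `min_sources` distinct
    URLs is likely a generic page standing in for missing content.
    """
    sources_by_target: Dict[str, Set[str]] = defaultdict(set)
    for entry in log:
        # Only redirections that end in 200 (OK) can be *soft* errors.
        if entry.status == 200 and entry.source != entry.target:
            sources_by_target[entry.target].add(entry.source)
    return {
        target: len(sources)
        for target, sources in sources_by_target.items()
        if len(sources) >= min_sources
    }
```

A real classifier would need more signal than a fan-in count, since legitimate pages such as login forms or site moves also collect many redirections; the sketch only shows where a log-based measure would plug in.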

Cited By

  • (2022) Scalability Challenges in Web Search Engines. Online publication date: 10-Mar-2022
  • (2014) Moved but not gone. International Journal on Digital Libraries 14:1-2 (17-38). DOI: 10.1007/s00799-014-0108-0. Online publication date: 1-Apr-2014
  • (2013) Analysis and detection of Soft-404 pages. Third International Conference on Innovative Computing Technology (INTECH 2013) (217-226). DOI: 10.1109/INTECH.2013.6653695. Online publication date: Aug-2013

Published In

WWW '09: Proceedings of the 18th international conference on World wide web
April 2009
1280 pages
ISBN: 9781605584874
DOI: 10.1145/1526709

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. search engine
2. soft 404
3. spam
4. url redirection

Qualifiers

• Poster

Conference

WWW '09

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
