DOI: 10.1145/1135777.1136006
Article

Focused crawling: experiences in a real world project

Published: 23 May 2006

Abstract

No abstract available.

References

[1] C. Chung and C. L. A. Clarke. Topic-oriented collaborative crawling. In Proceedings of the 2002 ACM CIKM, pages 34--42.
[2] S. Chakrabarti, M. van den Berg, and B. E. Dom. Focused Crawling: A New Approach to Topic-specific Web Resource Discovery. Computer Networks, 31(11--16), 1999, pages 1623--1640.
[3] S. Chakrabarti, M. Joshi, and V. Tawde. Enhanced Topic Distillation using Text, Markup Tags and Hyperlinks. In Proceedings of the ACM SIGIR, 2001.
[4] M. Diligenti, F. Coetzee, S. Lawrence, C. L. Giles, and M. Gori. Focused Crawling Using Context Graphs. In Proceedings of VLDB 2000, pages 527--534.
[5] M. Ehrig and A. Maedche. Ontology-Focused Crawling of Web Documents. In Proceedings of the ACM Symposium on Applied Computing, 2003.
[6] J. Hou and Y. Zhang. Effectively Finding Relevant Web Pages from Linkage Information. IEEE TKDE, 15(4), July/August 2003, pages 940--951.

    Published In

    WWW '06: Proceedings of the 15th international conference on World Wide Web
    May 2006
    1102 pages
    ISBN:1595933239
    DOI:10.1145/1135777

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. crawling
    2. information retrieval
    3. thesaurus
    4. topic

    Qualifiers

    • Article

    Conference

    WWW06
    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Cited By

    • (2019) Stream-based live public opinion monitoring approach with adaptive probabilistic topic model. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 23:16 (7451-7470). DOI: 10.1007/s00500-018-3391-7. Online publication date: 1-Aug-2019
    • (2013) A user-oriented web crawler for selectively acquiring online content in e-health research. Bioinformatics, 30:1 (104-114). DOI: 10.1093/bioinformatics/btt571. Online publication date: 29-Sep-2013
    • (2012) A hybrid method for improving the SQD-PageRank algorithm. Second International Conference on the Innovative Computing Technology (INTECH 2012) (231-238). DOI: 10.1109/INTECH.2012.6457747. Online publication date: Sep-2012
    • (2010) Language specific crawling based on web pages features. 2010 International Conference on Multimedia Computing and Information Technology (MCIT) (17-20). DOI: 10.1109/MCIT.2010.5444865. Online publication date: Mar-2010
    • (2010) A Crawler for Local Search. Proceedings of the 2010 Fourth International Conference on Digital Society (86-91). DOI: 10.1109/ICDS.2010.23. Online publication date: 10-Feb-2010
    • (2009) Improving the performance of focused web crawlers. Data & Knowledge Engineering, 68:10 (1001-1013). DOI: 10.1016/j.datak.2009.04.002. Online publication date: 1-Oct-2009
    • (2008) A Framework of a Hybrid Focused Web Crawler. Proceedings of the 2008 Second International Conference on Future Generation Communication and Networking Symposia - Volume 02 (50-53). DOI: 10.1109/FGCNS.2008.73. Online publication date: 13-Dec-2008
