ACM Home Page
Please provide us with feedback. Feedback
First-order focused crawling
Full text PdfPdf (78 KB)
Source
International World Wide Web Conference archive
Proceedings of the 16th international conference on World Wide Web table of contents
Banff, Alberta, Canada
POSTER SESSION: Search table of contents
Pages: 1159 - 1160  
Year of Publication: 2007
ISBN:978-1-59593-654-7
Authors
Qingyang Xu  Jilin University
Wanli Zuo  Jilin University
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 111,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1242572.1242744
What is a DOI?

ABSTRACT

This paper reports a new general framework of focused web crawling based on "relational subgroup discovery". Predicates are used explicitly to represent the relevance clues of those unvisited pages in the crawl frontier, and then first-order classification rules are induced using subgroup discovery technique. The learned relational rules with sufficient support and confidence will guide the crawling process afterwards. We present the many interesting features of our proposed first-order focused crawler, together with preliminary promising experimental results.