DOI: 10.1145/501516.501556
Article

Finding scientific papers with HomePageSearch and MOPS

Published: 21 October 2001

ABSTRACT

The rapid dissemination of new research results on the World Wide Web poses new challenges for search engines. In this paper we describe a new approach to finding scientific papers relevant to a predefined research area. Unlike other approaches, we do not search for web pages that contain certain keywords; instead, we search for web pages created by scientists who are active in the research area under consideration. The names of these scientists are obtained from the DBLP server [9]. The HomePageSearch system finds the home pages that match these names, and Mops finds research papers close to those home pages. It creates an index of these papers and makes it accessible on the web. We conclude that such focused crawling is very effective for building high-quality collections and indices of scientific papers using ordinary desktop hardware.
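
The idea of finding papers "close to" a home page can be illustrated with a minimal sketch. It is not the Mops implementation, which the abstract does not describe. It assumes that closeness means a small number of link hops, that papers are recognised by file extension, and that the crawl stays on the home page's host; the function name `find_papers_near`, the `max_depth` parameter, and the in-memory `links` map (a stand-in for fetching and parsing pages) are all hypothetical.

```python
from collections import deque
from urllib.parse import urlparse

# Assumption: papers are recognised by extension; the abstract does not say
# how Mops actually identifies them.
PAPER_EXTENSIONS = (".ps", ".ps.gz", ".pdf")


def find_papers_near(home_page, links, max_depth=2):
    """Return paper-like URLs at most `max_depth` link hops from `home_page`.

    `links` maps a page URL to the list of URLs it links to. It replaces real
    HTTP fetching and HTML parsing so the sketch runs offline.
    """
    host = urlparse(home_page).netloc
    seen = {home_page}
    queue = deque([(home_page, 0)])  # (url, hops from the home page)
    papers = []
    while queue:
        url, depth = queue.popleft()
        for target in links.get(url, []):
            if target in seen:
                continue
            seen.add(target)
            if target.lower().endswith(PAPER_EXTENSIONS):
                # Paper links are collected even if hosted elsewhere.
                papers.append(target)
            elif depth + 1 < max_depth and urlparse(target).netloc == host:
                # Follow ordinary pages only on the same host, and only while
                # their own links would still be within max_depth hops.
                queue.append((target, depth + 1))
    return papers
```

Restricting the crawl to a few hops on a single host is one simple way to keep it focused on an author's own pages; a real system would also need politeness delays, error handling, and a more robust paper detector.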

References

  1. Alf-Christian Achilles. The Collection of Computer Science Bibliographies. http://liinwww.ira.uka.de/bibliography/.
  2. ACM. The ACM Research Repository. http://www.acm.org/repository/.
  3. Ricardo Baeza-Yates and Berthier Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley-Longman, May 1999.
  4. Andrew Birrell and Paul McJones. pstotext. http://www.research.digital.com/SRC/virtualpaper/pstotext.html.
  5. Soumen Chakrabarti, Martin van den Berg, and Byron Dom. Focused crawling: A new approach to topic-specific web resource discovery. In Proceedings of the Eighth World-Wide Web Conference, 1999.
  6. IBM. DB2 Product Family. http://www.software.ibm.com/data/db2/.
  7. David M. Jones. The Hypertext Bibliography Project. http://theory.lcs.mit.edu/~dmjones/hbp/.
  8. Steve Lawrence, C. Lee Giles, and Kurt Bollacker. Digital libraries and autonomous citation indexing. IEEE Computer, 32(6):67-71, 1999. http://citeseer.nj.nec.com/cs/.
  9. Michael Ley. DBLP Computer Science Bibliography. http://dblp.uni-trier.de/.
  10. Y. H. Li and Anil K. Jain. Classification of text documents. The Computer Journal, 41(8):537-546, 1998.
  11. New Zealand Digital Library. Computer Science Technical Reports. http://www.nzdl.org/.
  12. Udi Manber and Sun Wu. GLIMPSE: a tool to search through entire file systems. In Usenix Winter 1994 Technical Conference, pages 23-32, 1994.
  13. Andrew McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore. Automating the construction of internet portals with machine learning. Information Retrieval, 3(2):127-163, 2000. http://www.cora.jprc.com/.
  14. Thomas M. Mitchell. Machine Learning. McGraw-Hill, 1997.
  15. NCSTRL. Networked Computer Science Technical Library. http://www.ncstrl.org/.
  16. Jason Rennie and Andrew McCallum. Using reinforcement learning to spider the web efficiently. In Proceedings of the Sixteenth International Conference on Machine Learning, 1999.
  17. Jonathan Shakes, Marc Langheinrich, and Oren Etzioni. Dynamic reference sifting: A case study in the homepage domain. In Proceedings of the Sixth International World Wide Web Conference, 1997. http://ahoy.cs.washington.edu:6060/.

        • Published in

          ACM Conferences
          SIGDOC '01: Proceedings of the 19th annual international conference on Computer documentation
          October 2001
          272 pages
          ISBN:1581132956
          DOI:10.1145/501516

          Copyright © 2001 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 355 of 582 submissions, 61%
