
Challenges in web search engines

Published: 01 September 2002

Abstract

This article presents a high-level discussion of some problems in information retrieval that are unique to web search engines. The goal is to raise awareness and stimulate research in these areas.

References

  1. H. Ahonen, H. Mannila, and E. Nikunen. "Generating grammars for SGML tagged texts lacking DTD." PODP'94 - Workshop on Principles of Document Processing, 1994. http://www.cs.Helsinki.FI/u/hahonen/publications.html.
  2. G. K. Berland, M. N. Elliott, L. S. Morales, J. I. Algazy, R. L. Kravitz, M. S. Broder, D. E. Kanouse, J. A. Muñoz, J.-A. Puyol, M. Lara, K. E. Watkins, H. Yang, and E. A. McGlynn. "Health Information on the Internet: Accessibility, Quality, and Readability in English and Spanish." Journal of the American Medical Association, 285(2001): 2612-2621.
  3. K. Bharat, A. Z. Broder, J. Dean, and M. Henzinger. "A Comparison of Techniques to Find Mirrored Hosts on the World Wide Web." Journal of the American Society for Information Science, 31(2000): 1114-1122.
  4. S. Brin, J. Davis, and H. García-Molina. "Copy detection mechanisms for digital documents." Proceedings of the ACM SIGMOD International Conference on Management of Data, 1995, pages 398-409.
  5. S. Brin and L. Page. "The Anatomy of a Large-Scale Hypertextual Web Search Engine." In Proceedings of the 7th International World Wide Web Conference (WWW7), 1998, pages 107-117. Also appeared in Computer Networks 30(1998): 107-117.
  6. S. Brin, L. Page, R. Motwani, and T. Winograd. "What can you do with a Web in your Pocket?" Bulletin of the Technical Committee on Data Engineering, 21(1998): 37-47.
  7. A. Z. Broder. "On the resemblance and containment of documents." In Proceedings of Compression and Complexity of Sequences, IEEE Computer Society, 1997, pages 21-29.
  8. S. Chakrabarti. "Enhanced topic distillation using text, markup tags, and hyperlinks." In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, 2001.
  9. S. Chakrabarti. "Integrating the Document Object Model with hyperlinks for enhanced topic distillation and information extraction." In Proceedings of the 10th International World Wide Web Conference (WWW10), 2001.
  10. J. Cho, N. Shivakumar, and H. Garcia-Molina. "Finding replicated web collections." In Proceedings of the ACM SIGMOD International Conference on Management of Data, 2000, pages 355-366.
  11. N. Craswell, D. Hawking, and S. Robertson. "Effective Site Finding using Link Anchor Information." In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, 2001.
  12. P. Faraday. "Attending to Web Pages." CHI 2001 Extended Abstracts (Poster), 2001, pages 159-160.
  13. J. Kleinberg. "Authoritative sources in a hyperlinked environment." In Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, 1998, pages 668-677.
  14. T. Joachims. "Evaluation Search Engines using Clickthrough Data." To appear, 2002.
  15. S. Nestorov, S. Abiteboul, and R. Motwani. "Extracting Schema from Semistructured Data." In Proceedings of the ACM SIGMOD Conference on Management of Data, 1998, pages 295-306.
  16. S. Ravi Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. "Trawling emerging cyber-communities automatically." In Proceedings of the 8th International World Wide Web Conference (WWW8), 1999.
  17. C. Silverstein, M. R. Henzinger, J. Marais, and M. Moricz. "Analysis of a very large Alta Vista query log." SIGIR Forum, 33(1999): 6-12.
  18. World Wide Web Consortium. "Web Style Sheets." http://www.w3.org/Style/.

  • Published in

    ACM SIGIR Forum, Volume 36, Issue 2
    Fall 2002
    99 pages
    ISSN: 0163-5840
    DOI: 10.1145/792550

    Copyright © 2002 Authors

    Publisher

    Association for Computing Machinery

    New York, NY, United States
