
Interest-based personalized search

Published: 01 February 2007

Abstract

Web search engines typically provide search results without considering user interests or context. We propose a personalized search approach that can easily extend a conventional search engine on the client side. Our mapping framework automatically maps a set of known user interests onto a group of categories in the Open Directory Project (ODP) and takes advantage of the manually edited data available in ODP to train text classifiers that correspond to those interests and therefore categorize and personalize search results according to them. In two sets of controlled experiments, we compare our personalized categorization system (PCAT) with a list interface system (LIST) that mimics a typical search engine and with a nonpersonalized categorization system (CAT). In both experiments, we analyze system performance by task type and query length. We find that PCAT is preferable to LIST for information-gathering tasks and for searches with short queries, and that PCAT outperforms CAT in both information-gathering and finding tasks and for searches associated with free-form queries. From the subjects' answers to a questionnaire, we find that PCAT is perceived as a system that can find relevant Web pages more quickly and easily than LIST and CAT.
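
The pipeline described in the abstract (map interests to categories, train one classifier per mapped category from directory text, then group search results by category) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the category names, the training text, the interest mapping, and the use of term-frequency centroids with cosine similarity. The abstract does not say which classifier PCAT uses, so this should not be read as the authors' method.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for ODP data: category -> manually edited descriptions.
CATEGORY_DOCS = {
    "Computers/Programming": [
        "python programming language tutorials and source code",
        "software compilers debugging tools for developers",
    ],
    "Recreation/Travel": [
        "travel guides hotels flights and holiday destinations",
        "backpacking trips city tours and travel tips",
    ],
}

# Hypothetical mapping from known user interests to directory categories.
INTEREST_TO_CATEGORY = {
    "coding": "Computers/Programming",
    "vacations": "Recreation/Travel",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train_centroids(category_docs):
    """Build one term-frequency vector (a toy 'classifier') per category."""
    return {
        category: Counter(tok for doc in docs for tok in tokenize(doc))
        for category, docs in category_docs.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(snippet, centroids, threshold=0.25):
    """Return the best-matching category, or 'Other' below the threshold."""
    vec = Counter(tokenize(snippet))
    scored = [(cosine(vec, centroid), cat) for cat, centroid in centroids.items()]
    score, best = max(scored, default=(0.0, "Other"))
    return best if score >= threshold else "Other"


def personalize(results, interests):
    """Group result snippets under the categories mapped from the interests."""
    categories = {
        INTEREST_TO_CATEGORY[i] for i in interests if i in INTEREST_TO_CATEGORY
    }
    centroids = train_centroids({c: CATEGORY_DOCS[c] for c in categories})
    grouped = {}
    for snippet in results:
        grouped.setdefault(categorize(snippet, centroids), []).append(snippet)
    return grouped


if __name__ == "__main__":
    results = [
        "Python source code examples for developers",
        "Cheap flights and hotels for your holiday",
        "Stock market news today",
    ]
    for category, snippets in personalize(results, ["coding", "vacations"]).items():
        print(category, snippets)
```

Only the categories that correspond to the user's interests get a classifier in this sketch, so results that match none of them fall into a catch-all "Other" group. That grouping is the personalization step; a nonpersonalized variant would train on every category regardless of the user.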

Cited By

  • (2024) Adaptive Web Crawling Strategies Based on Ontological User Interest Modeling for Personalized Content Retrieval. 2024 International Conference on Trends in Quantum Computing and Emerging Business Technologies, 1-5. DOI: 10.1109/TQCEBT59414.2024.10545060. Online publication date: 22-Mar-2024
  • (2021) A qualitative study of large-scale recommendation algorithms for biomedical knowledge bases. International Journal on Digital Libraries 22:2, 197-215. DOI: 10.1007/s00799-021-00300-3. Online publication date: 1-Jun-2021
  • (2020) Personalization in text information retrieval. Journal of the Association for Information Science and Technology 71:3, 349-369. DOI: 10.1002/asi.24234. Online publication date: 28-Jan-2020

Published In

ACM Transactions on Information Systems, Volume 25, Issue 1
February 2007
153 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/1198296

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Open Directory
  2. Personalized search
  3. World Wide Web
  4. information retrieval
  5. user interest
  6. user interface
