ABSTRACT
An expert finding system allows a user to type a simple text query and retrieve the names and contact information of individuals who possess the expertise expressed in the query. This paper proposes a novel approach to expert finding in large enterprises or intranets by modeling candidate experts (persons), web documents and various relations among them with so-called expertise graphs. In contrast to state-of-the-art approaches, which estimate personal expertise through one-step propagation of relevance probability from documents to the related candidates, our methods are based on the principle of multi-step relevance propagation in topic-specific expertise graphs. We model the process of expert finding by probabilistic random walks of three kinds: finite, infinite and absorbing. Experiments on TREC Enterprise Track data originating from two large organizations show that our methods using multi-step relevance propagation improve over the baseline method based on one-step propagation in almost all cases.
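The difference between one-step and multi-step propagation can be illustrated with a minimal sketch. It is a toy, not the model evaluated in the paper: the graph contains only document-candidate edges, transition probabilities are assumed uniform over neighbours, and the names `build_transitions`, `step` and `finite_walk_scores` are hypothetical. Only the finite walk is shown; the infinite and absorbing variants are not covered.

```python
def build_transitions(edges):
    """Uniform transition probabilities on an undirected document-candidate graph.

    Assumption for this sketch: every neighbour is equally likely to be visited.
    """
    neighbours = {}
    for doc, cand in edges:
        neighbours.setdefault(doc, []).append(cand)
        neighbours.setdefault(cand, []).append(doc)
    return {
        node: {nxt: 1.0 / len(adjacent) for nxt in adjacent}
        for node, adjacent in neighbours.items()
    }


def step(dist, transitions):
    """Move the probability mass in `dist` one step along the graph."""
    out = {}
    for node, mass in dist.items():
        for nxt, prob in transitions[node].items():
            out[nxt] = out.get(nxt, 0.0) + mass * prob
    return out


def finite_walk_scores(relevance, edges, steps):
    """Score candidates with a finite random walk of `steps` steps.

    The walk starts at documents with probability proportional to their
    relevance to the query. In this bipartite toy graph an odd number of
    steps ends on candidates; steps=1 corresponds to one-step propagation.
    Every document in `relevance` must appear in at least one edge.
    """
    total = sum(relevance.values())
    dist = {doc: score / total for doc, score in relevance.items()}
    transitions = build_transitions(edges)
    for _ in range(steps):
        dist = step(dist, transitions)
    return dist


# Hypothetical query-dependent document relevance and authorship links.
relevance = {"d1": 0.6, "d2": 0.3, "d3": 0.1}
edges = [("d1", "alice"), ("d2", "alice"), ("d2", "bob"), ("d3", "bob")]

one_step = finite_walk_scores(relevance, edges, steps=1)
three_step = finite_walk_scores(relevance, edges, steps=3)
print(one_step)
print(three_step)
```

On this toy data the one-step scores are about 0.75 for alice and 0.25 for bob, while three steps give about 0.625 and 0.375: the extra steps let relevance flow back through shared documents to other candidates, which is the intuition behind multi-step propagation.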