DOI: 10.1145/2484028.2484174
short-paper

Boosting novelty for biomedical information retrieval through probabilistic latent semantic analysis

Published: 28 July 2013

ABSTRACT

In information retrieval, we are interested in information that is not only relevant but also novel. In this paper, we study how to boost novelty for biomedical information retrieval through probabilistic latent semantic analysis. We conduct the study on TREC Genomics Track data. In the TREC Genomics Track, each topic is considered to have an arbitrary number of aspects, and the novelty of a retrieved piece of information, called a passage, is assessed by the number of new aspects it contains. In particular, the aspect performance of a ranked list is rewarded by the number of new aspects reached at each rank and penalized by the number of irrelevant passages ranked above the novel ones. Therefore, to improve aspect performance, we should reach as many aspects as possible, as early as possible. We make a preliminary study of how probabilistic latent semantic analysis can help capture the different aspects of a ranked list and improve its performance through re-ranking. Experiments indicate that the proposed approach can greatly improve aspect-level performance over the baseline algorithm, Okapi BM25.
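The reward and penalty described in the abstract can be illustrated with a small scoring function. This is a simplified sketch, not the official TREC Genomics scoring script. The function name `aspect_average_precision`, the representation of each passage as a set of aspect labels (an empty set meaning an irrelevant passage), and the choice to skip relevant passages that add no new aspect are all assumptions made for illustration.

```python
from typing import List, Set


def aspect_average_precision(ranked_aspects: List[Set[str]],
                             all_aspects: Set[str]) -> float:
    """Simplified aspect-level average precision for one topic.

    ranked_aspects[i] holds the gold aspects of the passage at rank i;
    an empty set marks an irrelevant passage (assumed representation).
    """
    seen: Set[str] = set()   # aspects already covered higher in the list
    counted = 0              # novel + irrelevant passages seen so far
    novel_hits = 0           # passages that contributed a new aspect
    score = 0.0
    for aspects in ranked_aspects:
        new = aspects - seen
        if aspects and not new:
            # Relevant but redundant: skipped, neither rewarded nor penalized
            # (an assumption of this sketch).
            continue
        counted += 1
        if new:
            novel_hits += 1
            seen |= new
            # Reward each new aspect with the precision at this position.
            score += len(new) * novel_hits / counted
    return score / len(all_aspects) if all_aspects else 0.0


topic_aspects = {"a", "b"}
early = aspect_average_precision([{"a"}, {"b"}, set()], topic_aspects)
late = aspect_average_precision([set(), {"a"}, {"b"}], topic_aspects)
print(early, late)
```

In this sketch, a list that reaches both aspects before any irrelevant passage scores higher than one that places an irrelevant passage first, which matches the goal of reaching as many aspects as possible, as early as possible.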


      Published in

      SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
      July 2013
      1188 pages
      ISBN: 9781450320344
      DOI: 10.1145/2484028

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      SIGIR '13 Paper Acceptance Rate: 73 of 366 submissions, 20%
      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
