ABSTRACT
In information retrieval, we are interested in information that is not only relevant but also novel. In this paper, we study how to boost novelty in biomedical information retrieval through probabilistic latent semantic analysis (PLSA). Our study is based on TREC Genomics Track data. In the TREC Genomics Track, each topic is considered to have an arbitrary number of aspects, and the novelty of a retrieved piece of information, called a passage, is assessed by the number of new aspects it contains. In particular, the aspect performance of a ranked list is rewarded for the number of new aspects reached at each rank and penalized for the number of irrelevant passages ranked above the novel ones. Therefore, to improve aspect performance, a ranked list should reach as many aspects as possible, as early as possible. We present a preliminary study of how PLSA can help capture the different aspects of a ranked list and improve its performance through re-ranking. Experiments indicate that the proposed approach can greatly improve aspect-level performance over the Okapi BM25 baseline.
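To make the reward and penalty described above concrete, the following is a minimal Python sketch of a simplified aspect-level average precision for a single topic. It is not the official TREC Genomics scoring procedure: the function name `aspect_average_precision`, the set-based input format, and the decision to skip passages that are relevant but add no new aspect are all assumptions made for illustration.

```python
from typing import List, Set


def aspect_average_precision(ranked: List[Set[str]], total_aspects: int) -> float:
    """Simplified aspect-level average precision for one topic (illustrative only).

    ranked[i] holds the gold aspects covered by the passage at rank i + 1;
    an empty set marks an irrelevant passage.
    """
    if total_aspects <= 0:
        return 0.0

    seen: Set[str] = set()  # aspects already reached higher in the list
    counted = 0             # passages that count toward the rank position
    novel_hits = 0          # passages that contributed at least one new aspect
    score = 0.0

    for aspects in ranked:
        new = aspects - seen
        if aspects and not new:
            # Assumption: a relevant but redundant passage is skipped,
            # so it is neither rewarded nor penalized.
            continue
        counted += 1  # irrelevant passages inflate this count (the penalty)
        if new:
            novel_hits += 1
            seen |= new
            # Reward: precision at this point, weighted by new aspects reached.
            score += len(new) * novel_hits / counted

    return score / total_aspects


if __name__ == "__main__":
    early = [{"a"}, {"b"}, set(), set()]  # novel passages ranked first
    late = [set(), set(), {"a"}, {"b"}]   # same passages behind irrelevant ones
    print(aspect_average_precision(early, 2))
    print(aspect_average_precision(late, 2))
```

Under these assumptions, a list that places novel passages ahead of irrelevant ones scores higher than the same passages in the reverse order, which is the behaviour a re-ranking step would aim to exploit.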
Index Terms
- Boosting novelty for biomedical information retrieval through probabilistic latent semantic analysis