ABSTRACT
This article describes approaches for retrieving opinionated documents for a given query from a standard data collection. To detect whether a text is opinionated (i.e., contains subjective information), we propose two methods. The first is based on a lexicon of subjective words (SentiWordNet) and rests on the assumption that the more subjective terms a document contains, the more likely it is to be opinionated. The second is based on a probabilistic model and rests on the idea that a document with strong similarity to a reference opinionated text is more likely to be opinionated; we use a language modeling approach to compute this similarity. Experiments are conducted with TREC Blog06 as the test collection and the IMDB data collection as the reference collection. The experimental results indicate that both methods are effective.
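The two scoring ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the averaging in `lexicon_score`, the Jelinek-Mercer style mixture in `lm_opinion_score`, the add-one smoothing, the parameter `lam`, and the toy lexicon and texts are all assumptions made for illustration. A real experiment would take term scores from SentiWordNet and build the reference model from the IMDB collection.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (term -> subjectivity score in [0, 1]).
# Assumption: stands in for scores derived from SentiWordNet.
TOY_LEXICON = {"great": 0.8, "terrible": 0.9, "love": 0.7, "boring": 0.6}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stopwords)."""
    return text.lower().split()


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Method 1 sketch: mean subjectivity score over the document's tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(lexicon.get(t, 0.0) for t in tokens) / len(tokens)


def unigram_probs(texts, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(t for text in texts for t in tokenize(text))
    total = sum(counts[t] for t in vocab) + len(vocab)
    return {t: (counts[t] + 1) / total for t in vocab}


def lm_opinion_score(text, ref_probs, bg_probs, lam=0.5):
    """Method 2 sketch: mean log-ratio of a smoothed reference (opinionated)
    language model to a background model. Higher means closer to the reference."""
    tokens = [t for t in tokenize(text) if t in bg_probs]  # drop unseen terms
    if not tokens:
        return 0.0
    total = 0.0
    for t in tokens:
        mixed = (1 - lam) * ref_probs[t] + lam * bg_probs[t]
        total += math.log(mixed / bg_probs[t])
    return total / len(tokens)


# Toy data (assumption): reference = opinionated reviews, background = everything.
reference = ["great movie love it", "terrible boring film"]
background = reference + ["the movie was released in 2006",
                          "the film runs two hours"]
vocab = {t for text in background for t in tokenize(text)}
ref_probs = unigram_probs(reference, vocab)
bg_probs = unigram_probs(background, vocab)

opinionated = "i love this great movie"
factual = "the movie was released in 2006"
```

On this toy data both `lexicon_score` and `lm_opinion_score` rank the `opinionated` string above the `factual` one. In a retrieval setting, such a score would typically be combined with a topical relevance score for the query; how the paper combines them is not stated in the abstract.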
- A. Esuli and F. Sebastiani. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of LREC'06, pages 417--422, Genova, 2006.
- J. Lafferty and C. Zhai. Document language models, query models, and risk minimization for information retrieval. In Proceedings of SIGIR '01, pages 111--119, New York, NY, USA, 2001. ACM.
- C. Macdonald and I. Ounis. The TREC Blogs06 collection: creating and analysing a blog test collection. Number TR-2006-224, 2006.
- I. Ounis, M. de Rijke, C. Macdonald, G. Mishne, and I. Soboroff. Overview of the TREC-2006 blog track. In Text Retrieval Conference, 2006.
- S. E. Robertson, S. Walker, M. Hancock-Beaulieu, A. Gull, and M. Lau. Okapi at TREC. In Text Retrieval Conference, pages 21--30, 1992.
Index Terms
- Probabilistic opinion models based on subjective sources