ABSTRACT
Opinion mining consists of extracting from a text the opinions expressed by its author, together with their polarity. Lexical resources, such as polarized lexicons, are needed for this task. Opinion mining in the medical domain has not been well explored, partly because little credence is given to patients and their opinions (although more and more patients are using social media). We are interested in opinion mining of user-generated content on drugs and medication. In this paper we present the creation of our lexical resources and their adaptation to the medical domain. We first describe the creation of a general lexicon, containing opinion words from the general domain and their polarity. We then present the creation of a medical opinion lexicon, based on a corpus of drug reviews. We show that some words have a different polarity in the general domain than in the medical one, and that some words generally considered neutral are opinionated in medical texts. Finally, we evaluate the lexicons and show, with a simple algorithm, that on our corpus our general lexicon gives better results than other well-known lexicons, and that adding the domain lexicon improves the results further.
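The idea of combining a general lexicon with a domain lexicon can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper, which the abstract does not specify. The lexicon entries, their scores, and the function names `merge_lexicons` and `classify` are all hypothetical, and the sketch assumes that a domain entry simply overrides the general one and that a review's polarity is the sign of the summed word scores.

```python
import re

# Hypothetical toy lexicons: entries and scores are illustrative only,
# not taken from the paper's resources.
GENERAL_LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}
MEDICAL_LEXICON = {"relief": 1, "drowsy": -1, "nausea": -1}


def merge_lexicons(general, domain):
    """Return a combined lexicon in which domain entries override general ones."""
    merged = dict(general)
    merged.update(domain)
    return merged


def classify(text, lexicon):
    """Label a text by the sign of the summed polarity of its known words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    review = "It made me drowsy all day"
    combined = merge_lexicons(GENERAL_LEXICON, MEDICAL_LEXICON)
    print(classify(review, GENERAL_LEXICON))
    print(classify(review, combined))
```

In this toy setup, a word such as "drowsy" is absent from the general lexicon and only contributes to the score once the domain lexicon is added, which mirrors the abstract's observation that some generally neutral words are opinionated in medical texts. The sketch ignores negation and intensity, which a real system would need to handle.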
Index Terms
Sentiment lexicons for health-related opinion mining