DOI: 10.1145/2481492.2481506
research-article

Generating contextualized sentiment lexica based on latent topics and user ratings

Published: 01 May 2013

Abstract

Sentiment lexica are useful for analyzing opinions in Web collections, for domain-dependent sentiment classification, and as sub-components of recommender systems. In this paper, we present a strategy for automatically generating topic-dependent lexica from large corpora of review articles by exploiting accompanying user ratings. Our approach combines text segmentation, discriminative feature analysis techniques, and latent topic extraction to infer the polarity of n-grams in a topical context. Our experiments on rating prediction demonstrate a substantial performance improvement in comparison with existing state-of-the-art sentiment lexica.
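
The idea of inferring n-gram polarity in a topical context from user ratings can be illustrated with a minimal sketch. This is not the method of the paper: it assumes review segments already carry a topic label (in practice this could come from a latent topic model), and it swaps in a simple illustrative score, the mean rating of segments containing an n-gram minus the mean rating of its topic. The names `ngrams`, `build_lexicon`, `n_max`, and `min_count`, as well as the toy data, are hypothetical.

```python
from collections import defaultdict


def ngrams(tokens, n_max=2):
    """Yield all word n-grams of length 1..n_max as space-joined strings."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def build_lexicon(segments, n_max=2, min_count=2):
    """Map (topic, n-gram) to a polarity score.

    segments: iterable of (text, rating, topic) triples. The topic label is
    assumed to be given; it stands in for the output of a topic model.
    Score (an illustrative choice, not the paper's): mean rating of the
    segments containing the n-gram minus the mean rating of the topic.
    """
    topic_sum = defaultdict(float)
    topic_cnt = defaultdict(int)
    gram_sum = defaultdict(float)
    gram_cnt = defaultdict(int)

    for text, rating, topic in segments:
        topic_sum[topic] += rating
        topic_cnt[topic] += 1
        # Count each n-gram at most once per segment.
        for gram in set(ngrams(text.lower().split(), n_max)):
            gram_sum[(topic, gram)] += rating
            gram_cnt[(topic, gram)] += 1

    lexicon = {}
    for (topic, gram), count in gram_cnt.items():
        if count >= min_count:  # skip n-grams that are too rare to score
            topic_mean = topic_sum[topic] / topic_cnt[topic]
            lexicon[(topic, gram)] = gram_sum[(topic, gram)] / count - topic_mean
    return lexicon


# Toy data: (segment text, user rating on a 1-5 scale, assumed topic label).
segments = [
    ("scary plot and great acting", 5, "film"),
    ("scary and thrilling ending", 4, "film"),
    ("boring plot", 1, "film"),
    ("slow boring ending", 2, "film"),
    ("scary landing and rude crew", 1, "flight"),
    ("scary turbulence", 2, "flight"),
    ("smooth landing", 5, "flight"),
    ("smooth ride and friendly crew", 4, "flight"),
]

lexicon = build_lexicon(segments)
print(lexicon[("film", "scary")])
print(lexicon[("flight", "scary")])
```

On this toy data the same word receives opposite signs depending on the topic: "scary" scores 1.5 under "film" and -1.5 under "flight", which is the kind of topic dependence a contextualized lexicon is meant to capture.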

Published In

HT '13: Proceedings of the 24th ACM Conference on Hypertext and Social Media
May 2013
275 pages
ISBN: 9781450319676
DOI: 10.1145/2481492

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. latent dirichlet allocation
  2. rating prediction
  3. sentiment analysis
  4. sentiment lexica
  5. topic models

Qualifiers

  • Research-article

Conference

HT '13

Acceptance Rates

HT '13 Paper Acceptance Rate 16 of 96 submissions, 17%;
Overall Acceptance Rate 378 of 1,158 submissions, 33%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)7
  • Downloads (Last 6 weeks)0
Reflects downloads up to 19 Feb 2025

Citations

Cited By

  • (2024) An Empirical Study on Sentiment Intensity Analysis via Reading Comprehension Models. Proceedings of the 1st ACM Multimedia Workshop on Multi-modal Misinformation Governance in the Era of Foundation Models (23-28). DOI: 10.1145/3689090.3689390. Online publication date: 28-Oct-2024
  • (2023) Khmer Sentiment Lexicon Based on PU Learning and Label Propagation Algorithm. ACM Transactions on Asian and Low-Resource Language Information Processing 22:3 (1-18). DOI: 10.1145/3564697. Online publication date: 10-Mar-2023
  • (2021) Suicidality Detection on Social Media Using Metadata and Text Feature Extraction and Machine Learning. Archives of Suicide Research 27:1 (13-28). DOI: 10.1080/13811118.2021.1955783. Online publication date: 28-Jul-2021
  • (2021) A Deep Learning-Based Approach to Constructing a Domain Sentiment Lexicon. Information Processing and Management: an International Journal 58:5. DOI: 10.1016/j.ipm.2021.102673. Online publication date: 1-Sep-2021
  • (2020) A Sentiment Analysis Algorithm of Danmaku Based on Building a Mixed Fine-grained Sentiment Lexicon. Proceedings of the 2020 9th International Conference on Computing and Pattern Recognition (424-430). DOI: 10.1145/3436369.3437406. Online publication date: 30-Oct-2020
  • (2019) OpinionML—Opinion Markup Language for Sentiment Representation. Symmetry 11:4 (545). DOI: 10.3390/sym11040545. Online publication date: 15-Apr-2019
  • (2019) Depressing-domain Lexicon Based on Microblogs: Automatic Construction (Preprint). JMIR Medical Informatics. DOI: 10.2196/17650. Online publication date: 31-Dec-2019
  • (2018) Detection of suicide-related posts in Twitter data streams. IBM Journal of Research and Development 62:1 (7:1-7:12). DOI: 10.1147/JRD.2017.2768678. Online publication date: 1-Jan-2018
  • (2018) Sentence Emotion Classification for Intelligent Robotics Based on Word Lexicon and Emoticon Emotions. 2018 IEEE International Conference of Intelligent Robotic and Control Engineering (IRCE) (38-41). DOI: 10.1109/IRCE.2018.8492969. Online publication date: Aug-2018
  • (2018) Study and Analysis of Demonetization Move Taken by Indian Prime Minister Mr. Narendra Modi. Proceedings of First International Conference on Smart System, Innovations and Computing (721-727). DOI: 10.1007/978-981-10-5828-8_68. Online publication date: 9-Jan-2018
