DOI:10.1145/3209542.3209562
research-article

Detecting the Correlation between Sentiment and User-level as well as Text-Level Meta-data from Benchmark Corpora

Published: 03 July 2018

Abstract

Do tweets from users with similar Twitter characteristics have similar sentiments? What meta-data features of tweets and users correlate with tweet sentiment? In this paper, we address these two questions by analyzing six popular benchmark datasets in which tweets are annotated with sentiment labels. We consider user-level as well as tweet-level meta-data features, and identify patterns and correlations of these features with the log-odds for sentiment classes. We further strengthen our analysis by replicating this set of experiments on recent tweets from users present in our datasets, finding that most of the patterns are consistent across our analysis. Finally, we use our identified meta-data features as features for a sentiment classification algorithm, which results in around a 2% increase in F1 score for sentiment classification, compared to text-only classifiers, along with a significant drop in KL-divergence. These results have the potential to improve sentiment analysis applications on social media data.
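
The two quantities named in the abstract, log-odds for sentiment classes and KL-divergence, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-class label set, the binary feature `has_url`, the toy labels, and the additive smoothing constant are all assumptions made for illustration, and KL-divergence is assumed here to compare a predicted class distribution with the annotated one.

```python
import math
from collections import Counter

# Assumed label set; the benchmark corpora may use different classes.
CLASSES = ("negative", "neutral", "positive")


def class_distribution(labels, classes=CLASSES):
    """Relative frequency of each sentiment class in a list of labels."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in classes}


def class_log_odds(labels, classes=CLASSES, smoothing=0.5):
    """Log-odds log(p / (1 - p)) of each class, with additive smoothing
    (an assumption made here so that absent classes stay finite)."""
    counts = Counter(labels)
    total = len(labels) + smoothing * len(classes)
    log_odds = {}
    for c in classes:
        p = (counts[c] + smoothing) / total
        log_odds[c] = math.log(p / (1.0 - p))
    return log_odds


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two distributions over the same classes."""
    return sum(p[c] * math.log(p[c] / max(q[c], eps)) for c in p if p[c] > 0)


# Toy labels grouped by a hypothetical binary tweet-level feature.
labels_by_has_url = {
    True: ["neutral", "neutral", "positive", "neutral"],
    False: ["negative", "positive", "positive", "negative"],
}

# Log-odds of each sentiment class within each feature group.
for has_url, labels in labels_by_has_url.items():
    print(has_url, class_log_odds(labels))

# Compare a made-up predicted class distribution against the annotated one.
gold = class_distribution(labels_by_has_url[False])
predicted = {"negative": 0.4, "neutral": 0.1, "positive": 0.5}
print(kl_divergence(gold, predicted))
```

Comparing the per-group log-odds shows whether a meta-data feature shifts the balance toward a class. A lower KL value means the predicted class proportions are closer to the annotated ones.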

Information

Published In

HT '18: Proceedings of the 29th on Hypertext and Social Media
July 2018
266 pages
ISBN:9781450354271
DOI:10.1145/3209542

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. sentiment analysis
  2. social media data
  3. social media meta-data
  4. statistical analysis

Qualifiers

  • Research-article

Conference

HT '18

Acceptance Rates

HT '18 Paper Acceptance Rate 19 of 69 submissions, 28%;
Overall Acceptance Rate 378 of 1,158 submissions, 33%

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 19 Feb 2025

Cited By

  • (2023) An Efficient Sentiment Classification Method with the Help of Neighbors and a Hybrid of RNN Models. Complexity 2023. DOI:10.1155/2023/1896556. Online publication date: 1-Jan-2023
  • (2022) Information Extraction from Social Media. Proceedings of the 31st ACM International Conference on Information & Knowledge Management (5148-5151). DOI:10.1145/3511808.3557503. Online publication date: 17-Oct-2022
  • (2022) Information Extraction from Social Media: A Hands-On Tutorial on Tasks, Data, and Open Source Tools. Advances in Information Retrieval (589-596). DOI:10.1007/978-3-030-99739-7_74. Online publication date: 10-Apr-2022
  • (2021) Information extraction from digital social trace data with applications to social media and scholarly communication data. ACM SIGIR Forum 54:1 (1-2). DOI:10.1145/3451964.3451981. Online publication date: 19-Feb-2021
  • (2021) Over a decade of social opinion mining: a systematic review. Artificial Intelligence Review. DOI:10.1007/s10462-021-10030-2. Online publication date: 25-Jun-2021
  • (2020) ALBERT-based fine-tuning model for cyberbullying analysis. Multimedia Systems 28:6 (1941-1949). DOI:10.1007/s00530-020-00690-5. Online publication date: 18-Sep-2020
  • (2019) Multi-dataset-multi-task Neural Sequence Tagging for Information Extraction from Tweets. Proceedings of the 30th ACM Conference on Hypertext and Social Media (283-284). DOI:10.1145/3342220.3344929. Online publication date: 12-Sep-2019
