research-article

Detecting the Correlation between Sentiment and User-level as well as Text-Level Meta-data from Benchmark Corpora

Authors:

Shubhanshu Mishra,

Jana Diesner

HT '18: Proceedings of the 29th on Hypertext and Social Media

Pages 2 - 10

https://doi.org/10.1145/3209542.3209562

Published: 03 July 2018

Abstract

Do tweets from users with similar Twitter characteristics have similar sentiments? Which meta-data features of tweets and users correlate with tweet sentiment? In this paper, we address these two questions by analyzing six popular benchmark datasets in which tweets are annotated with sentiment labels. We consider user-level as well as tweet-level meta-data features, and identify patterns and correlations of these features with the log-odds for sentiment classes. We further strengthen our analysis by replicating this set of experiments on recent tweets from users present in our datasets, finding that most of the patterns are consistent across our analyses. Finally, we use the identified meta-data features as features for a sentiment classification algorithm, which results in an increase of around 2% in F1 score compared to text-only classifiers, along with a significant drop in KL-divergence. These results have the potential to improve sentiment analysis applications on social media data.
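The abstract names two quantities without defining them: the log-odds for a sentiment class and the KL-divergence. The sketch below shows one common way to compute each, as an illustration only. The function names, the additive smoothing constant, the epsilon floor, and the toy labels are assumptions made for this example and are not taken from the paper, which may define both quantities differently.

```python
import math
from collections import Counter


def class_log_odds(labels, target, smoothing=0.5):
    """Log-odds of `target` versus all other labels in a group of tweets.

    `smoothing` is an assumed additive constant that keeps the ratio
    finite when one side has a zero count.
    """
    n_target = sum(1 for label in labels if label == target)
    n_other = len(labels) - n_target
    return math.log((n_target + smoothing) / (n_other + smoothing))


def kl_divergence(true_labels, predicted_labels, classes, epsilon=1e-9):
    """KL(P_true || P_pred) between class-prevalence distributions.

    `epsilon` is an assumed floor on the predicted prevalence so the
    logarithm stays defined when a class is never predicted.
    """
    if not true_labels or not predicted_labels:
        raise ValueError("both label lists must be non-empty")
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    divergence = 0.0
    for sentiment in classes:
        p = true_counts[sentiment] / len(true_labels)
        q = max(pred_counts[sentiment] / len(predicted_labels), epsilon)
        if p > 0:  # terms with p == 0 contribute nothing
            divergence += p * math.log(p / q)
    return divergence


if __name__ == "__main__":
    # Hypothetical labels for tweets grouped by one meta-data feature value.
    group = ["positive", "positive", "neutral", "negative"]
    print(class_log_odds(group, "positive"))

    # Hypothetical gold and predicted labels for a small test set.
    gold = ["positive", "positive", "negative", "neutral"]
    pred = ["positive", "negative", "negative", "neutral"]
    print(kl_divergence(gold, pred, ["positive", "negative", "neutral"]))
```

Under this reading, a positive log-odds value means the class is more frequent than all other classes combined within the group, and a KL-divergence closer to zero means the predicted class distribution is closer to the annotated one.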



Published In

HT '18: Proceedings of the 29th on Hypertext and Social Media

July 2018

266 pages

ISBN: 9781450354271

DOI: 10.1145/3209542

General Chair: Dongwon Lee (Penn State University, USA)

Program Chairs: Nishanth Sastry (King's College London, UK), Ingmar Weber (QCRI, Qatar)

Copyright © 2018 ACM.


Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGCHI: Specialist Interest Group in Computer-Human Interaction of the ACM

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

HT '18: 29th ACM Conference on Hypertext and Social Media

July 9 - 12, 2018

Baltimore, MD, USA

Acceptance Rates

HT '18 Paper Acceptance Rate 19 of 69 submissions, 28%;

Overall Acceptance Rate 378 of 1,158 submissions, 33%

