research-article

Mean Birds: Detecting Aggression and Bullying on Twitter

Authors:
Despoina Chatzakou (Aristotle University of Thessaloniki, Thessaloniki, Greece)
Nicolas Kourtellis (Telefonica Research, Barcelona, Spain)
Jeremy Blackburn (Telefonica Research, Barcelona, Spain)
Emiliano De Cristofaro (University College London, London, United Kingdom)
Gianluca Stringhini (University College London, London, United Kingdom)
Athena Vakali (Aristotle University of Thessaloniki, Thessaloniki, Greece)

WebSci '17: Proceedings of the 2017 ACM on Web Science Conference, June 2017, Pages 13–22. https://doi.org/10.1145/3091478.3091487

Published: 25 June 2017

ABSTRACT

In recent years, bullying and aggression against social media users have grown significantly, with serious consequences for victims of all demographics. Cyberbullying now affects more than half of young social media users worldwide, who suffer from prolonged and/or coordinated digital harassment. Yet tools and technologies geared toward understanding and mitigating it are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detecting bullying and aggressive behavior on Twitter. We propose a robust methodology for extracting text-, user-, and network-based attributes, and we study the properties of bullies and aggressors and the features that distinguish them from regular users. We find that bullies post less, participate in fewer online communities, and are less popular than normal users, whereas aggressors are relatively popular and tend to include more negativity in their posts. We evaluate our methodology on a corpus of 1.6M tweets posted over 3 months and show that machine learning classification algorithms can accurately detect users exhibiting bullying and aggressive behavior, with over 90% AUC.
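
As a reading aid for the AUC figure quoted in the abstract: AUC can be interpreted as the probability that a classifier scores a randomly chosen positive example above a randomly chosen negative one. The sketch below computes that pairwise quantity directly. The function name `pairwise_auc` and all scores are hypothetical, made up for illustration; they are not the paper's classifier, features, or data.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 when the positive example has the higher score,
    0.5 when the two scores tie, and 0 otherwise.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    total = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier outputs (illustrative only, not from the paper).
aggressive_user_scores = [0.9, 0.8, 0.4]
normal_user_scores = [0.7, 0.3, 0.2, 0.1]

# 11 of the 12 pairs are ordered correctly, so the result is about 0.917.
print(pairwise_auc(aggressive_user_scores, normal_user_scores))
```

The double loop is quadratic in the number of examples, which is fine for a toy illustration. On a corpus of realistic size, a rank-based computation or a library routine would be the usual choice.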


Published in
WebSci '17: Proceedings of the 2017 ACM on Web Science Conference
June 2017
438 pages
ISBN: 9781450348966
DOI: 10.1145/3091478
Conference Chairs: Peter Fox (Rensselaer Polytechnic Institute, USA), Deborah McGuinness (Rensselaer Polytechnic Institute, USA), Lindsay Poirer (Rensselaer Polytechnic Institute, USA)
Program Chairs: Paolo Boldi (Universita degli Studi di Milano, Italy), Katharina Kinder-Kurlanda (GESIS - Leibniz Institute for the Social Sciences, Germany)
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cyberaggression
cyberbullying
twitter
Qualifiers
- research-article
Conference

Acceptance Rates
WebSci '17 Paper Acceptance Rate: 30 of 85 submissions, 35%
Overall Acceptance Rate: 218 of 875 submissions, 25%