DOI: 10.1145/3091478.3091487

Mean Birds: Detecting Aggression and Bullying on Twitter

Published: 25 June 2017

ABSTRACT

In recent years, bullying and aggression against social media users have grown significantly, with serious consequences for victims of all demographics. Cyberbullying now affects more than half of young social media users worldwide, who suffer prolonged and/or coordinated digital harassment. Yet tools and technologies for understanding and mitigating it are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detecting bullying and aggressive behavior on Twitter. We propose a robust methodology for extracting text-, user-, and network-based attributes, and we study the properties of bullies and aggressors and the features that distinguish them from regular users. We find that bullies post less, participate in fewer online communities, and are less popular than normal users, whereas aggressors are relatively popular and tend to include more negativity in their posts. We evaluate our methodology on a corpus of 1.6M tweets posted over 3 months and show that machine learning classification algorithms can accurately detect users exhibiting bullying and aggressive behavior, with over 90% AUC.
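
As a reading aid for the AUC figure reported above, the following is a minimal Python sketch of how the metric can be computed. It is not the authors' pipeline: the labels and scores are invented toy values, and `roc_auc` is a hypothetical helper written for this illustration. It treats AUC as the probability that a randomly chosen positive example (for instance, a bullying or aggressive user) receives a higher classifier score than a randomly chosen negative one, counting ties as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth classes (1 = positive class).
    scores: iterable of classifier scores, higher = more likely positive.
    Hypothetical helper for illustration; O(P * N) in the class sizes.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the paper): 3 positive and 4 negative users.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

A value of 1.0 means every positive outscores every negative, while 0.5 corresponds to random ranking. The pairwise form is quadratic in the class sizes, so a sort-based computation would be the usual choice on a corpus of realistic size.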

Published in

WebSci '17: Proceedings of the 2017 ACM on Web Science Conference
June 2017
438 pages
ISBN: 9781450348966
DOI: 10.1145/3091478

Copyright © 2017 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

research-article

Acceptance Rates

WebSci '17 paper acceptance rate: 30 of 85 submissions, 35%. Overall acceptance rate: 218 of 875 submissions, 25%.
