ABSTRACT
In recent years, bullying and aggression against social media users have grown significantly, with serious consequences for victims of all demographics. Cyberbullying now affects more than half of young social media users worldwide, who suffer prolonged and/or coordinated digital harassment. Yet tools and technologies for understanding and mitigating it are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detecting bullying and aggressive behavior on Twitter. We propose a robust methodology for extracting text, user, and network-based attributes, and we study the properties of bullies and aggressors and the features that distinguish them from regular users. We find that bullies post less, participate in fewer online communities, and are less popular than normal users. Aggressors are relatively popular and tend to include more negativity in their posts. We evaluate our methodology using a corpus of 1.6M tweets posted over 3 months, and show that machine learning classification algorithms can accurately detect users exhibiting bullying and aggressive behavior, with over 90% AUC.
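For readers unfamiliar with the AUC metric quoted above, the following is a minimal sketch of how it can be computed, assuming a binary setting in which each user has a label and a classifier score. The function name `auc`, the label encoding, and the six example scores are illustrative assumptions, not the paper's code or data. The sketch uses the pairwise interpretation: AUC is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (here, a hypothetical "aggressive"
            user), 0 for the negative class.
    scores: classifier scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count correctly ordered positive/negative pairs; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up scores for six users; NOT data from the paper.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc(labels, scores))
```

Under this reading, an AUC above 90% would mean that a randomly chosen user from the positive class is scored above a randomly chosen user from the negative class in more than nine of ten such pairs. The quadratic pairwise loop is chosen for clarity; a rank-based computation would scale better on large corpora.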
Index Terms
- Mean Birds: Detecting Aggression and Bullying on Twitter