Abstract
Microblogging Web sites, such as Twitter and Sina Weibo, have become popular platforms for socializing and sharing information in recent years. Spammers have also discovered this new opportunity to unfairly overpower normal users with unsolicited content, namely social spam. Although it seems intuitive that users would follow only legitimate accounts, recent studies show that both legitimate users and spammers follow spammers, for different reasons. Evidence of users seeking out spammers on purpose has also been observed. We regard this behavior as useful information for spammer detection. In this article, we approach the problem of spammer detection by leveraging the “carefulness” of users, which indicates how careful a user is when she is about to follow a potential spammer. We propose a framework to measure carefulness and develop a supervised learning algorithm to estimate it based on known spammers and legitimate users. We illustrate how the robustness of detection algorithms can be improved with the aid of the proposed measure. Evaluations on two real datasets from Sina Weibo and Twitter with millions of users are performed, as well as an online test on Sina Weibo. The results show that our approach indeed captures carefulness and that it is effective for detecting spammers. In addition, we find that our measure is also beneficial for other applications, such as link prediction.
Index Terms
- Robust Spammer Detection in Microblogs: Leveraging User Carefulness