
Privacy-aware spam detection in social bookmarking systems

Authors:
Beate Navarro Bullock
University of Kassel, Kassel, Germany

Hana Lerch
University of Kassel, Design Kassel, Germany

Alexander Roßnagel
University of Kassel, Design Kassel, Germany

Andreas Hotho
University of Würzburg, Würzburg, Germany

Gerd Stumme
University of Kassel, Kassel, Germany

i-KNOW '11: Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, September 2011, Article No.: 15, Pages 1–8, https://doi.org/10.1145/2024288.2024306

Published: 07 September 2011

ABSTRACT

With the growing popularity of Web 2.0 services in recent years, data privacy has become a major concern for users. The more personal data users reveal, the harder it becomes to control its disclosure on the web. For Web 2.0 service providers, however, user-provided data is a valuable source for offering effective, personalised data mining services. One major application is the detection of spam in social bookmarking systems: to prevent a decline in content quality, providers need to identify spammers and exclude them from the system. Providers thereby face a conflict of interests: on the one hand, they need to identify spammers based on the information they collect about users; on the other hand, they need to respect privacy concerns and process as little personal data as possible. It would therefore be of great help to system developers and users to know which personal data are needed for spam detection and which can be ignored. In this paper we address these questions by presenting a data-privacy-aware feature engineering approach. It consists of designing features for spam classification and evaluating them against both performance and privacy conditions. Experiments using data from the social bookmarking system BibSonomy show that the two conditions need not exclude each other.
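To make the idea of judging features on two axes concrete, the following is a minimal sketch, not the procedure used in the paper. It assumes that features are bundled into groups, that each group has been used to train some classifier producing per-user spam scores, and that each group carries a privacy cost on a small ordinal scale. The group names, the cost scale, the scores, and the `min_auc` threshold are all hypothetical illustrations. Performance is measured here as the area under the ROC curve (AUC), which is one common choice for spam classification.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5).

    labels: 1 for spammer, 0 for legitimate user.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one spammer and one non-spammer")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


@dataclass
class FeatureGroup:
    name: str            # hypothetical group name
    privacy_cost: int    # assumed scale: 0 = no personal data, higher = more personal
    scores: List[float]  # per-user spam scores from a classifier using only this group


def rank_groups(
    groups: Sequence[FeatureGroup], labels: Sequence[int], min_auc: float
) -> List[Tuple[str, int, float]]:
    """Keep groups that reach min_auc, then order by privacy cost, then by AUC."""
    evaluated = [(g, auc(g.scores, labels)) for g in groups]
    kept = [(g, a) for g, a in evaluated if a >= min_auc]
    kept.sort(key=lambda pair: (pair[0].privacy_cost, -pair[1]))
    return [(g.name, g.privacy_cost, a) for g, a in kept]


if __name__ == "__main__":
    # Toy data: three spammers followed by three legitimate users.
    labels = [1, 1, 1, 0, 0, 0]
    groups = [
        FeatureGroup("profile", 2, [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]),
        FeatureGroup("tag_statistics", 0, [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]),
        FeatureGroup("ip_location", 1, [0.5, 0.4, 0.6, 0.5, 0.6, 0.3]),
    ]
    for name, cost, value in rank_groups(groups, labels, min_auc=0.8):
        print(f"{name}: privacy_cost={cost}, auc={value:.3f}")
```

Sorting by privacy cost first reflects the goal of processing as little personal data as possible: among the groups that detect spam well enough, the least invasive one is preferred, and AUC only breaks ties.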

Published in
i-KNOW '11: Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies
September 2011
306 pages
ISBN: 9781450307321
DOI: 10.1145/2024288
Editors:
Stefanie Lindstaedt
Know-Center Graz, Austria
Michael Granitzer
Know-Center Graz, Austria
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
privacy, social-bookmarking, spam-detection
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 77 of 238 submissions, 32%