ABSTRACT
With an abundance of social network data being released, the need to protect sensitive information within these networks has become an important concern of data publishers. One of privacy preserving approaches often used is anonymization. This paper focuses on the popular notion of k-anonymization, where k is the required threshold of structural anonymity. Given a social network graph, it transforms G to G', such that structural property of each node in G' is attained by at least k --- 1 other nodes in G'. The nodes are clustered together into supernodes of size at least k. The above being NP-hard optimization problem, a genetic algorithm is proposed to optimize it's structural k-anonymity. Edge generalization is then employed, based on their relationships to achieve indistinguishable nodes.
- G. Aggarwal, N. Mishra, and B. P. Mishra. Secure computation of the kth-ranked element. In In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques, 2004.Google Scholar
- R. Agrawal and R. Srikant. Privacy-preserving data mining. In In Proceedings of the ACM-SIGMOD International Conference on Management of Data, 2000. Google ScholarDigital Library
- L. Backstrom, C. Dwork, and J. Kleinberg. Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In Proceedings of the 16th international conference on World Wide Web, pages 181--190. ACM, 2007. Google ScholarDigital Library
- A. Bluma, C. Dwork, F. Ncsherry, and K. Nissim. Practical privacy: The sulq framework. In PODS 05 Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, 2005. Google ScholarDigital Library
- A. Campan and T. Truta. Data and structural k-anonymity in social networks. In Privacy, Security, and Trust in KDD, pages 33--54. Springer Berlin/Heidelberg, 2009. Google ScholarDigital Library
- R. Cilibrasi and P. Vitanyi. Clustering by compression. IEEE Trans. Information Theory, 51(4):1523--1545, 2005. Google ScholarDigital Library
- R. Cilibrasi and P. Vitanyi. A new quartet tree heuristic for hierarchical clustering. In Dagstuhl Seminar Proc.: Theory of Evolutionary Algorithms, 2006.Google Scholar
- T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, 2009. Google ScholarDigital Library
- T. Dalenius. Towards a methodology for statistical disclosure control. Statistik Tidskrift, 15:429--444, 1977.Google Scholar
- A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaches in privacy preserving data mining. In In Proceedings of the ACM-SIGMOD Symposium on Principles of Database Systems (PODS03), pages 211--222, 2003. Google ScholarDigital Library
- F. Glover and G. Kochenberger. Handbook of Metaheuristics. Kluwer Academic Publishers, 2003.Google ScholarCross Ref
- D. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989. Google ScholarDigital Library
- S. Hanhijarvi, G. Garriga, and K. Puolamaki. Randomization techniques for graphs. In In Proceedings of the SIAM Conference on Data Mining (SDM), 2009.Google ScholarCross Ref
- M. Hay, G. Miklau, D. Jensen, P. Weis, and S. Srivastava. Anonymizing social networks. Computer Science Department Faculty Publication Series, pages 07--19, 2007.Google Scholar
- M. Hay, G. Miklau, D. Jensen, P. Weis, and S. Srivastava. Resisting structural re-identification in anonymized social networks. Proc. VLDB Endow., 1(1):102--114, 2008. Google ScholarDigital Library
- J. Holland. Adaptation in Natural and Articial Systems. University of Michigan Press, 1975.Google Scholar
- B. Kapferer. Strategy and transaction in an African factory. Manchester: Manchester University Press, 1972.Google Scholar
- S. Kirkpatrick, J. C. D. Gelatt, and M. Vecchi. Optimization by simulated annealing. Science, 220(4598):671--680, 1983.Google ScholarCross Ref
- J. Kleinberg, C. Papadimitriou, and P. Raghavan. Auditing boolean attributes. Computer and System Sciences, 66, 2003. Google ScholarDigital Library
- D. T. Lee and B. J. Schachter. Two algorithms for constructing a delaunay triangulation. International Journal of Parallel Programming, 9(3):219--242, 1980.Google Scholar
- K. Liu and E. Terzi. Towards identity anonymization on graphs. In In Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008. Google ScholarDigital Library
- J. MacRae. Direct factor analysis of sociometric data. 1960.Google Scholar
- A. Meyerson and R. Williams. On the complexity of optimal k-anonymity. In Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 223--228. ACM, 2004. Google ScholarDigital Library
- C. Moler. Matlab user's guide, the mathworks inc. 1980.Google Scholar
- S. N. Sivanandam and S. N. Deepa. Introduction to Genetic Algorithms. Springer, 2008. Google ScholarDigital Library
- M. Srinivas and L. Patnaik. Genetic algorithms: A survey. Computer, 27:17--26, 1994. Google ScholarDigital Library
- L. Sweeney. k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5):557--570, 2002. Google ScholarDigital Library
- W. Wu, Y. Xiao, W. Wang, Z. He, and Z. Wang. k-symmetry model for identity anonymization in social networks. In In Proceedings of the 13th International Conference on Extending Database Technology, 2010. Google ScholarDigital Library
- X. Ying and X. Wu. Randomizing social networks: A spectrum preserving approach. In In SIAM Conference on Data Mining (SDM08), pages 739--750, 2008.Google ScholarCross Ref
- X. Ying and X. Wu. Graph generation with prescribed feature constraints. In In SIAM Conference on Data Mining (SDM09), 2009.Google ScholarCross Ref
- X. Ying and X. Wu. On link privacy in randomizing social networks. In In Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD09), 2009. Google ScholarDigital Library
- E. Zheleva and L. Getoor. Preserving the privacy of sensitive relationships in graph data. In Proceedings of the 1st ACM SIGKDD international conference on Privacy, security, and trust in KDD, pages 153--171. Springer-Verlag, 2007. Google ScholarDigital Library
- B. Zhou and J. Pei. Preserving privacy in social networks against neighborhood attacks. In In Proceedings of the International Conference on Data Engineering (ICDE08), pages 506--515, 2008. Google ScholarDigital Library
Index Terms
- A clustering approach for structural k-anonymity in social networks using genetic algorithm
Recommendations
The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks
Recently, more and more social network data have been published in one way or another. Preserving privacy in publishing social network data becomes an important concern. With some local knowledge about individuals in a social network, an adversary may ...
Genetic algorithm-based clustering approach for k-anonymization
k-Anonymity has been widely adopted as a model for protecting public released microdata from individual identification. This model requires that each record must be identical to at least k - 1 other records in the anonymized dataset with respect to a ...
Preventing Identity Disclosure in Social Networks Using Intersected Node
Social networks like Facebook, Twitter, Pinterest etc. provide data of its users to the demanding organizations to better comprehend the quality of their potential clients. Publishing confidential data of social network users in its raw form raises ...
Comments