ABSTRACT
DBSCAN is a well-known density-based clustering algorithm that, unlike partitioning and hierarchical clustering methods, can find clusters of arbitrary shape. However, few papers study DBSCAN under the privacy-preserving distributed data mining model, in which the data is distributed between two or more parties and the parties cooperate to obtain the clustering results without revealing their individual data to one another. In this paper, we address the problem of two-party privacy-preserving DBSCAN clustering. We first propose two protocols for privacy-preserving DBSCAN clustering, one over horizontally partitioned data and one over vertically partitioned data, and then extend them to arbitrarily partitioned data. We also analyze the performance of our solution and prove its privacy.
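As a rough illustration of the two partitioning models named above, the following sketch splits a toy data set horizontally (each party holds all attributes of some records) and vertically (each party holds some attributes of every record). It also shows why the vertical case is workable: the squared Euclidean distance that DBSCAN compares against its radius parameter is the sum of the partial squared distances each party can compute on its own attributes. This is not the paper's protocol. The data, function names, and `eps` value are hypothetical, and the sum and comparison are done in the clear here, whereas a privacy-preserving protocol would compute them securely.

```python
# Hypothetical toy data: 4 records with 2 attributes each.
records = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.4), (5.0, 5.0)]


def horizontal_split(rows, k):
    """Party A gets the first k records, party B the rest (all attributes)."""
    return rows[:k], rows[k:]


def vertical_split(rows, k):
    """Party A gets the first k attributes of every record, party B the rest."""
    return [r[:k] for r in rows], [r[k:] for r in rows]


def partial_sq_dist(u, v):
    """Squared Euclidean distance over the attributes one party holds."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def is_neighbor_vertical(a_cols, b_cols, i, j, eps):
    """Check whether records i and j lie within eps of each other.

    Each party contributes a partial squared distance. Assumption: the sum
    and the comparison are done in the clear for illustration only; a
    privacy-preserving protocol would not reveal the partial values.
    """
    total = partial_sq_dist(a_cols[i], a_cols[j]) + partial_sq_dist(b_cols[i], b_cols[j])
    return total <= eps ** 2


a_rows, b_rows = horizontal_split(records, 2)
a_cols, b_cols = vertical_split(records, 1)
print(is_neighbor_vertical(a_cols, b_cols, 0, 1, eps=1.0))
print(is_neighbor_vertical(a_cols, b_cols, 0, 3, eps=1.0))
```

In the horizontal case the difficulty is different: a distance between a record held by party A and a record held by party B involves attributes from both sides, so neither party can compute even a partial result over complete records without cooperation.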