Optimal randomization for privacy preserving data mining
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Seattle, WA, USA
POSTER SESSION: Research track posters
Pages: 761-766
Year of Publication: 2004
ISBN: 1-58113-888-1
Authors
Yu Zhu, Purdue University, W. Lafayette, IN
Lei Liu, Purdue University, W. Lafayette, IN
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1014052.1014153

ABSTRACT

Randomization is an economical and efficient approach for privacy preserving data mining (PPDM). In order to guarantee the performance of data mining and the protection of individual privacy, optimal randomization schemes need to be employed. This paper demonstrates the construction of optimal randomization schemes for privacy preserving density estimation. We propose a general framework for randomization using mixture models. The impact of randomization on data mining is quantified by performance degradation and mutual information loss, while privacy and privacy loss are quantified by interval-based metrics. Two different types of problems are defined to identify optimal randomization for PPDM. Illustrative examples and simulation results are reported.
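
To make the idea of mixture-model randomization for density estimation concrete, here is a minimal Python sketch. It is not the scheme from the paper, whose details the abstract does not give. It assumes one simple mixture: each value is kept with probability p_keep and is otherwise replaced by uniform noise on a known range, so the released data follow f_Y = p_keep * f_X + (1 - p_keep) * uniform. The function names, the uniform noise, the histogram estimator, and the Beta-distributed test data are all illustrative assumptions.

```python
import random


def randomize(values, p_keep, lo, hi, rng):
    """Keep each value with probability p_keep; otherwise release
    a uniform draw on [lo, hi] instead (assumed noise component)."""
    return [x if rng.random() < p_keep else rng.uniform(lo, hi)
            for x in values]


def histogram(values, lo, hi, bins):
    """Fraction of values in each of `bins` equal-width bins on [lo, hi].
    Values are assumed to lie inside the range."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def reconstruct(hist_y, p_keep):
    """Invert f_Y = p_keep * f_X + (1 - p_keep) * uniform bin by bin,
    then clip negative estimates to zero and renormalize."""
    bins = len(hist_y)
    raw = [(h - (1.0 - p_keep) / bins) / p_keep for h in hist_y]
    clipped = [max(r, 0.0) for r in raw]
    total = sum(clipped)
    return [c / total for c in clipped]


def l1_distance(a, b):
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(0)
    original = [rng.betavariate(2, 5) for _ in range(20000)]
    released = randomize(original, 0.5, 0.0, 1.0, rng)

    hist_x = histogram(original, 0.0, 1.0, 10)
    hist_y = histogram(released, 0.0, 1.0, 10)
    hist_est = reconstruct(hist_y, 0.5)

    print("L1 error of released histogram:     ", l1_distance(hist_y, hist_x))
    print("L1 error of reconstructed histogram:", l1_distance(hist_est, hist_x))
```

In this sketch a smaller p_keep hides more individual values but amplifies sampling noise in the reconstruction, because the inversion divides by p_keep. That tension between privacy and estimation accuracy is the kind of trade-off an optimal randomization scheme has to balance.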

