Optimal randomization for privacy preserving data mining
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Seattle, WA, USA
POSTER SESSION: Research track posters
Pages: 761 - 766
Year of Publication: 2004
ISBN: 1-58113-888-1
Authors
Yu Zhu, Purdue University, W. Lafayette, IN
Lei Liu, Purdue University, W. Lafayette, IN
ABSTRACT
Randomization is an economical and efficient approach for privacy preserving data mining (PPDM). In order to guarantee the performance of data mining and the protection of individual privacy, optimal randomization schemes need to be employed. This paper demonstrates the construction of optimal randomization schemes for privacy preserving density estimation. We propose a general framework for randomization using mixture models. The impact of randomization on data mining is quantified by performance degradation and mutual information loss, while privacy and privacy loss are quantified by interval-based metrics. Two different types of problems are defined to identify optimal randomization for PPDM. Illustrative examples and simulation results are reported.
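The abstract does not give the paper's actual schemes, so the following is only a minimal sketch of the general idea of randomizing data through a mixture. It assumes a hypothetical scheme in which each value is kept with probability `keep_prob` and otherwise replaced by a uniform draw; the function names, parameters, and the choice of uniform noise are illustrative assumptions, not the authors' construction. It recovers only the mean of the original data, a much simpler target than the density estimation the paper addresses.

```python
import random


def randomize(values, keep_prob, noise_low, noise_high, rng):
    """Assumed mixture-style randomization (illustrative only).

    Each value is released unchanged with probability keep_prob and is
    otherwise replaced by a uniform draw from [noise_low, noise_high].
    """
    released = []
    for v in values:
        if rng.random() < keep_prob:
            released.append(v)
        else:
            released.append(rng.uniform(noise_low, noise_high))
    return released


def estimate_original_mean(released, keep_prob, noise_low, noise_high):
    """Invert the mixture at the level of the mean.

    E[released] = keep_prob * E[original] + (1 - keep_prob) * noise_mean,
    so E[original] can be solved for when keep_prob > 0.
    """
    noise_mean = (noise_low + noise_high) / 2
    released_mean = sum(released) / len(released)
    return (released_mean - (1 - keep_prob) * noise_mean) / keep_prob


if __name__ == "__main__":
    rng = random.Random(0)
    original = [rng.gauss(3.0, 1.0) for _ in range(20000)]
    released = randomize(original, 0.5, 0.0, 10.0, rng)
    print("true mean:     ", sum(original) / len(original))
    print("estimated mean:", estimate_original_mean(released, 0.5, 0.0, 10.0))
```

A smaller `keep_prob` hides more individual values but makes the recovered mean noisier, which is the kind of trade-off between privacy and mining performance that the abstract says an optimal scheme must balance.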