Article

Privacy-enhancing k-anonymization of customer data

Authors:
Sheng Zhong

Stevens Institute of Technology, Hoboken, NJ

Stevens Institute of Technology, Hoboken, NJ
View Profile

,
Zhiqiang Yang

Stevens Institute of Technology, Hoboken, NJ

Stevens Institute of Technology, Hoboken, NJ
View Profile

,
Rebecca N. Wright

Stevens Institute of Technology, Hoboken, NJ

Stevens Institute of Technology, Hoboken, NJ
View Profile

PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsJune 2005Pages 139–147https://doi.org/10.1145/1065167.1065185

Published:13 June 2005Publication History

PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Pages 139–147

ABSTRACT

In order to protect individuals' privacy, the technique of k-anonymization has been proposed to de-associate sensitive attributes from the corresponding identifiers. In this paper, we provide privacy-enhancing methods for creating k-anonymous tables in a distributed scenario. Specifically, we consider a setting in which there is a set of customers, each of whom has a row of a table, and a miner, who wants to mine the entire table. Our objective is to design protocols that allow the miner to obtain a k-anonymous table representing the customer data, in such a way that does not reveal any extra information that can be used to link sensitive attributes to corresponding identifiers, and without requiring a central authority who has access to all the original data. We give two different formulations of this problem, with provably private solutions. Our solutions enhance the privacy of k-anonymization in the distributed scenario by maintaining end-to-end privacy from the original customer data to the final k-anonymous results.

References

J. O. Achugbue and F. Y. Chin. The effectiveness of output modification by rounding for protection of statistical databases. INFOR, 17(3):209--218, 1979.]]Google ScholarCross Ref
N. Adam and J. Worthmann. Security-control methods for statistical databases: a comparative study. ACM Computing Survey, 21(4):515--556, 1989.]] Google ScholarDigital Library
C. C. Aggarwal and P. S. Yu. A condensation approach to privacy preserving data mining. In Proceedings of 9th International Conference on Extending Database technology. Springer, 2004.]]Google ScholarCross Ref
G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigrahy, D. Thomas, and A. Zhu. k-anonymity: Algorithms and hardness. Under review, 2004.]]Google Scholar
D. Agrawal and C. Aggarwal. On the design and quantification of privacy preserving data mining algorithms. In Proceedings of 20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 247--255, 2001.]] Google ScholarDigital Library
R. Agrawal and R. Srikant. Privacy-preserving data mining. In Proceedings of 19th ACM SIGMOD Conference on Management of Data, pages 439--450. ACM Press, May 2000.]] Google ScholarDigital Library
W. Aiello, Y. Ishai, and O. Reingold. Priced oblivious transfer: How to sell digital goods. In Advances in Cryptology - Proceedings of EUROCRYPT 2001, volume 2045 of Lecture Notes in Computer Science, pages 119--135. Springer-Verlag, 2001.]] Google ScholarDigital Library
R. J. Bayardo and R. Agrawal. Data privacy through optimal k-anonymization. In Proceedings of 21st International Conference on Data Engineering, 2005.]] Google ScholarDigital Library
L. L. Beck. A security mechanism for statistical databases. ACM Transactions on Database Systems, 5(3):316--338, September 1980.]] Google ScholarDigital Library
D. Boneh. The decision Diffie-Hellman problem. In Algorithmic Number Theory, Third International Symposium, volume 1423 of Lecture Notes in Computer Science, pages 48--63. Springer-Verlag, 1998.]] Google ScholarDigital Library
F. Y. Chin and G. Ozsoyoglu. Auditing and inference control in statistical databases. IEEE Transactions on Software Engineering, SE-8(6):113--139, April 1982.]]Google ScholarDigital Library
T. Dalenius. Finding a needle in a haystack-or identifying anonymous census record. Journal of Official Statistics, 2(3):329--336, 1986.]]Google Scholar
Y. Desmedt and Y. Frankel. Threshold cryptosystems. In Advances in Cryptology - Proceedings of CRYPTO 89, volume 435 of Lecture Notes in Computer Science, pages 307--315. Springer-Verlag, 1990.]] Google ScholarDigital Library
I. Dinur and K. Nissim. Revealing information while preserving privacy. In Proceedings of 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 202--210. ACM Press, 2003.]] Google ScholarDigital Library
T. ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Advances in Cryptology - Proceedings of CRYPTO 84, pages 10--18, 1985.]] Google ScholarDigital Library
A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaches in privacy preserving data mining. In Proceedings of 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 211--222. ACM Press, 2003.]] Google ScholarDigital Library
A. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke. Privacy preserving mining of association rules. In Proceedings of 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 217--228. ACM Press, 2002.]] Google ScholarDigital Library
R. Gennaro, S. Jarecki, H. Krawczyk, and T. Rabin. Secure applications of Pedersen's distributed key generation protocol. In CT-RSA 2003, volume 2612 of Lecture Notes in Computer Science, pages 373--390, 2003.]] Google ScholarDigital Library
O. Goldreich. Foundations of Cryptography, volume 2. Cambridge University Press, 2004.]] Google ScholarCross Ref
M. Kantarcioglu and C. Clifton. Privacy preserving distributed mining of association rules on horizontally partitioned data. In ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pages 639--644. ACM, 2002.]]Google Scholar
H. Kargupta, S. Datta, Q. Wang, and K. Sivakumar. On the privacy preserving properties of random data perturbation techniques. In Proceedings of 3rd IEEE International Conference on Data Mining, Florida, Nov 2003.]] Google ScholarDigital Library
J. M. Kleinberg, C. H. Papadimitriou, and P. Raghavan. Auditing boolean attributes. In Proceedings of 19th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 86--91, 2000.]] Google ScholarDigital Library
Y. Lindell and B. Pinkas. Privacy preserving data mining. Journal of Cryptology, 15(3):177--206, 2002.]]Google ScholarDigital Library
A. Meyerson and R. Williams. On the complexity of optimal k-anonymity. In Proceedings of 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Paris, France, June 2004.]] Google ScholarDigital Library
S. Reiss. Practical data swapping: The first steps. ACM Transactions on Database Systems, 9(1):20--37, 1984.]] Google ScholarDigital Library
P. Samarati. Protecting respondent's privacy in microdata release. IEEE Transactions on Knowledge and Data Engineering, 13(6):1010--1027, 2001.]] Google ScholarDigital Library
P. Samarati and L. Sweeney. Generalizing data to provide anonymity when disclosing information (abstract). In Proceedings of 17th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, page 188. ACM Press, 1998.]] Google ScholarDigital Library
A. Shamir. How to share a secret. Communications of the ACM, 22(11):612--613, 1979.]] Google ScholarDigital Library
A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In Proceedings of 8th International Conference on Very Large Data Bases, pages 208--222, 1982.]] Google ScholarDigital Library
L. Sweeney. Guaranteeing anonymity when sharing medical data, the datafly system. In Proceedings, Journal of the American Medical Informatics Association, 1997.]]Google Scholar
L. Sweeney. Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl.-Based Syst., 10(5):571--588, 2002.]] Google ScholarDigital Library
L. Sweeney. k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl-Based Syst., 10(5):557--570, 2002.]] Google ScholarDigital Library
J. Traub, Y. Yemini, and H. Wozniakowksi. The statistical security of a statistical database. ACM Transactions on Database Systems, 9(4):672--679, 1984.]] Google ScholarDigital Library
J. Vaidya and C. Clifton. Privacy preserving association rule mining in vertically partitioned data. In Proceedings of 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 639--644, 2002.]] Google ScholarDigital Library
J. Vaidya and C. Clifton. Privacy-preserving k-means clustering over vertically partitioned data. In Proceedings of 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 206--215. ACM Press, 2003.]] Google ScholarDigital Library

Recommendations

From t-closeness to differential privacy and vice versa in data anonymization

k-anonymity and ε-differential privacy are two mainstream privacy models, the former introduced to anonymize data sets and the latter to limit the knowledge gain that results from including one individual in the data set. Whereas basic k-anonymity only ...
Read More
IMR based Anonymization for Privacy Preservation in Data Mining
KMO '16: Proceedings of the The 11th International Knowledge Management in Organizations Conference on The changing face of Knowledge Management Impacting Society

Privacy Preserving Data Mining (PPDM) is a data mining research area that aims to protect individual's personal information from unsolicited or unauthorized disclosure. Privacy relates to personal information that a person would not wish others to know ...
Read More
Privacy consensus in anonymization systems via game theory
DBSec'12: Proceedings of the 26th Annual IFIP WG 11.3 conference on Data and Applications Security and Privacy

Privacy protection appears as a fundamental concern when personal data is collected, stored, and published. Several anonymization methods have been proposed to address privacy issues in private datasets. Every anonymization method has at least one ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2005
388 pages
ISBN:1595930620
DOI:10.1145/1065167
General Chair:
Georg Gottlob
Vienna University of Technology, Austria
,
Program Chair:
Foto Afrati
National Technical University of Athens, Greece
Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 June 2005
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate642of2,707submissions,24%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 78
  Total Citations
  View Citations
- 973
  Total Downloads
- Downloads (Last 12 months)27
- Downloads (Last 6 weeks)4
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Privacy-enhancing k-anonymization of customer data

PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

ABSTRACT

References

Cited By

Recommendations

From t-closeness to differential privacy and vice versa in data anonymization

IMR based Anonymization for Privacy Preservation in Data Mining

Privacy consensus in anonymization systems via game theory

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Privacy-enhancing k-anonymization of customer data

PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

ABSTRACT

References

Cited By

Recommendations

From t-closeness to differential privacy and vice versa in data anonymization

IMR based Anonymization for Privacy Preservation in Data Mining

Privacy consensus in anonymization systems via game theory

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media