Article

Anonymizing sequential releases

Authors:

Benjamin C. M. Fung

KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 414 - 423

https://doi.org/10.1145/1150402.1150449

Published: 20 August 2006

Abstract

An organization makes a new release as new information becomes available, releases a tailored view for each data request, or releases sensitive information and identifying information separately. The availability of related releases sharpens the identification of individuals through a global quasi-identifier consisting of attributes from those releases. Since previously released data cannot be anonymized after the fact, the current release must be anonymized to ensure that a global quasi-identifier is not effective for identification. In this paper, we study the sequential anonymization problem under this assumption. A key question is how to anonymize the current release so that it cannot be linked to previous releases yet remains useful for its own release purpose. We introduce the lossy join, a negative property in relational database design, as a way to hide the join relationship among releases, and propose a scalable and practical solution.
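The linking threat and the lossy-join idea can be illustrated with a minimal Python sketch. This is not the paper's algorithm. It assumes two hypothetical releases, T1 (already published) and T2 (the current release), that share a join attribute `job`; every attribute name, record, and the one-level `TAXONOMY` below is invented for illustration. Joining the exact releases links each person to a single city, while coarsening `job` in T2 before release makes the join lossy: it produces spurious rows, so each person is linked to several candidate cities.

```python
from collections import defaultdict

# Hypothetical previous release T1: (age, job). Already public, cannot be changed.
T1 = [(34, "nurse"), (51, "surgeon"), (29, "teacher"), (45, "professor")]

# Hypothetical current release T2 before anonymization: (job, city).
T2 = [("nurse", "Lyon"), ("surgeon", "Nice"),
      ("teacher", "Lille"), ("professor", "Paris")]

# Assumed one-level taxonomy used to coarsen the join attribute in T2.
TAXONOMY = {"nurse": "medical", "surgeon": "medical",
            "teacher": "education", "professor": "education"}


def generalize(release, taxonomy):
    """Replace each job in a (job, city) release by its taxonomy parent."""
    return [(taxonomy[job], city) for job, city in release]


def join(t1, t2, taxonomy=None):
    """Join T1 and T2 on job.

    If a taxonomy is given, T2 is assumed to hold generalized jobs, so a
    T1 row matches every T2 row whose generalized value covers its job.
    """
    cities_by_job = defaultdict(list)
    for job, city in t2:
        cities_by_job[job].append(city)
    joined = []
    for age, job in t1:
        key = taxonomy[job] if taxonomy else job
        for city in cities_by_job[key]:
            joined.append((age, job, city))
    return joined


def min_candidates(joined):
    """Smallest number of distinct cities linked to any (age, job) pair."""
    cities = defaultdict(set)
    for age, job, city in joined:
        cities[(age, job)].add(city)
    return min(len(c) for c in cities.values())


exact = join(T1, T2)                                   # lossless join
lossy = join(T1, generalize(T2, TAXONOMY), TAXONOMY)   # lossy join

print(len(exact), min_candidates(exact))  # each person tied to one city
print(len(lossy), min_candidates(lossy))  # spurious rows blur the link
```

Only T2 is modified in the sketch, mirroring the abstract's assumption that earlier releases cannot be re-anonymized. The coarser the generalization, the more spurious rows the join produces, at the cost of a less useful current release.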


Published In

KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2006

986 pages

ISBN:1595933395

DOI:10.1145/1150402

Conference Chair: Tina Eliassi-Rad (LLNL)

General Chair: Lyle Ungar (University of Pennsylvania)

Program Chairs: Mark Craven (University of Wisconsin), Dimitrios Gunopulos (University of California, Riverside)

Copyright © 2006 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD06: The 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 20 - 23, 2006

Philadelphia, PA, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
