DOI: 10.1145/1150402.1150449

Anonymizing sequential releases

Published: 20 August 2006

Abstract

An organization may make a new release as new information becomes available, release a tailored view for each data request, or release sensitive information and identifying information separately. The availability of related releases sharpens the identification of individuals through a global quasi-identifier consisting of attributes from those releases. Since previously released data cannot be anonymized after the fact, the current release must be anonymized to ensure that a global quasi-identifier is not effective for identification. In this paper, we study the sequential anonymization problem under this assumption. A key question is how to anonymize the current release so that it cannot be linked to previous releases yet remains useful for its own release purpose. We introduce the lossy join, a negative property in relational database design, as a way to hide the join relationship among releases, and propose a scalable and practical solution.
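As an illustration of the lossy join mentioned in the abstract, the following minimal Python sketch splits one table into two releases and joins them back. It is not the paper's algorithm: the table, the attribute names (Name, Job, Disease), and the helper functions `project` and `natural_join` are all hypothetical, chosen only to show how a join on a shared attribute can produce spurious tuples that blur which person goes with which sensitive value.

```python
from itertools import product

def project(rows, attrs):
    """Distinct projection of rows onto attrs; each tuple is a sorted list of (attr, value) pairs."""
    return {tuple(sorted((a, row[a]) for a in attrs)) for row in rows}

def natural_join(left, right):
    """Natural join of two relations produced by project(): match on all shared attributes."""
    joined = set()
    for l, r in product(left, right):
        ld, rd = dict(l), dict(r)
        if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
            joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

# Hypothetical private table (illustrative data only).
table = [
    {"Name": "Alice", "Job": "Lawyer", "Disease": "Flu"},
    {"Name": "Bob", "Job": "Lawyer", "Disease": "HIV"},
    {"Name": "Carol", "Job": "Engineer", "Disease": "Cold"},
]

original = project(table, ["Name", "Job", "Disease"])
release_1 = project(table, ["Name", "Job"])      # identifying information
release_2 = project(table, ["Job", "Disease"])   # sensitive information

joined = natural_join(release_1, release_2)
spurious = joined - original  # tuples the join invents

print(len(original), len(joined), len(spurious))  # prints: 3 5 2
```

In this toy data the join on Job yields five tuples where the original table had three: the two lawyers can no longer be matched to their diseases with certainty, while Carol, the only engineer, is still linked exactly. Generalizing Job so that every join group contains several people is one way such a linkage could be weakened further.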

Information

Published In

KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2006
986 pages
ISBN:1595933395
DOI:10.1145/1150402
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification
  2. generalization
  3. k-anonymity
  4. privacy
  5. sequential release

Qualifiers

  • Article

Conference

KDD06

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Cited By

  • (2024) Compromising anonymity in identity-reserved k-anonymous datasets through aggregate knowledge. Proceedings of the 19th International Conference on Availability, Reliability and Security, pp. 1-12. DOI: 10.1145/3664476.3664489. Online publication date: 30-Jul-2024.
  • (2024) A Taxonomy of Syntactic Privacy Notions for Continuous Data Publishing. IEEE Access 12, pp. 38490-38511. DOI: 10.1109/ACCESS.2024.3368852. Online publication date: 2024.
  • (2024) On the Necessity of Counterfeits and Deletions for Continuous Data Publishing. Modeling Decisions for Artificial Intelligence, pp. 199-210. DOI: 10.1007/978-3-031-68208-7_17. Online publication date: 15-Aug-2024.
  • (2023) APPLICATION OF COMPUTER SIMULATION TO THE ANONYMIZATION OF PERSONAL DATA: STATE-OF-THE-ART AND KEY POINTS. Программирование, pp. 58-74. DOI: 10.31857/S0132347423040040. Online publication date: 1-Jul-2023.
  • (2023) Survey on Privacy-Preserving Techniques for Microdata Publication. ACM Computing Surveys 55:14s, pp. 1-42. DOI: 10.1145/3588765. Online publication date: 28-Mar-2023.
  • (2023) Application of Computer Simulation to the Anonymization of Personal Data: State-of-the-Art and Key Points. Programming and Computer Software 49:4, pp. 232-246. DOI: 10.1134/S0361768823040047. Online publication date: 28-Jul-2023.
  • (2023) Global Combination and Clustering Based Differential Privacy Mixed Data Publishing. IEEE Transactions on Knowledge and Data Engineering 35:11, pp. 11437-11448. DOI: 10.1109/TKDE.2023.3237822. Online publication date: 1-Nov-2023.
  • (2023) A New Global Measure to Simultaneously Evaluate Data Utility and Privacy Risk. IEEE Transactions on Information Forensics and Security 18, pp. 715-729. DOI: 10.1109/TIFS.2022.3228753. Online publication date: 2023.
  • (2023) (X, Y)-Privacy. Encyclopedia of Cryptography, Security and Privacy, pp. 1-4. DOI: 10.1007/978-3-642-27739-9_1715-1. Online publication date: 29-Apr-2023.
  • (2022) An Adaptive Lattice of Full-Domain Generalization for Sequential Data Publishing. 2022 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), pp. 354-358. DOI: 10.1109/ECTIDAMTNCON53731.2022.9720350. Online publication date: 26-Jan-2022.
