
Utility-based anonymization for privacy preservation with less information loss

Published: 01 December 2006

Abstract

Privacy is an increasingly serious concern in applications involving microdata, and efficient anonymization has recently attracted much research. Most previous methods use global recoding, which maps the domains of the quasi-identifier attributes to generalized or changed values. However, global recoding may not always achieve effective anonymization in terms of discernability and the accuracy of queries answered using the anonymized data. Moreover, anonymized data is often used for analysis, and it is well accepted in many analytical applications that different attributes in a data set may have different utility in the analysis. Previous methods have not considered the utility of attributes.
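To make the discernability criterion concrete, a minimal sketch follows. It assumes the definition commonly used in the k-anonymity literature, where each tuple is charged the size of the group of tuples it cannot be distinguished from, so the penalty of a table is the sum of squared group sizes. The abstract does not state the formula, and the function name and toy tables are hypothetical.

```python
from collections import Counter

def discernability(rows, quasi_identifiers):
    """Sum of squared equivalence-class sizes; lower means finer groups."""
    # Rows with identical generalized quasi-identifier values form one group.
    groups = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return sum(size * size for size in groups.values())

# Hypothetical 2-anonymous releases of the same four records.
coarse_table = [
    {"age": "[20-60]", "zip": "537**"},
    {"age": "[20-60]", "zip": "537**"},
    {"age": "[20-60]", "zip": "537**"},
    {"age": "[20-60]", "zip": "537**"},
]
fine_table = [
    {"age": "[20-30]", "zip": "5371*"},
    {"age": "[20-30]", "zip": "5371*"},
    {"age": "[50-60]", "zip": "5379*"},
    {"age": "[50-60]", "zip": "5379*"},
]

coarse_score = discernability(coarse_table, ["age", "zip"])  # one group of 4
fine_score = discernability(fine_table, ["age", "zip"])      # two groups of 2
```

Both toy tables keep every record in a group of at least two, but the finer grouping has the lower penalty, which is the sense in which one anonymization can be more effective than another under this measure.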

In this paper, we study the problem of utility-based anonymization. First, we propose a simple framework to specify utility of attributes. The framework covers both numeric and categorical data. Second, we develop two simple yet efficient heuristic local recoding methods for utility-based anonymization. Our extensive performance study using both real data sets and synthetic data sets shows that our methods outperform the state-of-the-art multidimensional global recoding methods in both discernability and query answering accuracy. Furthermore, our utility-based method can boost the quality of analysis using the anonymized data.
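One way to picture attribute utility is as a per-attribute weight on the information lost when a group of records is generalized. The sketch below is an illustration under stated assumptions, not the framework from the paper: the penalty for a numeric attribute is taken to be the group's range divided by the attribute's full range, the penalty for a categorical attribute is the fraction of distinct domain values the group spans (zero when the group agrees on one value), and all function names, weights, and data are hypothetical.

```python
def attribute_penalty(group_values, domain_values):
    """Fraction of the attribute's domain covered by one group (0 = no loss)."""
    if all(isinstance(v, (int, float)) for v in domain_values):
        # Assumed numeric penalty: group range relative to the full range.
        span = max(domain_values) - min(domain_values)
        return (max(group_values) - min(group_values)) / span if span else 0.0
    # Assumed categorical penalty: share of distinct values the group mixes.
    distinct = set(group_values)
    return 0.0 if len(distinct) == 1 else len(distinct) / len(set(domain_values))

def weighted_penalty(group, table, weights):
    """Weighted sum of per-attribute penalties for one group of rows."""
    return sum(
        w * attribute_penalty([r[a] for r in group], [r[a] for r in table])
        for a, w in weights.items()
    )

# Hypothetical table; age is treated as more useful for analysis than city.
table = [
    {"age": 20, "city": "A"},
    {"age": 30, "city": "A"},
    {"age": 50, "city": "B"},
    {"age": 60, "city": "C"},
]
weights = {"age": 0.8, "city": 0.2}

close_group = [table[0], table[1]]   # ages 20 and 30, same city
spread_group = [table[0], table[3]]  # ages 20 and 60, different cities

close_loss = weighted_penalty(close_group, table, weights)
spread_loss = weighted_penalty(spread_group, table, weights)
```

Under these assumed weights, grouping records that are close in age costs far less than grouping records that are far apart, so a heuristic that minimizes the weighted penalty would prefer to preserve precision on the attributes the analyst cares about most.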

