ACM Home Page
Please provide us with feedback. Feedback
An efficient hash-based algorithm for minimal k-anonymity
Full text PdfPdf (307 KB)
Source ACM International Conference Proceeding Series; Vol. 312 archive
Proceedings of the thirty-first Australasian conference on Computer science - Volume 74 table of contents
Wollongong, Australia
SESSION: Contributed papers: algorithms table of contents
Pages 101-107  
Year of Publication: 2008
ISBN ~ ISSN:1445-1336 , 978-1-920682-55-2
Authors
Xiaoxun Sun  University of Southern Queensland, Toowoomba, Queensland, Australia
Min Li  University of Southern Queensland, Toowoomba, Queensland, Australia
Hua Wang  University of Southern Queensland, Toowoomba, Queensland, Australia
Ashley Plank  University of Southern Queensland, Toowoomba, Queensland, Australia
Sponsors
: CORE - Computing Research and Education
: Macquarie University-Sydney
: University of Wollongong, Australia
Australian Comp Soc : Australian Computer Society
: University of Auckland, New Zealand
Publisher
Australian Computer Society, Inc.  Darlinghurst, Australia, Australia
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 23,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   

ABSTRACT

A number of organizations publish microdata for purposes such as public health and demographic research. Although attributes of microdata that clearly identify individuals, such as name and medical care card number, are generally removed, these databases can sometimes be joined with other public databases on attributes such as Zip code, Gender and Age to re-identify individuals who were supposed to remain anonymous. "Linking" attacks are made easier by the availability of other complementary databases over the Internet.

k-anonymity is a technique that prevents "linking" attacks by generalizing and/or suppressing portions of the released microdata so that no individual can be uniquely distinguished from a group of size k. In this paper, we investigate a practical model of k-anonymity, called full-domain generalization. We examine the issue of computing minimal k-anonymous table based on the definition of minimality described by Samarati. We introduce the hash-based technique previously used in mining associate rules and present an efficient hash-based algorithm to find the minimal k-anonymous table, which improves the previous binary search algorithm first proposed by Samarati.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Adam, N. R. and Wortman, J. C. Security-Control Methods for Statistical Databases: A Comparative Study, ACM Computing Surveys, vol. 21, no. 4, pp. 515--556, 1989.
 
2
Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D. and Zhu, A. Anonymizing tables. In Proc. of the 10th International Conference on Database Theory (ICDT05), pp. 246--258, Edinburgh, Scotland.
 
3
Aggarwal, G., Feder, T., Kenthapadi, K., Panigrahy, R., Thomas, D. and Zhu, A. Achieving anonymity via clustering in a metric space. In Proceedings of the 25th ACM SIGACTSIGMOD- SIGART Symposium on Principles of Database Systems (PODS), 2006.
 
4
Agrawal, R and Srikant, R. Fast algorithms for mining association rules. In Proc. of the 20th Int'l Conference on Very Large Databases, August 1994.
 
5
Bayardo, R. and Agrawal, R. Data privacy through optimal k-anonymity. In Proceedings of the 21st International Conference on Data Engineering (ICDE), 2005.
 
6
Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C. Introduction to Algorithms, second edition, MIT Press and McGraw-Hill. ISBN 0-262-53196-8.
 
7
Domingo-Ferrer, J and Mateo-Sanz, J. M. Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions on Knowledge and Data Engineering, 4(1), 2002.
 
8
Federal Committee on Statistical Methodology, Statistical Policy Working Paper 22, Report on Statistical Disclosure Limitation Methodology, May 1994.
 
9
Fung B, Wang K and Yu P. Top-down specialization for information and privacy preservation. In Proc. of the 21st International Conference on Data Engineering (ICDE'05), Tokyo, Japan.
 
10
Hundepool, A. and Willenborg, L. μ and τ-ARGUS: Software for statistical disclosure control. In Proc. of the Third International Seminar on Statistical Confidentiality, 1996.
 
11
Iyengar V. Transforming data to satisfy privacy constraints. In Proc. of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 279--288, Edmonton, Alberta, Canada, 2002.
 
12
LeFevre K, DeWitt DJ, Ramakrishnan R. Incognito: Efficient fulldomain k-anonymity. In Proc. of the 24th ACM SIGMOD International Conference on Management of Data, pp. 49--60, Baltimore, Maryland, USA, 2005.
 
13
LeFevre, K., DeWitt, D. and Ramakrishnan, R. Multidimensional k-anonymity. Technical Report 1521, University of Wisconsin, 2005.
 
14
Meyerson A and Williams R. On the complexity of optimal k-anonymity. In Proc. of the 23rd ACM-SIGMOD-SIGACT-SIGART Symposium on the Principles of Database Systems, pp. 223--228, Paris, France, 2004.
 
15
Park, J. S., Chen, M. S. and Yu, P. S. An Effective Hash-Based Algorithm for Mining Association Rules. Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data. PP. 175--186, 1995.
 
16
Samarati, P and Sweeney, L. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical Report SRI-CSL-98-04, SRI Computer Science Laboratory, 1998.
 
17
Samarati P. Protecting respondents' identities in microdata release. IEEE Transactions on Knowledge and Data Engineering, 13(6):1010--1027. 2001
 
18
Samarati P, Sweeney L. Generalizing Data to Provide Anonymity when Disclosing Information (Abstract). In Proc. of ACM Symposium on Principles of Database Systems, pp. 188, 1998.
 
19
Srikant, R and Agrawal, R. Mining generalized association rules. In Proc. of the 21st Int'l Conference on Very Large Databases, August 1995.
 
20
Sweeney L. Achieving k-anonynity privacy protection using generalization and suppression. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 2002, 10(5):571--588.
 
21
Wang, K., Yu, P. S and Chakraborty, S. Bottom-up generalization: A data mining solution to privacy protection. In Proceedings of the 4th IEEE International Conference on Data Mining (ICDM), 2004.
 
22
Willenborg, L and DeWaal, T. Statistical Disclosure Control in Practice. Springer-Verlag, 1996.
 
23
Willenborg, L and DeWaal, T. Elements of Statistical Disclosure Control. Springer Verlag Lecture Notes in Statistics, 2000.
 
24
Winkler, W. Using simulated annealing for k-anonymity. Research Report 2002--07, US Census Bureau Statistical Research Division, 2002.

Collaborative Colleagues:
Xiaoxun Sun: colleagues
Min Li: colleagues
Hua Wang: colleagues
Ashley Plank: colleagues