Privacy preserving database application testing

Published: 30 October 2003

Abstract

Traditionally, application software developers test against their own local development databases. However, such local databases usually hold only a small amount of sample data and hence cannot satisfactorily simulate a live environment, especially for performance and scalability testing. On the other hand, testing applications over live production databases is increasingly problematic, primarily because it can expose sensitive data to an unauthorized tester and can incorrectly update information in the underlying database. In this paper, we investigate techniques to generate mock databases for application software testing without revealing any confidential information from the live production databases. Specifically, we will design mechanisms to create the deterministic rule set R, the non-deterministic rule set NR, and the statistical data set S for a live production database. We will then build a security Analyzer that processes the triplet <R, NR, S> together with the security requirements (security policy) and outputs a new triplet <R', NR', S'>. The security Analyzer will guarantee that no confidential information can be inferred from the new triplet <R', NR', S'>. The mock database generated from this new triplet can simulate the live environment for testing purposes while maintaining the privacy of the data in the original database.
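
The pipeline described in the abstract (extract a triplet, filter it against a security policy, then generate a mock database from the filtered triplet) can be illustrated with a small Python sketch. Everything below is an assumption made for illustration and is not the authors' design: the `Rule` and `Triplet` types, the `analyze` and `generate` functions, the choice of per-column (min, max) ranges as the statistics S, and the policy of simply dropping any rule that mentions a confidential column. In particular, this toy analyzer gives no inference guarantee of the kind the paper aims for.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Row = Dict[str, int]
Range = Tuple[int, int]  # inclusive (min, max); a stand-in for the statistics S


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a predicate plus the columns it mentions."""
    columns: FrozenSet[str]
    check: Callable[[Row], bool]
    prob: float = 1.0  # 1.0 for rules in R; below 1.0 for rules in NR


@dataclass
class Triplet:
    R: List[Rule]         # deterministic rules
    NR: List[Rule]        # non-deterministic rules
    S: Dict[str, Range]   # per-column statistics (assumed: value ranges)


def analyze(t: Triplet, confidential: Set[str],
            public_domain: Dict[str, Range]) -> Triplet:
    """Toy analyzer: drop rules that touch confidential columns and
    replace their statistics with a publicly known domain."""
    def safe(rule: Rule) -> bool:
        return not (rule.columns & confidential)

    s_new = {col: (public_domain[col] if col in confidential else rng_)
             for col, rng_ in t.S.items()}
    return Triplet([r for r in t.R if safe(r)],
                   [r for r in t.NR if safe(r)],
                   s_new)


def generate(t: Triplet, n: int, seed: int = 0) -> List[Row]:
    """Rejection sampling: every row satisfies R; rules in NR are honoured
    only loosely (a violating row is rejected with probability `prob`).
    Assumes the rules are satisfiable within the ranges in S."""
    rng = random.Random(seed)
    rows: List[Row] = []
    while len(rows) < n:
        row = {col: rng.randint(lo, hi) for col, (lo, hi) in t.S.items()}
        if not all(r.check(row) for r in t.R):
            continue
        if any(not r.check(row) and rng.random() < r.prob for r in t.NR):
            continue
        rows.append(row)
    return rows


# Made-up example data: "salary" is treated as confidential.
live = Triplet(
    R=[Rule(frozenset({"age"}), lambda r: r["age"] >= 18),
       Rule(frozenset({"salary"}), lambda r: r["salary"] <= 90000)],
    NR=[Rule(frozenset({"age"}), lambda r: r["age"] < 40, prob=0.7)],
    S={"age": (18, 65), "salary": (30000, 90000)},
)
released = analyze(live, {"salary"}, {"salary": (0, 200000)})
mock = generate(released, 100)
```

In this sketch the mock rows keep the non-confidential structure (the age rule and age range) while the salary rule and salary range from the live database never reach the generator. A real analyzer would also have to reason about what combinations of released rules and statistics reveal, which this example does not attempt.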

Published In

WPES '03: Proceedings of the 2003 ACM workshop on Privacy in the electronic society
October 2003
135 pages
ISBN: 1581137761
DOI: 10.1145/1005140
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. database application testing
  2. indistinguishability
  3. privacy

Qualifiers

  • Article

Conference

CCS03

Acceptance Rates

Overall Acceptance Rate 106 of 355 submissions, 30%

Cited By

  • (2019) Realistic Synthetic Data Generation: The ATEN Framework. Biomedical Engineering Systems and Technologies, pp. 497-523. DOI: 10.1007/978-3-030-29196-9_25. Online publication date: 13-Aug-2019
  • (2017) Applying Combinatorial Testing to Data Mining Algorithms. 2017 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 253-261. DOI: 10.1109/ICSTW.2017.46. Online publication date: Mar-2017
  • (2016) Synthetic Data Generation. Data Privacy, pp. 141-154. DOI: 10.1201/9781315370910-8. Online publication date: 10-Oct-2016
  • (2014) Constrained obfuscation of relational databases. Information Sciences: an International Journal, vol. 286, pp. 35-62. DOI: 10.1016/j.ins.2014.07.009. Online publication date: 1-Dec-2014
  • (2009) Problem Space and Special Characteristics of Security Testing in Live and Operational Environments of Large Systems Exemplified by a Nationwide IT Infrastructure. Proceedings of the 2009 First International Conference on Advances in System Testing and Validation Lifecycle, pp. 161-166. DOI: 10.1109/VALID.2009.24. Online publication date: 20-Sep-2009
  • (2008) A case study in database reliability. Proceedings of the 1st international workshop on Testing database systems, pp. 1-6. DOI: 10.1145/1385269.1385283. Online publication date: 13-Jun-2008
  • (2007) Privacy Preserving Database Generation for Database Application Testing. Fundamenta Informaticae 78:4, pp. 595-612. DOI: 10.5555/2366516.2366525. Online publication date: 1-Dec-2007
  • (2007) Privacy Preserving Database Generation for Database Application Testing. Fundamenta Informaticae 78:4, pp. 595-612. DOI: 10.5555/1366038.1366047. Online publication date: 1-Dec-2007
  • (2007) Privacy preserving database access through dynamic privacy filters with stable data randomization. 2007 IEEE International Conference on Systems, Man and Cybernetics, pp. 3333-3338. DOI: 10.1109/ICSMC.2007.4414178. Online publication date: Oct-2007
  • (2006) A research agenda for distributed software development. Proceedings of the 28th international conference on Software engineering, pp. 731-740. DOI: 10.1145/1134285.1134402. Online publication date: 28-May-2006
