
Secure statistical databases with random sample queries

Published: 01 September 1980

Abstract

A new inference control, called random sample queries, is proposed for safeguarding confidential data in on-line statistical databases. The random sample queries control deals directly with the basic principle of compromise by making it impossible for a questioner to control precisely the formation of query sets. Queries for relative frequencies and averages are computed using random samples drawn from the query sets. The sampling strategy permits the release of accurate and timely statistics and can be implemented at very low cost. Analysis shows the relative error in the statistics decreases as the query set size increases; in contrast, the effort required to compromise increases with the query set size due to large absolute errors. Experiments performed on a simulated database support the analysis.
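The relationship between query set size and relative error described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's actual mechanism: the sampling rule (keep each record independently with probability `p`), the synthetic values, and the names `sampled_average` and `mean_relative_error` are all hypothetical choices made for illustration.

```python
import random
import statistics


def sampled_average(values, p, rng):
    """Average computed from a random sample of the query set.

    Assumption for illustration: each record is kept independently
    with probability p. Returns None if the sample is empty.
    """
    sample = [v for v in values if rng.random() < p]
    if not sample:
        return None
    return sum(sample) / len(sample)


def mean_relative_error(query_set_size, p=0.9, trials=500, seed=0):
    """Mean relative error of the sampled average over repeated queries."""
    rng = random.Random(seed)
    # Synthetic confidential attribute (hypothetical data, e.g. salaries).
    values = [rng.uniform(10_000, 100_000) for _ in range(query_set_size)]
    true_avg = sum(values) / len(values)
    errors = []
    for _ in range(trials):
        estimate = sampled_average(values, p, rng)
        if estimate is not None:
            errors.append(abs(estimate - true_avg) / true_avg)
    return statistics.mean(errors)


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size, mean_relative_error(size))
```

Under these assumptions, the printed relative error shrinks as the query set grows. This is consistent with the abstract's claim that large query sets yield accurate statistics while small ones, the kind an attacker would try to isolate, are answered less precisely.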

