skip to main content
research-article
Open access

A framework for adaptive differential privacy

Published: 29 August 2017 Publication History

Abstract

Differential privacy is a widely studied theory for analyzing sensitive data with a strong privacy guarantee—any change in an individual's data can have only a small statistical effect on the result—and a growing number of programming languages now support differentially private data analysis. A common shortcoming of these languages is poor support for adaptivity. In practice, a data analyst rarely wants to run just one function over a sensitive database, nor even a predetermined sequence of functions with fixed privacy parameters; rather, she wants to engage in an interaction where, at each step, both the choice of the next function and its privacy parameters are informed by the results of prior functions. Existing languages support this scenario using a simple composition theorem, which often gives rather loose bounds on the actual privacy cost of composite functions, substantially reducing how much computation can be performed within a given privacy budget. The theory of differential privacy includes other theorems with much better bounds, but these have not yet been incorporated into programming languages.
We propose a novel framework for adaptive composition that is elegant, practical, and implementable. It consists of a reformulation based on typed functional programming of privacy filters, together with a concrete realization of this framework in the design and implementation of a new language, called Adaptive Fuzz. Adaptive Fuzz transplants the core static type system of Fuzz to the adaptive setting by wrapping the Fuzz typechecker and runtime system in an outer adaptive layer, allowing Fuzz programs to be conveniently constructed and typechecked on the fly. We describe an interpreter for Adaptive Fuzz and report results from two case studies demonstrating its effectiveness for implementing common statistical algorithms over real data sets.

Supplementary Material

Auxiliary Archive (icfp17-main109-s.zip)
Contained here are the source files for the Adaptive Fuzz language along with some simple libraries, examples, and tests. Adaptive Fuzz requires OCaml and is currently only supported on Ubuntu 16.04.2 LTS (but may work in other environments). After installing the language, the user will be able to run tests and examples, including the case studies described in the paper.

References

[1]
Rakesh Agrawal and Ramakrishnan Srikant. 2000. Privacy-preserving Data Mining. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD ’00). ACM, New York, NY, USA, 439–450.
[2]
Marc Andrysco, David Kohlbrenner, Keaton Mowery, Ranjit Jhala, Sorin Lerner, and Hovav Shacham. 2015. On Subnormal Floating Point and Abnormal Timing. In Proceedings of the 2015 IEEE Symposium on Security and Privacy (SP ’15). IEEE Computer Society, Washington, DC, USA, 623–639.
[3]
Gilles Barthe, Marco Gaboardi, Emilio Jesús Gallego Arias, Justin Hsu, Aaron Roth, and Pierre-Yves Strub. 2015. Higher-order approximate relational refinement types for mechanism design and differential privacy. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, 55–68.
[4]
Gilles Barthe, Boris Köpf, Federico Olmedo, and Santiago Zanella Béguelin. 2012. Probabilistic Relational Reasoning for Differential Privacy. In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’12), Vol. 47. ACM, ACM, New York, NY, USA, 97–110.
[5]
Raef Bassily, Adam Smith, and Abhradeep Thakurta. 2014. Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014. IEEE Computer Society, 464–473.
[6]
Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. 2006a. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology-EUROCRYPT 2006. Springer, 486–503.
[7]
Cynthia Dwork and Jing Lei. 2009. Differential privacy and robust statistics. In Proceedings of the forty-first annual ACM symposium on Theory of computing. ACM, 371–380.
[8]
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006b. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography. Springer, 265–284.
[9]
Cynthia Dwork and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science 9, 3-4 (2014), 211–407.
[10]
Cynthia Dwork, Guy N Rothblum, and Salil Vadhan. 2010. Boosting and differential privacy. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on. IEEE, 51–60.
[11]
Hamid Ebadi and David Sands. 2015. Featherweight PINQ. CoRR abs/1505.02642 (2015). http://arxiv .org/abs/1505.02642
[12]
Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, and Johannes Gehrke. 2002. Privacy Preserving Mining of Association Rules. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’02). ACM, New York, NY, USA, 217–228.
[13]
Marco Gaboardi, Andreas Haeberlen, Justin Hsu, Arjun Narayan, and Benjamin C Pierce. 2013. Linear dependent types for differential privacy. In ACM SIGPLAN Notices, Vol. 48. ACM, 357–370.
[14]
Marco Gaboardi, James Honaker, Gary King, Kobbi Nissim, Jonathan Ullman, and Salil Vadhan. Working Paper. PSI (Îĺ): a Private data Sharing Interface. In Theory and Practice of Differential Privacy. New York, NY.
[15]
Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, and Adam Smith. 2008. Composition Attacks and Auxiliary Information in Data Privacy. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’08). ACM, New York, NY, USA, 265–273.
[16]
Andreas Haeberlen, Benjamin C. Pierce, and Arjun Narayan. 2011. Differential Privacy Under Fire. In Proceedings of the 20th USENIX Conference on Security (SEC’11). USENIX Association, Berkeley, CA, USA, 33–33. http://dl .acm.org/ citation .cfm?id=2028067.2028100
[17]
J. Hsu, M. Gaboardi, A. Haeberlen, S. Khanna, A. Narayan, B. C. Pierce, and A. Roth. 2014. Differential Privacy: An Economic Method for Choosing Epsilon. In 2014 IEEE 27th Computer Security Foundations Symposium. 398–410.
[18]
Aaron Johnson and Vitaly Shmatikov. 2013. Privacy-preserving Data Exploration in Genome-wide Association Studies. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’13). ACM, New York, NY, USA, 1079–1087.
[19]
Peter Kairouz, Sewoong Oh, and Pramod Viswanath. 2015. The Composition Theorem for Differential Privacy. In Proceedings of The 32nd International Conference on Machine Learning. 1376–1385.
[20]
Shiva P Kasiviswanathan and Adam Smith. 2014. On the’Semantics’ of Differential Privacy: A Bayesian Formulation. Journal of Privacy and Confidentiality 6, 1 (2014), 1.
[21]
Daniel Kifer. 2009. Attacks on Privacy and deFinetti’s Theorem. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD ’09). ACM, New York, NY, USA, 127–138.
[22]
Ninghui Li, Tiancheng Li, and Suresh Venkatasubramanian. 2007. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In 2007 IEEE 23rd International Conference on Data Engineering. 106–115.
[23]
Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer, and Muthuramakrishnan Venkitasubramaniam. 2006. ℓ-Diversity: Privacy Beyond κ-Anonymity. In Proceedings of the 22Nd International Conference on Data Engineering (ICDE ’06). IEEE Computer Society, Washington, DC, USA, 24–.
[24]
Frank McSherry and Ilya Mironov. 2009. Differentially Private Recommender Systems: Building Privacy into the Net. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’09). ACM, New York, NY, USA, 627–636.
[25]
Frank D McSherry. 2009. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. ACM, 19–30.
[26]
Elinor Mills. 2006. AOL sued over Web search data release. (Sept. 2006). cnet, http://www .cnet.com/news/aol-sued-overweb- search- data- release/ .
[27]
Ilya Mironov. 2012. On Significance of the Least Significant Bits for Differential Privacy. In Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS ’12). ACM, New York, NY, USA, 650–661.
[28]
Prashanth Mohan, Abhradeep Thakurta, Elaine Shi, Dawn Song, and David Culler. 2012. GUPT: Privacy Preserving Data Analysis Made Easy. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD ’12). ACM, New York, NY, USA, 349–360.
[29]
Jack Murtagh and Salil Vadhan. 2016. The Complexity of Computing the Optimal Composition of Differential Privacy. In Theory of Cryptography. Springer, 157–175.
[30]
Arjun Narayan and Andreas Haeberlen. 2012. DJoin: Differentially Private Join Queries over Distributed Databases. In Presented as part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12). USENIX, Hollywood, CA, 149–162. https://www .usenix.org/conference/osdi12/technical-sessions/presentation/narayan
[31]
Arvind Narayanan and Vitaly Shmatikov. 2008. Robust De-anonymization of Large Sparse Datasets. In Proceedings of the 2008 IEEE Symposium on Security and Privacy. 111–125.
[32]
Jason Reed and Benjamin C. Pierce. 2010. Distance Makes the Types Grow Stronger: A Calculus for Differential Privacy. In Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming (ICFP ’10). ACM, New York, NY, USA, 157–168.
[33]
Ryan M. Rogers, Aaron Roth, Jonathan Ullman, and Salil P. Vadhan. 2016. Privacy Odometers and Filters: Pay-as-you-Go Composition. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 1921–1929.
[34]
Indrajit Roy, Srinath T. V. Setty, Ann Kilzer, Vitaly Shmatikov, and Emmett Witchel. 2010. Airavat: Security and Privacy for MapReduce. In Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation (NSDI’10), Vol. 10. USENIX Association, Berkeley, CA, USA, 297–312. http://dl .acm.org/citation.cfm?id=1855711.1855731
[35]
Ryan Singel. 2009. Netflix Spilled Your “Brokeback Mountain” Secret, Lawsuit Claims. (Dec. 2009). Wired, http:// www .wired.com/2009/12/netflix-privacy-lawsuit/ .
[36]
Latanya Sweeney. 2002. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, 5 (Oct. 2002), 557–570.
[37]
Ryan J. Tibshirani. 2015. A General Framework for Fast Stagewise Algorithms. The Journal of Machine Learning Research 16, 1 (Jan. 2015), 2543–2588. http://dl .acm.org/citation.cfm?id=2789272.2912080
[38]
UCI KDD Archive. 2016. 1990 US Census Data Set. (2016). Available from https://kdd .ics.uci.edu/databases/census1990/ USCensus1990raw .html .
[39]
Philip Wadler. 1990. Linear Types Can Change the World. In IFIP TC 2 Working Conference on Programming Concepts and Methods, Sea of Galilee, Israel. 546–566.

Cited By

View all
  • (2024)Sensitivity by ParametricityProceedings of the ACM on Programming Languages10.1145/36897268:OOPSLA2(415-441)Online publication date: 8-Oct-2024
  • (2024)Program Analysis for Adaptive Data AnalysisProceedings of the ACM on Programming Languages10.1145/36564148:PLDI(914-938)Online publication date: 20-Jun-2024
  • (2024)Investigating the Visual Utility of Differentially Private ScatterplotsIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.329239130:8(5370-5385)Online publication date: Aug-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 1, Issue ICFP
September 2017
1173 pages
EISSN:2475-1421
DOI:10.1145/3136534
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 August 2017
Published in PACMPL Volume 1, Issue ICFP

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. Adaptivity
  2. Case Study
  3. Differential Privacy
  4. Fuzz
  5. Privacy Filter

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)265
  • Downloads (Last 6 weeks)40
Reflects downloads up to 20 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Sensitivity by ParametricityProceedings of the ACM on Programming Languages10.1145/36897268:OOPSLA2(415-441)Online publication date: 8-Oct-2024
  • (2024)Program Analysis for Adaptive Data AnalysisProceedings of the ACM on Programming Languages10.1145/36564148:PLDI(914-938)Online publication date: 20-Jun-2024
  • (2024)Investigating the Visual Utility of Differentially Private ScatterplotsIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.329239130:8(5370-5385)Online publication date: Aug-2024
  • (2023)Calculating Function Sensitivity for Synthetic Data AlgorithmsProceedings of the 35th Symposium on Implementation and Application of Functional Languages10.1145/3652561.3652567(1-12)Online publication date: 29-Aug-2023
  • (2023)Model checking differentially private propertiesTheoretical Computer Science10.1016/j.tcs.2022.10.002943(153-170)Online publication date: Jan-2023
  • (2023)A Survey on Privacy Preserving Synthetic Data Generation and a Discussion on a Privacy-Utility Trade-off ProblemScience of Cyber Security - SciSec 2022 Workshops10.1007/978-981-19-7769-5_13(167-180)Online publication date: 1-Jan-2023
  • (2023)Bunched Fuzz: Sensitivity for Vector MetricsProgramming Languages and Systems10.1007/978-3-031-30044-8_17(451-478)Online publication date: 22-Apr-2023
  • (2022)Solo: a lightweight static analysis for differential privacyProceedings of the ACM on Programming Languages10.1145/35633136:OOPSLA2(699-728)Online publication date: 31-Oct-2022
  • (2022)Safe couplings: coupled refinement typesProceedings of the ACM on Programming Languages10.1145/35476436:ICFP(596-624)Online publication date: 31-Aug-2022
  • (2021)A pre-expectation calculus for probabilistic sensitivityProceedings of the ACM on Programming Languages10.1145/34343335:POPL(1-28)Online publication date: 4-Jan-2021
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media