skip to main content
10.1145/2688073.2688106acmconferencesArticle/Chapter ViewAbstractPublication PagesitcsConference Proceedingsconference-collections
research-article

Accuracy for Sale: Aggregating Data with a Variance Constraint

Published: 11 January 2015 Publication History

Abstract

We consider the problem of a data analyst who may purchase an unbiased estimate of some statistic from multiple data providers. From each provider i, the analyst has a choice: she may purchase an estimate from that provider that has variance chosen from a finite menu of options. Each level of variance has a cost associated with it, reported (possibly strategically) by the data provider. The analyst wants to choose the minimum cost set of variance levels, one from each provider, that will let her combine her purchased estimators into an aggregate estimator that has variance at most some fixed desired level. Moreover, she wants to do so in such a way that incentivizes the data providers to truthfully report their costs to the mechanism. We give a dominant strategy truthful solution to this problem that yields an estimator that has optimal expected cost, and violates the variance constraint by at most an additive term that tends to zero as the number of data providers grows large.

References

[1]
Aaron Archer, Christos Papadimitriou, Kunal Talwar, and Éva Tardos. An approximate truthful mechanism for combinatorial auctions with single parameter agents. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '03, pages 205--214, 2003.
[2]
Anthony Atkinson, Alexander Donev, and Randall Tobias. Optimum Experimental Designs, with SAS. Oxford Statistical Science, 2007.
[3]
Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[4]
Shahar Dobzinski and Shaddin Dughmi. On the power of randomization in algorithmic mechanism design. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science, FOCS '09, pages 505--514, 2009.
[5]
Shahar Dobzinski and Noam Nisan. Limitations of VCG-based mechanisms. In Proceedings of the 39th Annual ACM Symposium on Theory of Computing, STOC '07, pages 338--344, 2007.
[6]
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Conference on Theory of Cryptography, TCC'06, pages 265--284, 2006.
[7]
Lisa K. Fleischer and Yu-Han Lyu. Approximately optimal auctions for selling privacy when costs are correlated with data. In Proceedings of the 13th ACM Conference on Electronic Commerce, EC '12, pages 568--585, 2012.
[8]
Arpita Ghosh and Katrina Ligett. Privacy and coordination: computing on databases with endogenous participation. In Proceedings of the 14th ACM conference on Electronic Commerce, pages 543--560. ACM, 2013.
[9]
Arpita Ghosh, Katrina Ligett, Aaron Roth, and Grant Schoenebeck. Buying private data without verification. In Proceedings of the 15th ACM Conference on Economics and Computation, EC '14, pages 931--948, 2014.
[10]
Arpita Ghosh and Aaron Roth. Selling privacy at auction. In Proceedings of the 12th ACM Conference on Electronic Commerce, EC '11, pages 199--208, 2011.
[11]
Jerry R. Green and Jean-Jacques Laffont. Characterization of satisfactory mechanisms for the revelation of preferences for public goods. Econometrica, 45(2):427--438, 1977.
[12]
Bengt Holmström. Groves' scheme on restricted domains. Econometrica, 47(5):1137--1144, 1979.
[13]
Thibaut Horel, Stratis Ioannidis, and S. Muthukrishnan. Budget feasible mechanisms for experimental design. In Alberto Pardo and Alfredo Viola, editors, LATIN 2014: Theoretical Informatics, Lecture Notes in Computer Science, pages 719--730. 2014.
[14]
Ron Lavi and Chaitanya Swamy. Truthful and near-optimal mechanism design via linear programming. J. ACM, 58(6):1--24, December 2011.
[15]
Katrina Ligett and Aaron Roth. Take it or leave it: Running a survey when privacy comes at a cost. In Paul W. Goldberg, editor, Internet and Network Economics, volume 7695 of Lecture Notes in Computer Science, pages 378--391. 2012.
[16]
George L. Nemhauser and Laurence A. Wolsey. Integer and combinatorial optimization. Wiley interscience series in discrete mathematics and optimization. Wiley, 1988.
[17]
Kobbi Nissim, Salil Vadhan, and David Xiao. Redrawing the boundaries on purchasing data from privacy-sensitive individuals. In Proceedings of the 5th Conference on Innovations in Theoretical Computer Science, ITCS '14, pages 411--422, 2014.
[18]
Friedrich Pukelsheim. Optimal Design of Experiments, volume 50. Society for Industrial and Applied Mathematics, 2006.
[19]
Aaron Roth and Grant Schoenebeck. Conducting truthful surveys, cheaply. In Proceedings of the 13th ACM Conference on Electronic Commerce, EC '12, pages 826--843, 2012.

Cited By

View all
  • (2024)On Three-Layer Data MarketsSSRN Electronic Journal10.2139/ssrn.4727167Online publication date: 2024
  • (2024)Data Exchange Markets via Utility BalancingProceedings of the ACM Web Conference 202410.1145/3589334.3645364(57-65)Online publication date: 13-May-2024
  • (2024)When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data TradingJournal of Computer Science and Technology10.1007/s11390-023-2519-039:3(671-690)Online publication date: 1-May-2024
  • Show More Cited By

Index Terms

  1. Accuracy for Sale: Aggregating Data with a Variance Constraint

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ITCS '15: Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science
    January 2015
    404 pages
    ISBN:9781450333337
    DOI:10.1145/2688073
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 January 2015

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. buying data
    2. mechanism design
    3. vcg mechanism

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    ITCS'15
    Sponsor:
    ITCS'15: Innovations in Theoretical Computer Science
    January 11 - 13, 2015
    Rehovot, Israel

    Acceptance Rates

    ITCS '15 Paper Acceptance Rate 45 of 159 submissions, 28%;
    Overall Acceptance Rate 172 of 513 submissions, 34%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)23
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 20 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)On Three-Layer Data MarketsSSRN Electronic Journal10.2139/ssrn.4727167Online publication date: 2024
    • (2024)Data Exchange Markets via Utility BalancingProceedings of the ACM Web Conference 202410.1145/3589334.3645364(57-65)Online publication date: 13-May-2024
    • (2024)When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data TradingJournal of Computer Science and Technology10.1007/s11390-023-2519-039:3(671-690)Online publication date: 1-May-2024
    • (2023)How Good are Privacy Guarantees? Platform Architecture and Violation of User PrivacySSRN Electronic Journal10.2139/ssrn.4497994Online publication date: 2023
    • (2023)How Good Are Privacy Guarantees? Data Sharing, Privacy Preservation, and Platform BehaviorSSRN Electronic Journal10.2139/ssrn.4333457Online publication date: 2023
    • (2023)A Survey of Data Pricing for Data MarketplacesIEEE Transactions on Big Data10.1109/TBDATA.2023.32541529:4(1038-1056)Online publication date: 1-Aug-2023
    • (2023)Optimal Data Acquisition with Privacy-Aware Agents2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)10.1109/SaTML54575.2023.00023(210-224)Online publication date: Feb-2023
    • (2023)RSS Localization for Large-Scale DeploymentWireless Localization Techniques10.1007/978-3-031-21178-2_4(155-267)Online publication date: 11-Jan-2023
    • (2022)A Survey on Data Pricing: From Economics to Data ScienceIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2020.304592734:10(4586-4608)Online publication date: 1-Oct-2022
    • (2022)Mechanism design for data sharing: An electricity retail perspectiveApplied Energy10.1016/j.apenergy.2022.118871314(118871)Online publication date: May-2022
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media