DOI: 10.1145/3055399.3055406
Research Article | Public Access

The limitations of optimization from samples

Eric Balkanski, Aviad Rubinstein, Yaron Singer

Published: 19 June 2017

Abstract

In this paper we consider the following question: can we optimize objective functions from the training data we use to learn them? We formalize this question through a novel framework we call optimization from samples (OPS). In OPS, we are given sampled values of a function drawn from some distribution and the objective is to optimize the function under some constraint.
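To make the setup concrete, the OPS problem can be written as follows (the notation here is ours, chosen to match the informal description above; the paper's formal definitions may differ in details such as the success probability):

\[
\textbf{Input: } (S_1, f(S_1)), \ldots, (S_m, f(S_m)), \qquad S_1, \ldots, S_m \sim \mathcal{D} \text{ i.i.d.}
\]
\[
\textbf{Goal: } \text{return } S \in \mathcal{M} \text{ such that } f(S) \geq \alpha \cdot \max_{T \in \mathcal{M}} f(T),
\]

where $\mathcal{M}$ is the feasibility constraint (in this paper, a cardinality constraint $\{S : |S| \le k\}$), the number of samples $m$ is polynomial in the size of the ground set, and $\alpha \in (0,1]$ is the approximation ratio.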
While there are interesting classes of functions that can be optimized from samples, our main result is an impossibility. We show that there are classes of functions which are statistically learnable and optimizable, but for which no reasonable approximation for optimization from samples is achievable. In particular, our main result shows that there is no constant-factor approximation for maximizing coverage functions under a cardinality constraint using polynomially many samples drawn from any distribution.
We also show tight approximation guarantees for maximizing several interesting classes of functions under a cardinality constraint, including unit-demand, additive, and general monotone submodular functions, as well as a constant-factor approximation for monotone submodular functions with bounded curvature.
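As a loose illustration of the positive side of this landscape, here is a toy sketch (our own, not an algorithm from the paper) of optimization from samples in the simplest case of an additive objective: under a product sampling distribution, each element's weight equals the difference between the average value of samples containing it and the average value of samples not containing it, so estimating these differences and returning the top k elements yields a near-optimal set. The function names, the sampling distribution, and the parameters below are all illustrative assumptions.

import random

# Toy sketch (not the paper's algorithm): optimization from samples for an
# ADDITIVE objective f(S) = sum_{i in S} w_i with hidden weights w.
# We only observe pairs (S_i, f(S_i)) where S_i is drawn from a product
# distribution; we estimate each element's weight from the samples and
# return the k elements with the largest estimates.

def generate_samples(weights, num_samples, p=0.5, seed=0):
    """Draw samples: each element joins S independently with probability p."""
    rng = random.Random(seed)
    n = len(weights)
    samples = []
    for _ in range(num_samples):
        S = {i for i in range(n) if rng.random() < p}
        samples.append((S, sum(weights[i] for i in S)))  # hidden additive value
    return samples

def optimize_from_samples(samples, n, k):
    """Estimate w_i as (avg value of samples containing i) minus (avg value of
    samples not containing i), then return the top-k elements by estimate."""
    estimates = []
    for i in range(n):
        with_i = [v for S, v in samples if i in S]
        without_i = [v for S, v in samples if i not in S]
        est = 0.0
        if with_i and without_i:
            est = sum(with_i) / len(with_i) - sum(without_i) / len(without_i)
        estimates.append((est, i))
    estimates.sort(reverse=True)
    return {i for _, i in estimates[:k]}

if __name__ == "__main__":
    rng = random.Random(1)
    weights = [rng.random() for _ in range(20)]  # hidden per-element weights
    samples = generate_samples(weights, num_samples=5000)
    chosen = optimize_from_samples(samples, n=20, k=5)
    best = set(sorted(range(20), key=weights.__getitem__, reverse=True)[:5])
    print(f"recovered {len(chosen & best)} of the true top-5 elements")

For additive functions this conditional-average estimator is unbiased under a product distribution, which is why sample-based optimization is easy in that regime; the main result above shows that no procedure can achieve a constant-factor approximation for coverage functions from polynomially many samples.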

Supplementary Material

MP4 File (d4_sb_t1.mp4)



    Published In

    STOC 2017: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing
    June 2017
    1268 pages
    ISBN:9781450345286
    DOI:10.1145/3055399

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Optimization
    2. PAC learning
    3. coverage functions

    Qualifiers

    • Research-article

    Funding Sources

    • Microsoft Research PhD Fellowship
    • NSF
    • Smith Family Graduate Science and Engineering Fellowship
    • NSF CAREER
    • Templeton Foundation grant
    • NSF grant

    Conference

    STOC '17: Symposium on Theory of Computing
    June 19 - 23, 2017
    Montreal, Canada

    Acceptance Rates

    Overall Acceptance Rate 1,469 of 4,586 submissions, 32%



