ABSTRACT
The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing lower bounds fall short of the best known algorithms.
The biggest challenge in proving complexity results is to establish hardness of improper learning (a.k.a. representation independent learning). The difficulty in proving lower bounds for improper learning is that the standard reductions from NP-hard problems do not seem to apply in this context. There is essentially only one known approach to proving lower bounds on improper learning. It was initiated in [21] and relies on cryptographic assumptions.
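For orientation, here is a standard textbook formulation of PAC learning and of the proper/improper distinction; this is a paraphrase added for context, not the exact definitions used in the paper. A class $\mathcal{C}$ of functions $X \to \{0,1\}$ is PAC learnable if there is an algorithm that, for every target $c \in \mathcal{C}$ and every distribution $D$ over $X$, when given $\mathrm{poly}(1/\epsilon, 1/\delta)$ i.i.d. examples $(x, c(x))$ with $x \sim D$, outputs with probability at least $1-\delta$ a hypothesis $h$ satisfying
\[ \Pr_{x \sim D}\bigl[\, h(x) \ne c(x) \,\bigr] \le \epsilon. \]
Learning is proper if $h$ must itself belong to $\mathcal{C}$, and improper (representation independent) if $h$ may be any efficiently evaluable function. Hardness of improper learning is therefore the stronger claim: it rules out all efficient learners, regardless of how they represent their output.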
We introduce a new technique for proving hardness of improper learning, based on reductions from problems that are hard on average. We put forward a (fairly strong) generalization of Feige's assumption [13] about the complexity of refuting random constraint satisfaction problems. Combining this assumption with our new technique yields far reaching implications. In particular,
• Learning DNFs is hard.
• Agnostically learning halfspaces with a constant approximation ratio is hard.
• Learning an intersection of ω(1) halfspaces is hard.
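To give a sense of the underlying average-case assumption, the following is a rough paraphrase of Feige's R3SAT hypothesis [13]; the paper relies on a (fairly strong) generalization of it, and the exact statement there differs. For every fixed $\epsilon > 0$, and even when $\Delta$ is an arbitrarily large constant, no polynomial-time algorithm can, given a 3CNF formula with $n$ variables and $m = \Delta n$ clauses, output ``typical'' on most random formulas (clauses drawn uniformly and independently) while never outputting ``typical'' on a formula in which some assignment satisfies at least $(1-\epsilon)m$ clauses. Informally: a random 3CNF formula at large constant density is almost surely far from satisfiable, yet no efficient algorithm is conjectured to be able to certify this.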
REFERENCES
[1] M. Alekhnovich. More on average case vs approximation complexity. In FOCS, pages 298--307, 2003.
[2] B. Applebaum, B. Barak, and D. Xiao. On basing lower-bounds for learning on worst-case assumptions. In FOCS, pages 211--220, 2008.
[3] S. Arora, L. Babai, J. Stern, and Z. Sweedyk. The hardness of approximate optima in lattices, codes, and systems of linear equations. In FOCS, pages 724--733, 1993.
[4] B. Barak, G. Kindler, and D. Steurer. On the optimality of semidefinite relaxations for average-case and generalized constraint satisfaction. In ITCS, pages 197--214, 2013.
[5] P. Beame, R. Karp, T. Pitassi, and M. Saks. On the complexity of unsatisfiability proofs for random k-CNF formulas. In STOC, pages 561--571, 1998.
[6] P. Beame and T. Pitassi. Simplified and improved resolution lower bounds. In FOCS, pages 274--282, 1996.
[7] E. Ben-Sasson and A. Wigderson. Short proofs are narrow -- resolution made simple. In STOC, pages 517--526, 1999.
[8] Q. Berthet and P. Rigollet. Computational lower bounds for sparse PCA. In COLT, 2013.
[9] C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.
[10] A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM, 50(4):506--519, 2003.
[11] A. Coja-Oghlan, C. Cooper, and A. Frieze. An efficient sparse regularity concept. SIAM Journal on Discrete Mathematics, 23(4):2000--2034, 2010.
[12] A. Daniely, N. Linial, and S. Shalev-Shwartz. More data speeds up training time in learning halfspaces over sparse vectors. In NIPS, 2013.
[13] U. Feige. Relations between average case complexity and approximation complexity. In STOC, pages 534--543, 2002.
[14] U. Feige and E. Ofek. Easily refutable subformulas of large random 3CNF formulas. In ICALP, pages 519--530, 2004.
[15] V. Feldman, P. Gopalan, S. Khot, and A. Ponnuswami. New results for learning noisy parities and halfspaces. In FOCS, 2006.
[16] A. Haken. The intractability of resolution. Theoretical Computer Science, 39:297--308, 1985.
[17] J. Håstad. Some optimal inapproximability results. Journal of the ACM, 48(4):798--859, 2001.
[18] S. Huang. Approximation resistance on satisfiable instances for predicates strictly dominating parity, 2012.
[19] S. Huang. Approximation resistance on satisfiable instances for predicates with few accepting inputs. In STOC, 2013.
[20] A. T. Kalai, A. R. Klivans, Y. Mansour, and R. A. Servedio. Agnostically learning halfspaces. SIAM Journal on Computing, 37(6):1777--1805, 2008.
[21] M. Kearns and L. G. Valiant. Cryptographic limitations on learning Boolean formulae and finite automata. In STOC, pages 433--444, 1989.
[22] M. Kharitonov. Cryptographic hardness of distribution-specific learning. In STOC, pages 372--381, 1993.
[23] S. Khot. On the power of unique 2-prover 1-round games. In STOC, pages 767--775, 2002.
[24] S. Khot and R. Saket. On the hardness of learning intersections of two halfspaces. Journal of Computer and System Sciences, 77(1):129--141, 2011.
[25] A. R. Klivans and R. O'Donnell. Learning intersections and thresholds of halfspaces. In FOCS, pages 177--186, 2002.
[26] A. R. Klivans and R. Servedio. Learning DNF in time 2^{Õ(n^{1/3})}. In STOC, pages 258--265, 2001.
[27] A. R. Klivans and A. A. Sherstov. Cryptographic hardness for learning intersections of halfspaces. In FOCS, 2006.
[28] N. Linial, Y. Mansour, and N. Nisan. Constant depth circuits, Fourier transform, and learnability. In FOCS, pages 574--579, 1989.
[29] Y. Mansour. An O(n^{log log n}) learning algorithm for DNF under the uniform distribution. Journal of Computer and System Sciences, 50(3):543--550, 1995.
[30] L. Pitt and L. Valiant. Computational limitations on learning from examples. Journal of the ACM, 35(4):965--984, 1988.
[31] P. Raghavendra. Optimal algorithms and inapproximability results for every CSP? In STOC, pages 245--254, 2008.
[32] F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386--407, 1958. Reprinted in Neurocomputing (MIT Press, 1988).
[33] R. Schapire. The strength of weak learnability. In FOCS, pages 28--33, 1989.
[34] L. G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134--1142, 1984.
[35] V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.