ABSTRACT
In this paper we introduce and investigate a mathematically rigorous theory of learning curves based on ideas from statistical mechanics. The advantage of our theory over the well-established Vapnik-Chervonenkis theory is that our bounds can be considerably tighter in many cases, and are also more reflective of the true behavior (functional form) of learning curves. This behavior can often exhibit dramatic properties such as phase transitions, as well as power-law asymptotics not explained by the VC theory. The disadvantages of our theory are that its application requires knowledge of the input distribution, and that it is so far limited to function classes of finite cardinality. We illustrate our results with many concrete examples of learning curve bounds derived from our theory.
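Although the paper's analysis is analytical, the setting it assumes (a finite function class and a known input distribution) is easy to probe numerically. The sketch below is a hypothetical illustration, not the paper's method: it estimates an empirical learning curve by Monte Carlo for a small finite class of threshold functions under the uniform distribution, using a Gibbs-style learner that draws a hypothesis uniformly at random from the version space. The class, domain size, and trial counts are all assumptions made for the example.

```python
# Minimal sketch (illustrative, not from the paper): empirical learning curve
# for a finite function class under a known (uniform) input distribution.
# Learner = zero-temperature Gibbs: pick uniformly among hypotheses consistent
# with the sample. All parameters below are assumptions for the example.
import random

DOMAIN = list(range(50))                                  # finite input space
CLASS = [lambda x, a=a: int(x >= a) for a in range(51)]   # |F| = 51 thresholds

def gen_error(h, target):
    """True error of h against the target under the uniform distribution."""
    return sum(h(x) != target(x) for x in DOMAIN) / len(DOMAIN)

def learning_curve(m_values, trials=2000, seed=0):
    """Average generalization error of the Gibbs learner vs. sample size m."""
    rng = random.Random(seed)
    curve = {}
    for m in m_values:
        total = 0.0
        for _ in range(trials):
            target = rng.choice(CLASS)
            sample = [rng.choice(DOMAIN) for _ in range(m)]
            labels = [target(x) for x in sample]
            # Version space: hypotheses consistent with the labeled sample.
            # It is never empty, since the target itself is always consistent.
            version_space = [h for h in CLASS
                             if all(h(x) == y for x, y in zip(sample, labels))]
            h = rng.choice(version_space)
            total += gen_error(h, target)
        curve[m] = total / trials
    return curve

if __name__ == "__main__":
    for m, err in learning_curve([1, 2, 5, 10, 20, 50, 100]).items():
        print(f"m = {m:4d}   avg generalization error = {err:.4f}")
```

For this toy class the averaged curve should decay roughly inversely in m; richer finite classes are needed to exhibit the phase transitions the abstract refers to.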