DOI: 10.1145/180139.181018

Rigorous learning curve bounds from statistical mechanics

Published: 16 July 1994

ABSTRACT

In this paper we introduce and investigate a mathematically rigorous theory of learning curves that is based on ideas from statistical mechanics. The advantage of our theory over the well-established Vapnik-Chervonenkis theory is that our bounds can be considerably tighter in many cases, and are also more reflective of the true behavior (functional form) of learning curves. This behavior can often exhibit dramatic properties such as phase transitions, as well as power law asymptotics not explained by the VC theory. The disadvantages of our theory are that its application requires knowledge of the input distribution, and it is limited so far to finite cardinality function classes. We illustrate our results with many concrete examples of learning curve bounds derived from our theory.
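
To make the contrast in the abstract concrete, here is a minimal illustrative sketch in Python. It is not the paper's statistical-mechanics construction (which requires knowledge of the input distribution); it only compares two classical uniform-convergence rates: the standard realizable-case bound for a finite function class F, which decays as a 1/m power law in the sample size m, against a schematic distribution-free VC-type rate of order sqrt(d/m) up to log factors. The class size, VC dimension, and confidence parameter below are arbitrary toy values chosen for illustration.

```python
# Illustrative only: classical uniform-convergence rates, not the
# statistical-mechanics bounds developed in the paper.
import math

def finite_class_bound(m, card_f, delta=0.05):
    """Realizable-case bound for a finite class: (ln|F| + ln(1/delta)) / m."""
    return (math.log(card_f) + math.log(1.0 / delta)) / m

def vc_style_bound(m, d, delta=0.05):
    """Schematic distribution-free VC-type rate (constants omitted):
    sqrt((d * ln(2m/d) + ln(1/delta)) / m)."""
    return math.sqrt((d * math.log(2.0 * m / d) + math.log(1.0 / delta)) / m)

if __name__ == "__main__":
    card_f, d = 2 ** 20, 20  # toy finite class of ~10^6 functions, VC dim <= 20
    for m in (100, 1_000, 10_000, 100_000):
        print(f"m={m:>6}  finite-class: {finite_class_bound(m, card_f):.4f}  "
              f"VC-style: {vc_style_bound(m, d):.4f}")
```

With these toy values the finite-class bound is already several times smaller at m = 100 and shrinks faster as m grows, mirroring the abstract's claim that bounds exploiting class structure can be both tighter than the VC bounds and different in functional form.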


Published in

COLT '94: Proceedings of the Seventh Annual Conference on Computational Learning Theory
July 1994, 376 pages
ISBN: 0897916557
DOI: 10.1145/180139
Copyright © 1994 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


          Acceptance Rates

Overall acceptance rate: 35 of 71 submissions (49%)
