DOI: 10.1145/3055399.3055403
research-article

Kernel-based methods for bandit convex optimization

Published: 19 June 2017

ABSTRACT

We consider the adversarial convex bandit problem and we build the first poly(T)-time algorithm with poly(n) √T-regret for this problem. To do so we introduce three new ideas in the derivative-free optimization literature: (i) kernel methods, (ii) a generalization of Bernoulli convolutions, and (iii) a new annealing schedule for exponential weights (with increasing learning rate). The basic version of our algorithm achieves Õ(n^9.5 √T)-regret, and we show that a simple variant of this algorithm can be run in poly(n log(T))-time per step at the cost of an additional poly(n) T^o(1) factor in the regret. These results improve upon the Õ(n^11 √T)-regret and exp(poly(T))-time result of the first two authors, and the log(T)^poly(n) √T-regret and log(T)^poly(n)-time result of Hazan and Li. Furthermore we conjecture that another variant of the algorithm could achieve Õ(n^1.5 √T)-regret, and moreover that this regret is unimprovable (the current best lower bound being Ω(n√T) and it is achieved with linear functions). For the simpler situation of zeroth order stochastic convex optimization this corresponds to the conjecture that the optimal query complexity is of order n^3/ε^2.
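Ingredient (iii) of the abstract, exponential weights with an increasing learning rate, can be illustrated on a toy problem. The sketch below is emphatically not the paper's algorithm (no kernel smoothing, no Bernoulli convolutions, one dimension only, and a made-up schedule); it only shows the generic exponential-weights template with importance-weighted one-point (bandit) loss estimates on a discretized domain. The function names and the schedule are illustrative assumptions.

```python
import numpy as np

def bandit_exp_weights(loss_fn, grid, T, seed=0):
    """Exponential weights over a finite grid with bandit feedback:
    at each round only the loss of the played point is observed."""
    rng = np.random.default_rng(seed)
    K = len(grid)
    log_w = np.zeros(K)              # log-weights of the grid points
    total_loss = 0.0
    for t in range(1, T + 1):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                 # current sampling distribution
        i = rng.choice(K, p=p)       # play one random grid point
        loss = loss_fn(grid[i])      # bandit feedback: a single value
        total_loss += loss
        # Toy increasing learning-rate schedule (assumption, not the paper's).
        eta = np.sqrt(np.log(K) / (K * T)) * (1.0 + t / T)
        est = np.zeros(K)
        est[i] = loss / p[i]         # unbiased importance-weighted estimate
        log_w -= eta * est
    return total_loss

# Usage: convex loss f(x) = (x - 0.3)^2 on a grid over [0, 1].
grid = np.linspace(0.0, 1.0, 21)
T = 3000
alg_loss = bandit_exp_weights(lambda x: (x - 0.3) ** 2, grid, T)
best_loss = T * min((x - 0.3) ** 2 for x in grid)  # 0.3 lies on the grid
regret = alg_loss - best_loss
```

The paper's contribution is precisely what this sketch lacks: handling a continuous n-dimensional domain against adversarial convex losses, where the kernel construction replaces the naive grid discretization.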

Supplemental Material

d2_sc_t5.mp4 (MP4, 172 MB)

References

  1. J. Abernethy, E. Hazan, and A. Rakhlin. 2008. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization. In Proceedings of the 21st Annual Conference on Learning Theory (COLT).
  2. A. Agarwal, O. Dekel, and L. Xiao. 2010. Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback. In Proceedings of the 23rd Annual Conference on Learning Theory (COLT).
  3. A. Agarwal, D.P. Foster, D. Hsu, S.M. Kakade, and A. Rakhlin. 2011. Stochastic convex optimization with bandit feedback. In Advances in Neural Information Processing Systems (NIPS).
  4. F. Bach and V. Perchet. 2016. Highly-Smooth Zero-th Order Online Optimization. In Proceedings of the 29th Annual Conference on Learning Theory (COLT).
  5. K. Ball, F. Barthe, and A. Naor. 2003. Entropy jumps in the presence of a spectral gap. Duke Mathematical Journal 119, 1 (2003), 41–63.
  6. A. Belloni, T. Liang, H. Narayanan, and A. Rakhlin. 2015. Escaping the Local Minima via Simulated Annealing: Optimization of Approximately Convex Functions. In Proceedings of the 28th Annual Conference on Learning Theory (COLT).
  7. S. Bubeck and N. Cesa-Bianchi. 2012. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Foundations and Trends in Machine Learning 5, 1 (2012), 1–122.
  8. S. Bubeck, N. Cesa-Bianchi, and S.M. Kakade. 2012. Towards minimax policies for online linear optimization with bandit feedback. In Proceedings of the 25th Annual Conference on Learning Theory (COLT).
  9. S. Bubeck, O. Dekel, T. Koren, and Y. Peres. 2015. Bandit Convex Optimization: √T Regret in One Dimension. In Proceedings of the 28th Annual Conference on Learning Theory (COLT).
  10. S. Bubeck and R. Eldan. 2016. Multi-scale exploration of convex functions and bandit convex optimization. In Proceedings of the 29th Annual Conference on Learning Theory (COLT).
  11. S. Bubeck, Y.T. Lee, and R. Eldan. 2016. Kernel-based methods for bandit convex optimization. arXiv preprint arXiv:1607.03084 (2016).
  12. A. Conn, K. Scheinberg, and L. Vicente. 2009. Introduction to Derivative-Free Optimization. Society for Industrial and Applied Mathematics (SIAM).
  13. V. Dani, T. Hayes, and S. Kakade. 2008. The price of bandit information for online optimization. In Advances in Neural Information Processing Systems (NIPS).
  14. O. Dekel, R. Eldan, and T. Koren. 2015. Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff. In Advances in Neural Information Processing Systems 28, C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (Eds.). 2926–2934.
  15. P. Erdős. 1939. On a family of symmetric Bernoulli convolutions. American Journal of Mathematics 61, 4 (1939), 974–976.
  16. A. Flaxman, A. Kalai, and B. McMahan. 2005. Online Convex Optimization in the Bandit Setting: Gradient Descent Without a Gradient. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA).
  17. E. Hazan and K. Levy. 2014. Bandit Convex Optimization: Towards Tight Bounds. In Advances in Neural Information Processing Systems (NIPS).
  18. E. Hazan and Y. Li. 2016. An optimal algorithm for bandit convex optimization. arXiv preprint arXiv:1603.04350 (2016).
  19. O. Johnson and A. Barron. 2004. Fisher information inequalities and the central limit theorem. Probability Theory and Related Fields 129, 3 (2004), 391–409.
  20. R. Kleinberg. 2004. Nearly tight bounds for the continuum-armed bandit problem. In Advances in Neural Information Processing Systems (NIPS).
  21. A. Levin. 1965. On an algorithm for the minimization of convex functions. In Soviet Mathematics Doklady, Vol. 160. 1244–1247.
  22. V. D. Milman and A. Pajor. 1989. Isotropic position and inertia ellipsoids and zonoids of the unit ball of a normed n-dimensional space. In Geometric Aspects of Functional Analysis (1987–88). Lecture Notes in Math., Vol. 1376. Springer, Berlin, 64–104.
  23. A. Nemirovski and D. Yudin. 1983. Problem Complexity and Method Efficiency in Optimization. Wiley Interscience.
  24. D. Newman. 1965. Location of the maximum on unimodal surfaces. Journal of the ACM 12, 3 (1965), 395–398.
  25. Y. Peres, W. Schlag, and B. Solomyak. 2000. Sixty years of Bernoulli convolutions. In Fractal Geometry and Stochastics II. Springer, 39–65.
  26. V. Protasov. 1996. Algorithms for approximate calculation of the minimum of a convex function from its values. Mathematical Notes 59, 1 (1996), 69–74.
  27. H. Rosenbrock. 1960. An automatic method for finding the greatest or least value of a function. Comput. J. 3, 3 (1960), 175–184.
  28. A. Saha and A. Tewari. 2011. Improved regret guarantees for online smooth convex optimization with bandit feedback. In International Conference on Artificial Intelligence and Statistics (AISTATS). 636–642.
  29. O. Shamir. 2013. On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization. In Proceedings of the 26th Annual Conference on Learning Theory (COLT).

Published in

STOC 2017: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, June 2017, 1268 pages.
ISBN: 9781450345286
DOI: 10.1145/3055399
Copyright © 2017 ACM


Publisher: Association for Computing Machinery, New York, NY, United States



        Acceptance Rates

Overall acceptance rate: 1,469 of 4,586 submissions (32%)
