ABSTRACT
Visual design plays an important role in online display advertising: changing the layout of an online ad can increase or decrease its effectiveness, measured in terms of click-through rate (CTR) or total revenue. The decision of which layout to use for an ad involves a trade-off: using a layout provides feedback about its effectiveness (exploration), but collecting that feedback requires sacrificing the immediate reward of using a layout we already know is effective (exploitation). To balance exploration with exploitation, we pose automatic layout selection as a contextual bandit problem. There are many bandit algorithms, each generating a policy that must be evaluated, and it is impractical to test every policy on live traffic. However, we have found that offline replay (a.k.a. exploration scavenging) can be adapted to provide an accurate estimator for the performance of ad layout policies at LinkedIn, using only historical data about the effectiveness of layouts. We describe the development of our offline replayer and benchmark a number of common bandit algorithms.
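The idea of offline replay can be illustrated with a minimal sketch. It is not the replayer developed in the paper; it is a generic replay estimator under the assumption that each logged layout was chosen uniformly at random, independent of context. The names `replay_evaluate` and `EpsilonGreedy`, and the `(context, layout, reward)` tuple format, are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def replay_evaluate(policy, logged_events):
    """Estimate a policy's mean reward (e.g. CTR) from historical logs.

    Assumption: logged_events is an iterable of (context, layout, reward)
    tuples in which the layout was picked uniformly at random.
    Returns (estimated mean reward, number of matched events).
    """
    matched = 0
    total_reward = 0.0
    for context, logged_layout, reward in logged_events:
        chosen = policy.select(context)
        if chosen != logged_layout:
            # The log says nothing about the layout the policy wanted,
            # so this event is skipped and the policy learns nothing.
            continue
        matched += 1
        total_reward += reward
        policy.update(context, chosen, reward)
    estimate = total_reward / matched if matched else float("nan")
    return estimate, matched


class EpsilonGreedy:
    """Illustrative bandit policy keeping per-(context, layout) click counts."""

    def __init__(self, layouts, epsilon=0.1, seed=0):
        self.layouts = list(layouts)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.clicks = defaultdict(float)
        self.shows = defaultdict(int)

    def _ctr(self, context, layout):
        shows = self.shows[(context, layout)]
        return self.clicks[(context, layout)] / shows if shows else 0.0

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the best
        # empirical CTR seen so far for this context.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.layouts)
        return max(self.layouts, key=lambda layout: self._ctr(context, layout))

    def update(self, context, layout, reward):
        self.shows[(context, layout)] += 1
        self.clicks[(context, layout)] += reward
```

Only events where the policy agrees with the log contribute to the estimate, so with K layouts roughly one in K logged events is usable. A call such as `replay_evaluate(EpsilonGreedy(["A", "B"]), events)` returns the estimated CTR together with the number of matched events, which indicates how much data the estimate rests on.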