ABSTRACT
LinkedIn is the largest professional social network in the world with more than 238M members. It provides a platform for advertisers to reach out to professionals and target them using rich profile and behavioral data. Thus, online advertising is an important business for LinkedIn. In this talk, I will give an overview of machine learning and optimization components that power LinkedIn self-serve display advertising systems. The talk will not only focus on machine learning and optimization methods, but various practical challenges that arise when running such components in a real production environment. I will describe how we overcome some of these challenges to bridge the gap between theory and practice.
The major components that will be described in details include Response prediction: The goal of this component is to estimate click-through rates (CTR) when an ad is shown to a user in a given context. Given the data sparseness due to low CTR for advertising applications in general and the curse of dimensionality, estimating such interactions is known to be a challenging. Furthermore, the goal of the system is to maximize expected revenue, hence this is an explore/exploit problem and not a supervised learning problem. Our approach takes recourse to supervised learning to reduce dimensionality and couples it with classical explore/exploit schemes to balance the explore/exploit tradeoff. In particular, we use a large scale logistic regression to estimate user and ad interactions. Such interactions are comprised of two additive terms a) stable interactions captured by using features for both users and ads whose coefficients change slowly over time, and b) ephemeral interactions that capture ad-specific residual idiosyncrasies that are missed by the stable component. Exploration is introduced via Thompson sampling on the ephemeral interactions (sample coefficients from the posterior distribution), since the stable part is estimated using large amounts of data and subject to very little statistical variance. Our model training pipeline estimates the stable part using a scatter and gather approach via the ADMM algorithm, ephemeral part is estimated more frequently by learning a per ad correction through an ad-specific logistic regression. Scoring thousands of ads at runtime under tight latency constraints is a formidable challenge when using such models, the talk will describe methods to scale such computations at runtime.
Automatic Format Selection: The presentation of ads in a given slot on a page has a significant impact on how users interact with them. Web designers are adept at creating good formats to facilitate ad display but selecting the best among those automatically is a machine learning task. I will describe a machine learning approach we use to solve this problem. It is again an explore/exploit problem but the dimensionality of this problem is much less than the ad selection problem. I will also provide a detailed description of how we deal with issues like budget pacing, bid forecasting, supply forecasting and targeting. Throughout, the ML components will be illustrated with real examples from production and evaluation metrics would be reported from live tests. Offline metrics that can be useful in evaluating methods before launching them on live traffic will also be discussed.
Index Terms
- Computational advertising: the linkedin way
Recommendations
A Framework to Harvest Page Views of Web for Banner Advertising
BDA 2015: Proceedings of the 4th International Conference on Big Data Analytics - Volume 9498Online advertising provides an opportunity for product sellers and service providers to reach customers and has become a key factor in the growth of economy. It is a major source of revenue for the major search engine and social networking sites. Search ...
Computational advertising: leveraging user interaction & contextual factors for improved ad relevance & targeting
WSDM '12: Proceedings of the fifth ACM international conference on Web search and data miningComputational advertising refers to finding the most relevant ads matching a particular context on the web. The core problem attacked in computational advertising CA is of the match making between the ads and the context. My research work aims at ...
Introduction to display advertising: a half-day tutorial
WSDM '11: Proceedings of the fourth ACM international conference on Web search and data miningDisplay advertising is one of the two major advertising channels on the web (in addition to search advertising). Display advertising on the Web is usually done by graphical ads placed on the publishers' Web pages. There is no explicit user query, and ...
Comments