research-article

Deep Interest Network for Click-Through Rate Prediction

Authors: Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, Kun Gai

Alibaba Group, Beijing, China

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2018, Pages 1059–1068. https://doi.org/10.1145/3219819.3219823

Published: 19 July 2018

ABSTRACT

Click-through rate prediction is an essential task in industrial applications such as online advertising. Recently, deep learning based models have been proposed that follow a similar Embedding&MLP paradigm. In these methods, large-scale sparse input features are first mapped into low-dimensional embedding vectors, then transformed into fixed-length vectors in a group-wise manner, and finally concatenated and fed into a multilayer perceptron (MLP) to learn the nonlinear relations among features. In this way, user features are compressed into a fixed-length representation vector regardless of what the candidate ads are. The fixed-length vector becomes a bottleneck, making it difficult for Embedding&MLP methods to capture a user's diverse interests effectively from rich historical behaviors. In this paper, we propose a novel model, Deep Interest Network (DIN), which tackles this challenge by designing a local activation unit to adaptively learn the representation of user interests from historical behaviors with respect to a certain ad. This representation vector varies over different ads, greatly improving the expressive ability of the model. In addition, we develop two techniques, mini-batch aware regularization and a data adaptive activation function, which help in training industrial deep networks with hundreds of millions of parameters. Experiments on two public datasets, as well as an Alibaba production dataset with over 2 billion samples, demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with state-of-the-art methods. DIN has now been successfully deployed in the online display advertising system at Alibaba, serving the main traffic.
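
To make the contrast in the abstract concrete, the following minimal Python sketch compares candidate-independent sum pooling with a candidate-aware weighted sum of behavior embeddings. It is an illustration under stated assumptions, not the paper's implementation: the names `sum_pool`, `din_pool`, and `activation_weight` are hypothetical, and `activation_weight` uses a fixed ReLU of a dot product as a stand-in for the local activation unit, which the abstract describes as learned.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sum_pool(behaviors: Sequence[Vector]) -> Vector:
    """Fixed-length user vector that ignores the candidate ad."""
    dim = len(behaviors[0])
    return [sum(b[i] for b in behaviors) for i in range(dim)]


def activation_weight(behavior: Vector, ad: Vector) -> float:
    """ASSUMPTION: hand-written stand-in for the learned activation unit.

    Scores how relevant one historical behavior is to the candidate ad.
    """
    return max(0.0, dot(behavior, ad))


def din_pool(
    behaviors: Sequence[Vector],
    ad: Vector,
    weight_fn: Callable[[Vector, Vector], float] = activation_weight,
) -> Vector:
    """User vector computed with respect to a specific candidate ad.

    Each behavior embedding is scaled by its relevance to the ad
    before being summed, so the result changes from ad to ad.
    """
    pooled = [0.0] * len(ad)
    for b in behaviors:
        w = weight_fn(b, ad)
        for i in range(len(ad)):
            pooled[i] += w * b[i]
    return pooled
```

With behaviors `[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]`, `sum_pool` returns `[2.0, 1.0]` for every ad, while `din_pool` returns `[2.0, 0.0]` for the ad `[1.0, 0.0]` and `[0.0, 1.0]` for the ad `[0.0, 1.0]`, so the user representation varies with the candidate.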

Index Terms

Deep Interest Network for Click-Through Rate Prediction
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems
  2. World Wide Web
    1. Online advertising
      1. Display advertising

Published in
KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN: 9781450355520
DOI: 10.1145/3219819
General Chairs: Yike Guo (Imperial College London), Faisal Farooq (IBM)
Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
click-through rate prediction
display advertising
e-commerce
Acceptance Rates
KDD '18 paper acceptance rate: 107 of 983 submissions (11%). Overall acceptance rate: 1,133 of 8,635 submissions (13%).