ABSTRACT
Tasks such as search and recommendation have become increasingly important for E-commerce to deal with the information overload problem. To meet the diverse needs of different users, personalization plays an important role. In many large portals such as Taobao and Amazon, there are a bunch of different types of search and recommendation tasks operating simultaneously for personalization. However, most of current techniques address each task separately. This is suboptimal as no information about users shared across different tasks.
In this work, we propose to learn universal user representations across multiple tasks for more effective personalization. In particular, user behavior sequences (e.g., click, bookmark or purchase of products) are modeled by LSTM and attention mechanism by integrating all the corresponding content, behavior and temporal information. User representations are shared and learned in an end-to-end setting across multiple tasks. Benefiting from better information utilization of multiple tasks, the user representations are more effective to reflect their interests and are more general to be transferred to new tasks. We refer this work as Deep User Perception Network (DUPN) and conduct an extensive set of offline and online experiments. Across all tested five different tasks, our DUPN consistently achieves better results by giving more effective user representations. Moreover, we deploy DUPN in large scale operational tasks in Taobao. Detailed implementations, e.g., incremental model updating, are also provided to address the practical issues for the real world applications.
Supplemental Material
- Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. 2016. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 (2016).Google Scholar
- Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence, Vol. 35, 8 (2013), 1798--1828. Google ScholarDigital Library
- Yoshua Bengio and Olivier Delalleau. 2011. On the expressive power of deep architectures. In Algorithmic Learning Theory. Springer, 18--36. Google ScholarDigital Library
- Adam L. Berger, Stephen A. Della Pietra, and Vincent J. Della Pietra. 1996. A Maximum Entropy approach to Natural Language Processing COMPUTATIONAL LINGUISTICS. 39--71. Google ScholarDigital Library
- Fedor Borisyuk, Liang Zhang, and Krishnaram Kenthapadi. 2017. LiJAR: A system for job application redistribution towards efficient career marketplace Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1397--1406. Google ScholarDigital Library
- Xinchi Chen, Zhan Shi, Xipeng Qiu, and Xuanjing Huang. 2017. Adversarial Multi-Criteria Learning for Chinese Word Segmentation. NIPS.Google Scholar
- Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide &deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7--10. Google ScholarDigital Library
- Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research Vol. 12, Aug (2011), 2493--2537. Google ScholarDigital Library
- Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep neural networks for youtube recommendations. Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 191--198. Google ScholarDigital Library
- John Duchi, Elad Hazan, and Yoram Singer. 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research Vol. 12, Jul (2011), 2121--2159. Google ScholarDigital Library
- Dinesh Gopinath and Michael Strickman. 2010. Personalized advertising and recommendation. (Aug. 30. 2010). US Patent App. 12/871,416.Google Scholar
- Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015).Google Scholar
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput., Vol. 9, 8 (Nov. 1997), 1735--1780. Google ScholarDigital Library
- Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning deep structured semantic models for web search using clickthrough data Proceedings of the 22nd ACM international conference on Conference on information &knowledge management. ACM, 2333--2338. Google ScholarDigital Library
- Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer, Vol. 42, 8 (2009). Google ScholarDigital Library
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. Imagenet classification with deep convolutional neural networks Advances in neural information processing systems. 1097--1105. Google ScholarDigital Library
- Greg Linden, Brent Smith, and Jeremy York. 2003. Amazon. com recommendations: Item-to-item collaborative filtering. IEEE Internet computing Vol. 7, 1 (2003), 76--80. Google ScholarDigital Library
- Xiaodong Liu, Jianfeng Gao, Xiaodong He, Li Deng, Kevin Duh, and Ye-Yi Wang. 2015. Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval. In HLT-NAACL. 912--921.Google Scholar
- Brendan McMahan. 2011. Follow-the-regularized-leader and mirror descent: Equivalence theorems and l1 regularization Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 525--533.Google Scholar
- Tomas Mikolov, Greg Corrado, Kai Chen, Jeffrey Dean, Tomas Mikolov, Greg Corrado, Kai Chen, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space International Conference on Learning Representations. 1--12.Google Scholar
- Kyo-Joong Oh, Won-Jo Lee, Chae-Gyun Lim, and Ho-Jin Choi. 2014. Personalized news recommendation using classified keywords to capture user preference Advanced Communication Technology (ICACT), 2014 16th International Conference on. IEEE, 1283--1287.Google Scholar
- Bharath Ramsundar, Steven Kearnes, Patrick Riley, Dale Webster, David Konerding, and Vijay Pande. 2015. Massively multitask networks for drug discovery. arXiv preprint arXiv:1502.02072 (2015).Google Scholar
- Rajeev Ranjan, Vishal M Patel, and Rama Chellappa. 2016. Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. arXiv preprint arXiv:1603.01249 (2016).Google Scholar
- Ruslan Salakhutdinov, Andriy Mnih, and Geoffrey Hinton. 2007. Restricted Boltzmann machines for collaborative filtering Proceedings of the 24th international conference on Machine learning. ACM, 791--798. Google ScholarDigital Library
- Suvash Sedhain, Aditya Krishna Menon, Scott Sanner, and Lexing Xie. 2015. Autorec: Autoencoders meet collaborative filtering Proceedings of the 24th International Conference on World Wide Web. ACM, 111--112. Google ScholarDigital Library
- Michael L. Seltzer and Jasha Droppo. 2013. Multi-task learning in deep neural networks for improved phoneme recognition Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 6965--6969.Google Scholar
- Nitish Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. Journal of machine learning research Vol. 15, 1 (2014), 1929--1958. Google ScholarDigital Library
- Yong Kiam Tan, Xinxing Xu, and Yong Liu. 2016. Improved recurrent neural networks for session-based recommendations Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 17--22. Google ScholarDigital Library
- Yury Ustinovskiy, Gleb Gusev, and Pavel Serdyukov. 2015. An optimization framework for weighting implicit relevance labels for personalized web search Proceedings of the 24th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1144--1154. Google ScholarDigital Library
- Aaron Van den Oord, Sander Dieleman, and Benjamin Schrauwen. 2013. Deep content-based music recommendation. In Advances in neural information processing systems. 2643--2651. Google ScholarDigital Library
- Hao Wang, Naiyan Wang, and Dit-Yan Yeung. 2015. Collaborative deep learning for recommender systems Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1235--1244. Google ScholarDigital Library
- Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to rank with selection bias in personal search Proceedings of the 40th international ACM SIGIR conference on Research and development in Information Retrieval. ACM, 115--124. Google ScholarDigital Library
- Yao Wu, Christopher DuBois, Alice X Zheng, and Martin Ester. 2016. Collaborative denoising auto-encoders for top-n recommender systems Proceedings of the Ninth ACM International Conference on Web Search and Data Mining. ACM, 153--162. Google ScholarDigital Library
- Shuangfei Zhai, Keng-hao Chang, Ruofei Zhang, and Zhongfei Mark Zhang. 2016. Deepintent: Learning attentions for online advertising with recurrent neural networks Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1295--1304. Google ScholarDigital Library
- Yongfeng Zhang, Min Zhang, Yi Zhang, Guokun Lai, Yiqun Liu, Honghui Zhang, and Shaoping Ma. 2015. Daily-aware personalized recommendation based on feature-level time series analysis Proceedings of the 24th international conference on world wide web. International World Wide Web Conferences Steering Committee, 1373--1383. Google ScholarDigital Library
- Zhanpeng Zhang, Ping Luo, Chen Change Loy, and Xiaoou Tang. 2014. Facial landmark detection by deep multi-task learning European Conference on Computer Vision. Springer, 94--108.Google Scholar
- Lei Zheng, Vahid Noroozi, and Philip S. Yu. 2017. Joint deep modeling of users and items using reviews for recommendation Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. ACM, 425--434. Google ScholarDigital Library
Index Terms
- Perceive Your Users in Depth: Learning Universal User Representations from Multiple E-commerce Tasks
Recommendations
Adversarial Multimodal Representation Learning for Click-Through Rate Prediction
WWW '20: Proceedings of The Web Conference 2020For better user experience and business effectiveness, Click-Through Rate (CTR) prediction has been one of the most important tasks in E-commerce. Although extensive CTR prediction models have been proposed, learning good representation of items from ...
Deep Time-Aware Item Evolution Network for Click-Through Rate Prediction
CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge ManagementFor better user satisfaction and business effectiveness, Click-Through Rate (CTR) prediction is one of the most important tasks in E-commerce. It is often the case that users' interests different from their past routines may emerge or impressions such ...
Social recommendation based on users’ attention and preference
AbstractAttention is the behavioral and cognitive process of selectively concentrating on small fraction of information while ignoring other perceivable information. Thus, user’s attention will influence his decision on the consumption and ...
Comments