DOI: 10.1145/2783258.2788579
Research article · KDD Conference Proceedings

One-Pass Ranking Models for Low-Latency Product Recommendations

Published: 10 August 2015

ABSTRACT

Purchase logs collected in e-commerce platforms provide rich information about customer preferences. These logs can be leveraged to improve the quality of product recommendations by feeding them to machine-learned ranking models. However, a variety of deployment constraints limit the naive applicability of machine learning to this problem. First, the volume and dimensionality of the data make in-memory learning simply impossible. Second, the drift of customers' preferences over time requires retraining the ranking model regularly on freshly collected data, which limits the time available for training to prohibitively short intervals. Third, ranking in real time is necessary whenever the query complexity prevents us from caching the predictions. This constraint requires minimizing prediction time (or, equivalently, maximizing data throughput), which in turn may prevent us from achieving the accuracy necessary in web-scale industrial applications. In this paper, we investigate how the practical challenges faced in this setting can be tackled via an online learning-to-rank approach. Sparse models are the key to reducing prediction latency, whereas one-pass stochastic optimization minimizes the training time and restricts the memory footprint. Interestingly, and perhaps surprisingly, extensive experiments show that one-pass learning preserves most of the predictive performance. Additionally, we study a variety of online learning algorithms that enforce sparsity and provide insights to help the practitioner make an informed decision about which approach to pick. We report results on a massive purchase log dataset from the Amazon retail website, as well as on several benchmarks from the LETOR corpus.
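
To make the setting concrete: one way to realize the recipe described above is to train a linear ranking model in a single pass over the log with a stochastic update, interleaving a sparsity-inducing (L1-style) step so the final model touches few features at prediction time. The following is a minimal sketch under those assumptions, not the authors' implementation; the pairwise hinge loss, the FOBOS-style soft-thresholding step, and the names and hyperparameters (eta, lam) are illustrative choices.

    import numpy as np

    def train_one_pass(pairs, dim, eta=0.1, lam=1e-4):
        # Single pass over (preferred, non-preferred) feature-vector pairs.
        # pairs: iterable of (x_pos, x_neg) tuples of length-`dim` numpy arrays
        # eta:   learning rate
        # lam:   L1 strength; larger values give sparser (lower-latency) models
        w = np.zeros(dim)
        for x_pos, x_neg in pairs:              # each training pair is seen exactly once
            diff = x_pos - x_neg
            if w @ diff < 1.0:                  # pairwise hinge loss is violated
                w += eta * diff                 # stochastic (sub)gradient step
            # soft-thresholding (proximal) step drives small weights exactly to zero
            w = np.sign(w) * np.maximum(np.abs(w) - eta * lam, 0.0)
        return w

    def rank_candidates(w, candidates):
        # Score candidate products with the sparse model and return indices
        # sorted from highest to lowest score.
        return np.argsort(-(candidates @ w))

In such a sketch, larger values of lam zero out more weights, trading a small amount of ranking accuracy for lower prediction latency and a smaller memory footprint, which is the trade-off the paper studies across several sparsity-enforcing online algorithms.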


Supplemental Material

p1789.mp4 (MP4 video, 140.8 MB)


Published in

KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015, 2378 pages
ISBN: 978-1-4503-3664-2
DOI: 10.1145/2783258

Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States



Acceptance Rates

KDD '15 paper acceptance rate: 160 of 819 submissions, 20%
Overall acceptance rate: 1,133 of 8,635 submissions, 13%
