DOI: 10.1145/3292500.3330677
research-article
Open Access

TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank

Published: 25 July 2019

ABSTRACT

Learning-to-Rank deals with maximizing the utility of a list of examples presented to the user, with items of higher relevance being prioritized. It has several practical applications such as large-scale search, recommender systems, document summarization and question answering. While there is widespread support for classification and regression based learning, support for learning-to-rank in deep learning has been limited. We introduce TensorFlow Ranking, the first open source library for solving large-scale ranking problems in a deep learning framework. It is highly configurable and provides easy-to-use APIs to support different scoring mechanisms, loss functions and evaluation metrics in the learning-to-rank setting. Our library is developed on top of TensorFlow and can thus fully leverage the advantages of this platform. TensorFlow Ranking has been deployed in production systems within Google; it is highly scalable, both in training and in inference, and can be used to learn ranking models over massive amounts of user activity data, which can include heterogeneous dense and sparse features. We empirically demonstrate the effectiveness of our library in learning ranking functions for large-scale search and recommendation applications in Gmail and Google Drive. We also show that ranking models built using our library scale well for distributed training, without significant impact on metrics. The proposed library is available to the open source community, with the hope that it facilitates further academic research and industrial applications in the field of learning-to-rank.
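To make the three ingredients named in the abstract concrete (item scores, a loss function, and an evaluation metric), the following is a minimal sketch in plain Python. It does not use or reproduce the TensorFlow Ranking API. The choice of a pairwise logistic loss and NDCG, and the function names `dcg`, `ndcg`, and `pairwise_logistic_loss`, are illustrative assumptions, not taken from the abstract.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the top-k positions (gain 2^rel - 1)."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg(scores, labels, k):
    """NDCG@k: DCG of the list ordered by score, divided by the ideal DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal_dcg = dcg(sorted(labels, reverse=True), k)
    return dcg(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def pairwise_logistic_loss(scores, labels):
    """Sum of log(1 + exp(-(s_i - s_j))) over pairs with label_i > label_j."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                loss += math.log1p(math.exp(-(scores[i] - scores[j])))
    return loss


# Hypothetical list of three items with graded relevance labels.
labels = [3, 0, 1]
good_scores = [2.0, -1.0, 0.5]   # orders items as relevance would
bad_scores = [-1.0, 2.0, 0.5]    # puts the irrelevant item first

print(ndcg(good_scores, labels, k=3), pairwise_logistic_loss(good_scores, labels))
print(ndcg(bad_scores, labels, k=3), pairwise_logistic_loss(bad_scores, labels))
```

A scoring function that ranks the most relevant item first attains the maximum NDCG and a lower pairwise loss than one that inverts the order, which is the sense in which a loss serves as a trainable surrogate for the ranking metric.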


Supplemental Material

p2970-kumar.mp4 (mp4, 1,014.9 MB)
