DOI: 10.1145/3097983.3098171 · KDD Conference Proceedings
Research article · Open access

TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks

Published: 13 August 2017

Abstract

We present a framework for specifying, training, evaluating, and deploying machine learning models. Our focus is on simplifying cutting-edge machine learning for practitioners in order to bring such technologies into production. Recognizing the fast evolution of the field of deep learning, we make no attempt to capture the design space of all possible model architectures in a domain-specific language (DSL) or similar configuration language. We allow users to write code to define their models, but provide abstractions that guide developers to write models in ways conducive to productionization. We also provide a unifying Estimator interface, making it possible to write downstream infrastructure (e.g., distributed training, hyperparameter tuning) independently of the model implementation.
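As a sketch of what that unifying interface enables, the plain-Python mock below captures the contract that downstream infrastructure relies on: train, evaluate, and predict driven by a user-supplied model_fn and input_fn. This class is an illustrative mimic of the design described here, not TensorFlow's actual implementation.

```python
# Illustrative mimic of the Estimator contract (not TensorFlow's code):
# downstream tooling sees only train/evaluate/predict, while the model
# itself lives entirely inside a user-supplied model_fn.

class Estimator:
    def __init__(self, model_fn, params=None):
        self.model_fn = model_fn      # user code that defines the model
        self.params = params or {}

    def train(self, input_fn, steps):
        for _ in range(steps):
            features, labels = input_fn()
            spec = self.model_fn(features, labels, "train", self.params)
            spec["train_op"]()        # run one optimization step

    def evaluate(self, input_fn):
        features, labels = input_fn()
        spec = self.model_fn(features, labels, "eval", self.params)
        return spec["metrics"]

    def predict(self, input_fn):
        features, _ = input_fn()
        spec = self.model_fn(features, None, "predict", self.params)
        return spec["predictions"]
```

Because tools such as a hyperparameter tuner touch only this surface, they can drive any model_fn without knowing its internals.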
We balance the competing demands for flexibility and simplicity by offering APIs at different levels of abstraction, making common model architectures available out of the box while providing a library of utilities designed to speed up experimentation with model architectures. To make out-of-the-box models flexible and usable across a wide range of problems, these canned Estimators are parameterized not only over traditional hyperparameters, but also using feature columns, a declarative specification describing how to interpret input data.
We discuss our experience in using this framework in research and production environments, and show the impact on code health, maintainability, and development speed.
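To make the feature-column idea concrete, here is a minimal sketch of declarative input specifications driving a shared input layer. The names numeric_column and bucketized_column echo TensorFlow's feature-column vocabulary, but the implementations below are simplified illustrations, not the library's own.

```python
import bisect

# A feature column is a declarative spec: it names a raw input field and
# says how to turn it into model-ready values. A canned model can then be
# parameterized by a list of such specs instead of custom input code.

def numeric_column(key):
    # pass the raw value through as a single float
    return lambda row: [float(row[key])]

def bucketized_column(key, boundaries):
    # one-hot encode which bucket the raw value falls into
    def transform(row):
        idx = bisect.bisect_right(boundaries, float(row[key]))
        onehot = [0.0] * (len(boundaries) + 1)
        onehot[idx] = 1.0
        return onehot
    return transform

def input_layer(row, columns):
    # concatenate all transformed features into one dense vector
    vec = []
    for col in columns:
        vec.extend(col(row))
    return vec

columns = [numeric_column("age"),
           bucketized_column("income", boundaries=[30000, 70000])]
print(input_layer({"age": 42, "income": 55000}, columns))
# → [42.0, 0.0, 1.0, 0.0]
```

The same column list can parameterize a linear model, a deep network, or both, which is how a single declarative description serves many canned Estimators.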


Cited By

  • (2023) Cougar: A General Framework for Jobs Optimization In Cloud. 2023 IEEE 39th International Conference on Data Engineering (ICDE), 3417-3429. DOI: 10.1109/ICDE55515.2023.00262
  • (2023) Scavenger: A Cloud Service For Optimizing Cost and Performance of ML Training. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 403-413. DOI: 10.1109/CCGrid57682.2023.00045
  • (2022) PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems. 2022 IEEE 38th International Conference on Data Engineering (ICDE), 3453-3466. DOI: 10.1109/ICDE53745.2022.00324
  • (2022) Survey: Tensorflow in Machine Learning. Journal of Physics: Conference Series, 2273(1), 012008. DOI: 10.1088/1742-6596/2273/1/012008
  • (2022) Fast training of a transformer for global multi-horizon time series forecasting on tensor processing units. The Journal of Supercomputing, 79(8), 8475-8498. DOI: 10.1007/s11227-022-05009-x
  • (2021) Data-based algorithms for decision support using artificial neural networks ("Datenbasierte Algorithmen zur Unterstützung von Entscheidungen mittels künstlicher neuronaler Netze"). Data Science, 209-224. DOI: 10.1007/978-3-658-33403-1_13
  • (2021) Multi-modem Implementation Method Based on Deep Autoencoder Network. Wireless and Satellite Systems, 487-501. DOI: 10.1007/978-3-030-69072-4_40
  • (2021) Adaptive modem and interference suppression based on deep learning. Transactions on Emerging Telecommunications Technologies. DOI: 10.1002/ett.4220
  • (2020) Construction of Deep Convolutional Neural Networks For Medical Image Classification. International Journal of Computer Vision and Image Processing, 9(2), 1-15. DOI: 10.4018/IJCVIP.2019040101
  • (2020) aeSpTV: An Adaptive and Efficient Framework for Sparse Tensor-Vector Product Kernel on a High-Performance Computing Platform. IEEE Transactions on Parallel and Distributed Systems, 31(10), 2329-2345. DOI: 10.1109/TPDS.2020.2990429

    Published In

    KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    August 2017, 2240 pages
    ISBN: 9781450348874
    DOI: 10.1145/3097983

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. deep learning
    2. high level api
    3. machine learning framework


    Conference

    KDD '17

    Acceptance Rates

    KDD '17 paper acceptance rate: 64 of 748 submissions (9%)
    Overall acceptance rate: 1,133 of 8,635 submissions (13%)

