DOI: 10.1145/3152434.3152435

Research article

Harvesting Randomness to Optimize Distributed Systems

Published: 30 November 2017

ABSTRACT

We view randomization through the lens of statistical machine learning: as a powerful resource for offline optimization. Cloud systems make randomized decisions all the time (e.g., in load balancing), yet this randomness is rarely used for optimization after the fact. By casting system decisions in the framework of reinforcement learning, we show how to collect data from existing systems, without modifying them, to evaluate new policies, without deploying them. Our methodology, called harvesting randomness, has the potential to accurately estimate a policy's performance without the risk or cost of deploying it on live traffic. We quantify this optimization power and apply it to a real machine health scenario in Azure Compute. We also apply it to two prototyped scenarios, for load balancing (Nginx) and caching (Redis), with much less success, and use them to identify the systems and machine learning challenges to achieving our goal.

Our long-term agenda is to harvest the randomness in distributed systems to develop non-invasive and efficient techniques for optimizing them. Like CPU cycles and bandwidth, we view randomness as a valuable resource being wasted by the cloud, and we seek to remedy this.
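The counterfactual evaluation the abstract describes, using logged randomized decisions to score a policy that was never deployed, is commonly done with inverse propensity scoring (IPS), per Horvitz and Thompson. The sketch below is illustrative only, not code from the paper; the log format, the `ips_estimate` helper, and the load-balancing toy data are all assumptions:

```python
def ips_estimate(logs, new_policy):
    """Estimate the average reward of `new_policy` from logged data.

    logs: list of (context, action, probability, reward) tuples recorded
    by the running system, where `probability` is the chance the logging
    (randomized) policy gave to the action it actually took.
    """
    total = 0.0
    for context, action, prob, reward in logs:
        # Keep only the logged decisions the new policy agrees with,
        # and reweight their rewards by the inverse logging probability.
        if new_policy(context) == action:
            total += reward / prob
    return total / len(logs)

# Toy example: a load balancer picked between two backends uniformly at
# random (prob = 0.5) and logged a reward (higher is better).
logs = [
    ({"load": 0.2}, "backend_a", 0.5, 1.0),
    ({"load": 0.9}, "backend_a", 0.5, 0.1),
    ({"load": 0.8}, "backend_b", 0.5, 0.9),
    ({"load": 0.3}, "backend_b", 0.5, 0.6),
]

# Candidate policy: route high-load requests to backend_b.
policy = lambda ctx: "backend_b" if ctx["load"] > 0.5 else "backend_a"

print(ips_estimate(logs, policy))  # 0.95
```

The estimator is unbiased only because the system's decisions were genuinely randomized with known probabilities; deterministic logs provide no such counterfactual signal, which is why the wasted randomness is the resource being harvested.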

Supplemental Material

lecuyer.mp4 (mp4, 1 GB)


Published in

HotNets '17: Proceedings of the 16th ACM Workshop on Hot Topics in Networks
November 2017, 206 pages
ISBN: 9781450355698
DOI: 10.1145/3152434

            Copyright © 2017 ACM


Publisher: Association for Computing Machinery, New York, NY, United States



Acceptance Rates

HotNets '17 paper acceptance rate: 28 of 124 submissions (23%). Overall acceptance rate: 110 of 460 submissions (24%).
