skip to main content
10.1145/3267809.3267818acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Elasecutor: Elastic Executor Scheduling in Data Analytics Systems

Published:11 October 2018Publication History

ABSTRACT

Modern data analytics systems use long-running executors to run an application's entire DAG. Executors exhibit salient time-varying resource requirements. Yet, existing schedulers simply reserve resources for executors statically, and use the peak resource demand to guide executor placement. This leads to low utilization and poor application performance.

We present Elasecutor, a novel executor scheduler for data analytics systems. Elasecutor dynamically allocates and explicitly sizes resources to executors over time according to the predicted time-varying resource demands. Rather than placing executors using their peak demand, Elasecutor strategically assigns them to machines based on a concept called dominant remaining resource to minimize resource fragmentation. Elasecutor further adaptively reprovisions resources in order to tolerate inaccurate demand prediction. Testbed evaluation on a 35-node cluster with our Spark-based prototype implementation shows that Elasecutor reduces makespan by more than 42% on average, reduces median application completion time by up to 40%, and improves cluster utilization by up to 55% compared to existing work.

References

  1. Apache Flink. http://flink.apache.org.Google ScholarGoogle Scholar
  2. Apache Hadoop. http://hadoop.apache.org.Google ScholarGoogle Scholar
  3. Apache Spark. https://spark.apache.org.Google ScholarGoogle Scholar
  4. Apache Storm. http://storm.apache.org.Google ScholarGoogle Scholar
  5. Apache Tez. http://tez.apache.org.Google ScholarGoogle Scholar
  6. Capacity Scheduler. http://bit.ly/1tGpbDN.Google ScholarGoogle Scholar
  7. Cluster Mode Overview. https://spark.apache.org/docs/2.1.0/cluster-overview.html.Google ScholarGoogle Scholar
  8. Elasecutor. https://github.com/NetX-lab/Elasecutor.Google ScholarGoogle Scholar
  9. Fair Scheduler. https://spark.apache.org/docs/2.1.0/job-scheduling.html#fair-scheduler-pools.Google ScholarGoogle Scholar
  10. HiBench. https://github.com/intel-hadoop/HiBench.Google ScholarGoogle Scholar
  11. How-to: Tune Your Apache Spark Jobs. http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2.Google ScholarGoogle Scholar
  12. Kubernetes. https://kubernetes.io/.Google ScholarGoogle Scholar
  13. OpenJDK. http://openjdk.java.net.Google ScholarGoogle Scholar
  14. Resource Allocation Policy in Spark 2.1.0. https://spark.apache.org/docs/2.1.0/job-scheduling.html#resource-allocation-policy.Google ScholarGoogle Scholar
  15. S. Agarwal, S. Kandula, N. Bruno, M.-C. Wu, I. Stoica, and J. Zhou. Re-optimizing Data Parallel Computing. In Proc. USENIX NSDI, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. K. Aguilera, N. Amit, I. Calciu, X. Deguillard, J. Gandhi, P. Subrahmanyam, L. Suresh, K. Tati, R. Venkatasubramanian, and M. Wei. Remote Memory in the Age of Fast Networks. In Proc. ACM SoCC, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. O. Alipourfard, H. H. Liu, J. Chen, S. Venkataraman, M. Yu, and M. Zhang. CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics. In Proc. USENIX NSDI, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Bergstra, R. Bardenet, Y. Bengio, and B. Kegl. Algorithms for Hyper-Parameter Optimization. In Proc. NIPS, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. E. Boutin, J. Ekanayake, W. Lin, B. Shi, J. Zhou, Z. Qian, M. Wu, and L. Zhou. Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing. In Proc. USENIX OSDI, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. Chowdhury and I. Stoica. Efficient Coflow Scheduling Without Prior Knowledge. In Proc. ACM SIGCOMM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. C. Curino, D. E. Difallah, C. Douglas, S. Krishnan, R. Ramakrishnan, and S. Rao. Reservation-based Scheduling: If You're Late Don't Blame Us! In Proc. ACM SoCC, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proc. OSDI, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. C. Delimitrou and C. Kozyrakis. Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters. In Proc. ACM ASPLOS, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. C. Delimitrou and C. Kozyrakis. Quasar: Resource-Efficient and QoS-Aware Cluster Management. In Proc. ACM ASPLOS, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola, and V. Vapnik. Support Vector Regression Machines. In Proc. NIPS, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. A. D. Ferguson, P. Bodik, S. Kandula, E. Boutin, and R. Fonseca. Jockey: Guaranteed Job Latency in Data Parallel Clusters. In Proc. ACM EuroSys, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. P. Garefalakis, K. Karanasos, P. Pietzuch, A. Suresh, and S. Rao. Medea: Scheduling of Long Running Applications in Shared Production Clusters. In Proc. ACM EuroSys, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. M. R. Garey, R. L. Graham, and J. D. Ullman. Worst-case Analysis of Memory Allocation Algorithms. In Proc. ACM STOC, 1972. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. In Proc. USENIX NSDI, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Z. Gong, X. Gu, and J. Wilkes. PRESS: PRedictive Elastic ReSource Scaling for cloud systems. In Proc. IEEE CNSM, 2010.Google ScholarGoogle Scholar
  31. R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella. Multi-Resource Packing for Cluster Schedulers. In Proc. ACM SIGCOMM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. R. Grandl, M. Chowdhury, A. Akella, and G. Ananthanarayanan. Altruistic Scheduling in Multi-Resource Clusters. In Proc. USENIX OSDI, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. R. Grandl, S. Kandula, S. Rao, A. Akella, and J. Kulkarni. Graphene: Packing and Dependency-Aware Scheduling for Data-Parallel Clusters. In Proc. USENIX OSDI, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. M. Grzegorz, H. A. Matthew, J. C. B. Aart, C. D. James, H. Ilan, L. Naty, and C. Grzegorz. Pregel: A System for Large-Scale Graph Processing. In Proc. ACM SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. In Proc. USENIX NSDI, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. C. Iorgulescu, F. Dinu, A. Raza, W. U. Hassan, and W. Zwaenepoel. Don't cry over spilled records: Memory elasticity of data-parallel applications and its application to cluster scheduling. In Proc. USENIX ATC, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: Distributed Data-parallel Programs from Sequential Building Blocks. In Proc. ACM EuroSys, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg. Quincy: Fair Scheduling for Distributed Computing Clusters. In Proc. ACM SOSP, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. E. G. Joseph, S. X. Reynold, D. Ankur, C. Daniel, J. F. Michael, and S. Ion. GraphX: Graph Processing in a Distributed Dataflow Framework. In Proc. USENIX OSDI, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. K. Karanasos, S. Rao, C. Curino, C. Douglas, K. Chaliparambil, G. M. Fumarola, S. Heddaya, R. Ramakrishnan, and S. Sakalanaga. Mercury: Hybrid Centralized and Distributed Scheduling in Large Shared Clusters. In Proc. USENIX ATC, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. S. Kim, S. Huh, Y. Hu, X. Zhang, E. Witchel, A. Wated, and M. Silberstein. GPUnet: Networking Abstractions for GPU Programs. In Proc. USENIX OSDI, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. R. Kondor and T. Jebara. A Kernel Between Sets of Vectors. In Proc. ICML, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, and S. Shenker. Onix: A distributed control platform for large-scale production networks. In Proc. OSDI, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. M. Kornacker, A. Behm, V. Bittorf, T. Bobrovytsky, C. Ching, A. Choi, J. Erickson, M. Grund, D. Hecht, M. Jacobs, I. Joshi, L. Kuff, D. Kumar, A. Leblang, N. Li, I. Pandis, H. Robinson, D. Rorke, S. Rus, J. Russell, D. Tsirogiannis, S. Wanderman-Milne, and M. Yoder. Impala: A Modern, Open-Source SQL Engine for Hadoop. In Proc. CIDR, 2015.Google ScholarGoogle Scholar
  45. M. Korupolu, A. Singh, and B. Bamba. Coupled placement in modern data centers. In Proc. IPTPS, 2009.Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. A. Kuzmanovska, R. H. Mak, and D. Epema. KOALA-F: A Resource Manager for Scheduling Frameworks in Clusters. In Proc. IEEE/ACM CCGrid, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. B. Li, K. Tan, L. L. Luo, Y. Peng, R. Luo, N. Xu, Y. Xiong, P. Cheng, and E. Chen. ClickNP: Highly Flexible and High Performance Network Processing with Reconfigurable Hardware. In Proc. ACM SIGCOMM, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Y. Lu, A. Chowdhery, and S. Kandula. Optasia: A Relational Platform for Efficient Large-Scale Video Analytics. In Proc. ACM SoCC, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. H. Mao, M. Alizadeh, I. Menache, and S. Kandula. Resource Management with Deep Reinforcement Learning. In Proc. ACM HotNets, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Z. Matei, D. Tathagata, L. Haoyuan, H. Timothy, S. Scott, and S. Ion. Discretized Streams: Fault-Tolerant Streaming Computation at Scale. In Proc. ACM SOSP, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. C. Michele, R. Yan, and L. Zheng. Adaptive Kernel Approximation for Large-Scale Non-Linear SVM Prediction. In Proc. ICML, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. K. Morton, M. Balazinska, and D. Grossman. ParaTimer: A Progress Indicator for MapReduce DAGs. In Proc. ACM SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. H. Nguyen, Z. Shen, X. Gu, S. Subbiah, and J. Wilkes. AGILE: Elastic Distributed Resource Scaling for Infrastructure-as-a-Service. In Proc. USENIX ICAC, 2013.Google ScholarGoogle Scholar
  54. K. Ousterhout, C. Canel, S. Ratnasamy, and S. Shenker. Monotasks: Architecting for Performance Clarity in Data Analytics Frameworks. In Proc. ACM SOSP, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. K. Ousterhout, R. Rasti, S. Ratnasamy, S. Shenker, and B.-G. Chun. Making Sense of Performance in Data Analytics Frameworks. In Proc. USENIX NSDI, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. K. Ousterhout, P. Wendell, M. Zaharia, and I. Stoica. Sparrow: Distributed, Low Latency Scheduling. In Proc. ACM SOSP, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. J. Rasley, K. Karanasos, S. Kandula, R. Fonseca, M. Vojnovic, and S. Rao. Efficient queue management for cluster scheduling. In Proc. ACM EuroSys, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. A. Roy, H. Zeng, J. Bagga, G. Porter, and A. C. Snoeren. Inside the Social Network's (Datacenter) Network. In Proc. ACM SIGCOMM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. B. Schoikopr, P. Bartlett, A. Smola, and R. Williamson. Shrinking the Thbe: A New Support Vector Regression Algorithm. In Proc. NIPS, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. M. Schwarzkopf, A. Konwinski, M. Abd-El-Malek, and J. Wilkes. Omega: Flexible, Scalable Schedulers for Large Compute Clusters. In Proc. EuroSys, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. Z. Shen, S. Subbiah, X. Gu, and J. Wilkes. CloudScale: Elastic Resource Scaling for Multi-Tenant Cloud Systems. In Proc. ACM SoCC, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. A. Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon, S. Boving, G. Desai, B. Felderman, P. Germano, A. Kanagala, J. Provost, J. Simmons, E. Tanda, J. Wanderer, U. Hölzle, S. Stuart, and A. Vahdat. Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network. In Proc. ACM SIGCOMM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. A. J. Smola and B. Schölkopf. A Tutorial on Support Vector Regression. http://www.svms.org/regression/SmSc98.pdf, Technical Report, 1998.Google ScholarGoogle Scholar
  64. J. Son, Y. Xiong, K. Tan, P. Wang, Z. Gan, and S. Moon. Protego: Cloud-Scale Multitenant IPsec Gateway. In Proc. USENIX ATC, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. A. Toshniwal, S. Taneja, A. Shukla, K. Ramasamy, J. M. Patel, S. Kulkarni, J. Jackson, K. Gade, M. Fu, J. Donham, N. Bhagat, S. Mittal, and D. Ryaboy. Storm@Twitter. In Proc. ACM SIGMOD, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. A. Toshniwal, S. Taneja, A. Shukla, K. Ramasamy, J. M. Patel, S. Kulkarni, J. Jackson, K. Gade, M. Fu, J. Donham, N. Bhagat, S. Mittal, and D. Ryaboy. Morpheus: Towards Automated SLOs for Enterprise Clusters. In Proc. USENIX OSDI, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth, et al. Apache Hadoop YARN: Yet Another Resource Negotiator. In Proc. ACM SoCC, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. S. Venkataraman, Z. Yang, M. Franklin, B. Recht, and I. Stoica. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics. In Proc. USENIX NSDI, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  69. A. Verma, L. Pedrosa, M. Korupolu, D. Oppenheimer, E. Tune, and J. Wilkes. Large-scale Cluster Management at Google with Borg. In Proc. ACM EuroSys, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. J. Wang and M. Balazinska. Elastic Memory Management for Cloud Data Analytics. In Proc. USENIX ATC, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. M. Weimer, Y. Chen, B.-G. Chun, T. Condie, C. Curino, C. Douglas, Y. Lee, T. Majestro, D. Malkhi, S. Matusevych, B. Myers, S. Narayanamurthy, R. Ramakrishnan, S. Rao, R. Sears, B. Sezgin, and J. Wang. REEF: Retainable Evaluator Execution Framework. In Proc. ACM SIGMOD, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. G. J. Woeginger. There is no asymptotic PTAS for two-dimensional vector packing. Information Processing Letters, 64:293--297, June 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. W. Xia, H. Jiang, D. Feng, and Y. Hua. SiLo: A Similarity-Locality based Near-Exact Deduplication Scheme with Low RAM Overhead and High Throughput. In Proc. USENIX ATC, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. D. Xie, N. Ding, Y. C. Hu, and R. Kompella. The Only Constant is Change: Incorporating Time-Varying Network Reservations in Data Centers. In Proc. ACM SIGCOMM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. G. Xu and C.-Z. Xu. Prometheus: Online Estimation of Optimal Memory Demands for Workers in In-memory Distributed Computation. In Proc. ACM SoCC poster, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. G. Xu, C.-Z. Xu, and S. Jiang. Prophet: Scheduling Executors with Time-varying Resource Demands on Data-Parallel Computation Frameworks. In Proc. USENIX ICAC, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  77. Y. Yan, Y. Gao, Y. Chen, Z. Guo, B. Chen, and T. Moscibroda. TR-Spark: Transient Computing for Big Data Analytics. In Proc. ACM SoCC, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, and I. Stoica. Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling. In Proc. ACM EuroSys, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In Proc. USENIX NSDI, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  80. H. Zhang, L. Stafman, A. Or, and M. J. Freedman. SLAQ: Quality-Driven Scheduling for Distributed Machine Learning. In Proc. ACM SoCC, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Elasecutor: Elastic Executor Scheduling in Data Analytics Systems

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          SoCC '18: Proceedings of the ACM Symposium on Cloud Computing
          October 2018
          546 pages
          ISBN:9781450360111
          DOI:10.1145/3267809

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 11 October 2018

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate169of722submissions,23%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader