DOI: 10.1145/2465351.2465386

Omega: flexible, scalable schedulers for large compute clusters

Published: 15 April 2013

ABSTRACT

Increasing scale and the need for rapid response to changing requirements are hard to meet with current monolithic cluster scheduler architectures. This restricts the rate at which new features can be deployed, decreases efficiency and utilization, and will eventually limit cluster growth. We present a novel approach to address these needs using parallelism, shared state, and lock-free optimistic concurrency control.

We compare this approach to existing cluster scheduler designs, evaluate how much interference between schedulers occurs and how much it matters in practice, present some techniques to alleviate it, and finally discuss a use case highlighting the advantages of our approach -- all driven by real-life Google production workloads.
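
To make the shared-state idea concrete, the sketch below illustrates the commit-or-retry pattern that lock-free optimistic concurrency control implies for parallel schedulers: each scheduler plans against a private snapshot of cluster state and then attempts an atomic commit that fails if a competing scheduler got there first. This is a minimal, hypothetical illustration only; the names (CellState, try_commit, schedule) and the single global version counter are assumptions for the example, not the paper's implementation.

import threading

class CellState:
    """Toy shared cluster state: free CPU cores per machine plus a version counter."""

    def __init__(self, free_cpus):
        self.free_cpus = dict(free_cpus)   # machine name -> free CPU cores
        self.version = 0                   # bumped on every successful commit
        self._lock = threading.Lock()      # guards commits only, never scheduling work

    def snapshot(self):
        # Schedulers plan against a private copy, without blocking each other.
        with self._lock:
            return self.version, dict(self.free_cpus)

    def try_commit(self, base_version, claims):
        # Atomically apply {machine: cores} only if nothing changed since the snapshot.
        with self._lock:
            if base_version != self.version:
                return False               # another scheduler committed first
            if any(self.free_cpus[m] < c for m, c in claims.items()):
                return False               # the claim no longer fits
            for m, c in claims.items():
                self.free_cpus[m] -= c
            self.version += 1
            return True

def schedule(cell, cores_needed, retries=10):
    """One scheduler: decide on a snapshot, commit optimistically, retry on conflict."""
    for _ in range(retries):
        version, free = cell.snapshot()
        machine = max(free, key=free.get)  # trivial placement policy: emptiest machine
        if free[machine] < cores_needed:
            return None                    # no machine has enough capacity
        if cell.try_commit(version, {machine: cores_needed}):
            return machine                 # resources claimed
        # Conflict with a concurrent scheduler: take a fresh snapshot and retry.
    return None

if __name__ == "__main__":
    cell = CellState({"m1": 4, "m2": 8})
    print(schedule(cell, 2))               # e.g. "m2"
    print(schedule(cell, 6))               # would retry transparently under contention

Conflict detection in this sketch is deliberately coarse: any successful commit invalidates every outstanding snapshot. Finer-grained checks that reject only commits touching the same machines would waste less work, at the cost of more bookkeeping.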

      • Published in

        EuroSys '13: Proceedings of the 8th ACM European Conference on Computer Systems
        April 2013, 401 pages
        ISBN: 9781450319942
        DOI: 10.1145/2465351

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 15 April 2013

        Qualifiers

        • research-article

        Acceptance Rates

        EuroSys '13 paper acceptance rate: 28 of 143 submissions (20%). Overall acceptance rate: 241 of 1,308 submissions (18%).
