DOI: 10.1145/2555243.2555246
research-article

Towards fair and efficient SMP virtual machine scheduling

Published: 06 February 2014

ABSTRACT

As multicore processors become prevalent in modern computer systems, there is a growing need to increase hardware utilization and exploit the parallelism of such platforms. With virtualization technology, hardware utilization is improved by encapsulating independent workloads into virtual machines (VMs) and consolidating them onto the same machine. SMP virtual machines have been widely adopted to exploit parallelism. For virtualized systems, such as a public cloud, fairness between tenants and the efficiency of running their applications are keys to success. However, we find that existing virtualization platforms fail to enforce fairness between VMs with different numbers of virtual CPUs (vCPUs) that run on multiple CPUs. We attribute the unfairness to the use of per-CPU schedulers and to load imbalance across these CPUs, which leads to inaccurate CPU allocations. Unfortunately, existing approaches to reducing unfairness, e.g., dynamic load balancing and CPU capping, introduce significant inefficiencies for parallel workloads.
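The unfairness described above can be illustrated with a toy model. The sketch below is not taken from the paper or from Xen: it assumes that every vCPU is always runnable, that each physical CPU independently shares its time among the vCPUs placed on it in proportion to their weights, and that a VM's weight is split evenly across its vCPUs. The function name `per_cpu_allocation` and the placement are hypothetical.

```python
from collections import defaultdict


def per_cpu_allocation(placement, weights):
    """Return CPU time per VM, in units of one physical CPU.

    placement: {cpu_id: [(vm, vcpu_index), ...]}
    weights:   {(vm, vcpu_index): weight}
    Each CPU is shared proportionally among the vCPUs placed on it.
    """
    alloc = defaultdict(float)
    for vcpus in placement.values():
        total = sum(weights[v] for v in vcpus)
        for v in vcpus:
            alloc[v[0]] += weights[v] / total
    return dict(alloc)


# Assumed setup: VM A has 2 vCPUs, VM B has 1 vCPU, and both VMs have
# weight 1.0, split evenly over their vCPUs.
weights = {("A", 0): 0.5, ("A", 1): 0.5, ("B", 0): 1.0}

# Imbalanced placement: CPU 0 hosts two vCPUs, CPU 1 hosts one.
placement = {0: [("A", 0), ("B", 0)], 1: [("A", 1)]}

alloc = per_cpu_allocation(placement, weights)
print(alloc)
```

With two physical CPUs and equal VM weights, the ideal allocation in this model is one CPU per VM. Under the imbalanced placement, VM A instead receives about 1.33 CPUs and VM B about 0.67, because the vCPU that runs alone on CPU 1 keeps the whole CPU regardless of its weight.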

In this paper, we present Flex, a vCPU scheduling scheme that enforces fairness at the VM level and improves the efficiency of hosted parallel applications. Flex centers on two key designs: (1) dynamically adjusting vCPU weights (FlexW) on multiple CPUs to achieve VM-level fairness and (2) flexibly scheduling vCPUs (FlexS) to minimize wasted busy-waiting time. We have implemented Flex in Xen and performed comprehensive evaluations with various parallel workloads. Results show that Flex achieves CPU allocations with, on average, no more than 5% error compared to the ideal fair allocation. Further, Flex outperforms Xen's credit scheduler and two representative co-scheduling approaches by as much as 10X for parallel applications using busy-waiting or blocking synchronization methods.
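The abstract does not specify how FlexW adjusts weights, so the following is only a generic feedback sketch of the idea of dynamic weight adjustment, not the paper's algorithm. It assumes a toy model in which every vCPU is always runnable and each physical CPU shares its time proportionally by weight; each round, the weights of a VM's vCPUs are scaled by the ratio of its fair share to its measured allocation. The names `allocate` and `rebalance`, the placement, and the fair shares are all hypothetical.

```python
def allocate(placement, weights):
    """CPU time per VM when each CPU is shared proportionally by weight."""
    alloc = {}
    for vcpus in placement.values():
        total = sum(weights[v] for v in vcpus)
        for vm, idx in vcpus:
            alloc[vm] = alloc.get(vm, 0.0) + weights[(vm, idx)] / total
    return alloc


def rebalance(placement, weights, fair, rounds):
    """Scale each VM's vCPU weights by fair / actual for a number of rounds.

    Returns the final allocation and the maximum relative error measured
    at the start of each round.
    """
    weights = dict(weights)  # work on a copy
    errors = []
    for _ in range(rounds):
        alloc = allocate(placement, weights)
        errors.append(max(abs(alloc[vm] - fair[vm]) / fair[vm] for vm in fair))
        for vm, idx in weights:
            # Over-served VMs lose weight, under-served VMs gain weight.
            weights[(vm, idx)] *= fair[vm] / alloc[vm]
    return allocate(placement, weights), errors


# Assumed setup: VM A (2 vCPUs) and VM B (1 vCPU) on 2 physical CPUs.
weights = {("A", 0): 0.5, ("A", 1): 0.5, ("B", 0): 1.0}
placement = {0: [("A", 0), ("B", 0)], 1: [("A", 1)]}
fair = {"A": 1.0, "B": 1.0}  # equal VM-level shares of the 2 CPUs

final, errors = rebalance(placement, weights, fair, rounds=10)
print(final)
print(errors)
```

In this toy setting the error shrinks every round but only approaches zero gradually, since the vCPU that runs alone on its CPU is unaffected by weight changes. This hints at why weight adjustment alone may not suffice and why a real scheduler would also have to consider where and when vCPUs run.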

Published in

PPoPP '14: Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
February 2014
412 pages
ISBN: 9781450326568
DOI: 10.1145/2555243

Copyright © 2014 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

PPoPP '14 Paper Acceptance Rate: 28 of 184 submissions, 15%
Overall Acceptance Rate: 230 of 1,014 submissions, 23%