research-article

Towards fair and efficient SMP virtual machine scheduling

Authors:
Jia Rao
University of Colorado at Colorado Springs, Colorado Springs, CO, USA

Xiaobo Zhou
University of Colorado at Colorado Springs, Colorado Springs, CO, USA

PPoPP '14: Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 2014, Pages 273–286
https://doi.org/10.1145/2555243.2555246

Published: 06 February 2014

ABSTRACT

As multicore processors become prevalent in modern computer systems, there is a growing need to increase hardware utilization and exploit the parallelism of such platforms. Virtualization technology improves hardware utilization by encapsulating independent workloads in virtual machines (VMs) and consolidating them onto the same machine. SMP virtual machines have been widely adopted to exploit parallelism. For virtualized systems, such as a public cloud, fairness between tenants and the efficiency of running their applications are keys to success. However, we find that existing virtualization platforms fail to enforce fairness between VMs with different numbers of virtual CPUs (vCPUs) that run on multiple CPUs. We attribute the unfairness to the use of per-CPU schedulers and to the load imbalance across these CPUs, which together lead to inaccurate CPU allocations. Unfortunately, existing approaches to reducing unfairness, e.g., dynamic load balancing and CPU capping, introduce significant inefficiencies for parallel workloads.
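The effect described above can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes that each CPU independently shares its time among the vCPUs on its run queue in proportion to per-vCPU weight, and that a VM's weight is split evenly across its vCPUs. The function name, the weight value 256, and the two placements are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def per_cpu_allocation(placement, vcpu_weight):
    """Return CPU time per VM (in units of whole CPUs).

    placement:   {cpu_id: [(vm_name, vcpu_index), ...]}  run queue of each CPU
    vcpu_weight: {(vm_name, vcpu_index): weight}
    Assumption: each CPU is shared proportionally among its own vCPUs only.
    """
    vm_share = defaultdict(float)
    for vcpus in placement.values():
        total = sum(vcpu_weight[v] for v in vcpus)
        for v in vcpus:
            vm_share[v[0]] += vcpu_weight[v] / total
    return dict(vm_share)

# Three VMs with the same VM-level weight (256). VM A has two vCPUs,
# so each of its vCPUs carries half of the VM weight (assumed split).
weights = {("A", 0): 128, ("A", 1): 128, ("B", 0): 256, ("C", 0): 256}

# Balanced placement: every CPU holds the same total weight.
balanced = {0: [("A", 0), ("B", 0)], 1: [("A", 1), ("C", 0)]}
# Imbalanced placement: CPU 0 holds less total weight than CPU 1.
imbalanced = {0: [("A", 0), ("A", 1)], 1: [("B", 0), ("C", 0)]}

fair = per_cpu_allocation(balanced, weights)
unfair = per_cpu_allocation(imbalanced, weights)
print(fair)    # each VM receives about 2/3 of a CPU
print(unfair)  # VM A receives a full CPU, B and C half a CPU each
```

In this model the weights are identical in both runs; only the placement of vCPUs on CPUs changes, yet the VM-level allocation shifts from equal shares to a 2:1 advantage for the VM with more vCPUs.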

In this paper, we present Flex, a vCPU scheduling scheme that enforces fairness at the VM level and improves the efficiency of hosted parallel applications. Flex centers on two key designs: (1) dynamically adjusting vCPU weights (FlexW) on multiple CPUs to achieve VM-level fairness and (2) flexibly scheduling vCPUs (FlexS) to minimize wasted busy-waiting time. We have implemented Flex in Xen and performed comprehensive evaluations with various parallel workloads. Results show that Flex achieves CPU allocations that deviate, on average, by no more than 5% from the ideal fair allocation. Further, Flex outperforms Xen's credit scheduler and two representative co-scheduling approaches by as much as 10X for parallel applications using busy-waiting or blocking synchronization methods.
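To make the notion of "error compared to the ideal fair allocation" concrete, the sketch below computes one plausible reading of it. The abstract does not define the ideal allocation or the error metric, so both are assumptions here: the ideal is taken to be a weighted max-min fair share in which a VM can never use more CPUs than it has vCPUs, and the error is the mean relative deviation per VM. The function names and sample numbers are hypothetical, and this is not the FlexW algorithm.

```python
def ideal_fair_allocation(weights, vcpus, num_cpus):
    """Weighted max-min fair CPU share per VM, in units of whole CPUs.

    weights: {vm: VM-level weight}
    vcpus:   {vm: number of vCPUs}; assumed upper bound on a VM's usage
    """
    alloc = {}
    active = set(weights)
    remaining = float(num_cpus)
    while active:
        total_w = sum(weights[vm] for vm in active)
        # VMs whose proportional share would exceed their vCPU count.
        capped = [vm for vm in active
                  if remaining * weights[vm] / total_w >= vcpus[vm]]
        if not capped:
            for vm in active:
                alloc[vm] = remaining * weights[vm] / total_w
            break
        # Pin capped VMs at their limit and redistribute what is left.
        for vm in capped:
            alloc[vm] = float(vcpus[vm])
            remaining -= vcpus[vm]
            active.remove(vm)
    return alloc

def mean_allocation_error(measured, ideal):
    """Mean relative deviation from the ideal allocation (assumed metric)."""
    return sum(abs(measured[vm] - ideal[vm]) / ideal[vm] for vm in ideal) / len(ideal)

# Hypothetical example: 4 CPUs, two equally weighted VMs.
ideal = ideal_fair_allocation({"A": 256, "B": 256}, {"A": 4, "B": 1}, num_cpus=4)
measured = {"A": 3.3, "B": 0.7}  # made-up measurement for illustration
error = mean_allocation_error(measured, ideal)
print(ideal, error)
```

Here VM B is limited to one CPU by its single vCPU, so the remaining three CPUs go to VM A; the made-up measurement deviates from that ideal by 10% for A and 30% for B.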

Index Terms

Towards fair and efficient SMP virtual machine scheduling
1. General and reference
  1. Cross-computing tools and techniques
    1. Measurement
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Process synchronization
        Scheduling

Published in
PPoPP '14: Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
February 2014
412 pages
ISBN: 9781450326568
DOI: 10.1145/2555243
General Chair: José Moreira, IBM Research, USA
Program Chair: James Larus, EPFL, Switzerland

ACM SIGPLAN Notices Volume 49, Issue 8
PPoPP '14
August 2014
390 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2692916
Editors:
Mark W. Bailey, Hamilton College, Clinton, NY
Rajeev Balasubramonian, University of Utah
Al Davis, University of Utah
Sarita Adve, University of Illinois at Urbana-Champ
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
multicore systems
parallel program optimization
virtual machine scheduling
Qualifiers
- research-article
Acceptance Rates
PPoPP '14 Paper Acceptance Rate: 28 of 184 submissions, 15%
Overall Acceptance Rate: 230 of 1,014 submissions, 23%