ABSTRACT
In most enterprises, databases are deployed on dedicated database servers, and these servers are often underutilized much of the time. For example, in traces from almost 200 production servers from different organizations, we see an average CPU utilization of less than 4%. This unused capacity could be harnessed to consolidate multiple databases onto fewer machines, reducing hardware and operational costs. Virtual machine (VM) technology is one popular way to approach this problem. However, as we demonstrate in this paper, VMs fail to adequately support database consolidation, because databases place a unique and challenging set of demands on hardware resources, and these demands do not fit the assumptions made by VM-based consolidation.
Instead, our system for database consolidation, named Kairos, uses novel techniques to measure the hardware requirements of database workloads, as well as models to predict the combined resource utilization of those workloads. We formalize the consolidation problem as a non-linear optimization program that aims to minimize the number of servers and balance load while achieving near-zero performance degradation. We compare Kairos against virtual machines, showing up to 12× higher throughput on a TPC-C-like benchmark. We also tested the effectiveness of our approach on real-world data collected from production servers at Wikia.com, Wikipedia, Second Life, and MIT CSAIL, showing absolute consolidation ratios ranging between 5.5:1 and 17:1.
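To make the idea of consolidating workloads against their combined utilization concrete, the following is a minimal illustrative sketch, not the Kairos method. It replaces the non-linear optimization program with a simple first-fit-decreasing heuristic, and it assumes (hypothetically) that CPU utilization is additive across co-located workloads. The names `peak_combined`, `first_fit_decreasing`, and the `capacity` threshold are illustrative assumptions and do not come from the paper.

```python
from typing import Dict, List


def peak_combined(series_list: List[List[float]]) -> float:
    """Peak of the element-wise sum of several utilization time series.

    Assumption: utilization is additive across co-located workloads.
    """
    return max(sum(values) for values in zip(*series_list))


def first_fit_decreasing(
    workloads: Dict[str, List[float]], capacity: float = 0.8
) -> List[List[str]]:
    """Greedily assign workloads to servers.

    Workloads are visited in order of decreasing individual peak. Each one
    goes to the first server whose combined peak stays within `capacity`;
    otherwise a new server is opened. This is a heuristic stand-in for an
    optimization program, not an optimal solver.
    """
    order = sorted(workloads, key=lambda name: max(workloads[name]), reverse=True)
    servers: List[List[str]] = []
    for name in order:
        for server in servers:
            candidate = [workloads[n] for n in server] + [workloads[name]]
            if peak_combined(candidate) <= capacity:
                server.append(name)
                break
        else:
            # No existing server can host this workload.
            servers.append([name])
    return servers


# Hypothetical traces: CPU utilization as a fraction of one server, per time slot.
traces = {
    "a": [0.5, 0.1, 0.1],
    "b": [0.1, 0.5, 0.1],
    "c": [0.1, 0.1, 0.5],
    "d": [0.6, 0.6, 0.6],
}
placement = first_fit_decreasing(traces, capacity=0.8)
ratio = len(traces) / len(placement)  # consolidation ratio (workloads per server)
```

In this toy input, workloads `a`, `b`, and `c` peak at different times, so they can share a server even though the sum of their individual peaks (1.5) exceeds the capacity. That is why the sketch packs against the combined time series rather than against per-workload peaks.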
Index Terms
- Workload-aware database monitoring and consolidation