ABSTRACT
Interest has been growing in powering datacenters (at least partially) with renewable or "green" sources of energy, such as solar or wind. However, it is challenging to use these sources because, unlike the "brown" (carbon-intensive) energy drawn from the electrical grid, they are not always available. This means that energy demand and supply must be matched, if we are to take full advantage of the green energy to minimize brown energy consumption. In this paper, we investigate how to manage a datacenter's computational workload to match the green energy supply. In particular, we consider data-processing frameworks, in which many background computations can be delayed by a bounded amount of time. We propose GreenHadoop, a MapReduce framework for a datacenter powered by a photovoltaic solar array and the electrical grid (as a backup). GreenHadoop predicts the amount of solar energy that will be available in the near future, and schedules the MapReduce jobs to maximize the green energy consumption within the jobs' time bounds. If brown energy must be used to avoid time bound violations, GreenHadoop selects times when brown energy is cheap, while also managing the cost of peak brown power consumption. Our experimental results demonstrate that GreenHadoop can significantly increase green energy consumption and decrease electricity cost, compared to Hadoop.
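The scheduling idea in the abstract can be illustrated with a minimal sketch. This is not GreenHadoop's actual algorithm, only a simplified greedy stand-in under stated assumptions: time is divided into discrete slots, each job has a known energy demand and a deadline slot, and the solar forecast and brown energy price are given per slot. The names `Job`, `schedule`, `green_forecast`, and `brown_price` are hypothetical, and the sketch ignores peak brown power charges and cluster capacity, both of which the abstract says GreenHadoop manages.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    energy: float   # energy the job needs, in kWh (hypothetical unit)
    deadline: int   # index of the last slot in which the job may run


def schedule(jobs, green_forecast, brown_price):
    """Greedily assign each job's energy demand to time slots.

    Returns (plan, brown_cost), where plan[name] is a list of
    (slot, source, kWh) tuples.
    """
    green_left = list(green_forecast)  # remaining predicted green energy per slot
    plan = {}
    brown_cost = 0.0
    # Earliest deadline first, so tightly bounded jobs pick green energy first.
    for job in sorted(jobs, key=lambda j: j.deadline):
        need = job.energy
        allowed = range(min(job.deadline + 1, len(green_left)))
        plan[job.name] = []
        # 1) Use predicted green energy inside the job's time bound.
        for t in allowed:
            if need <= 0:
                break
            used = min(need, green_left[t])
            if used > 0:
                green_left[t] -= used
                need -= used
                plan[job.name].append((t, "green", used))
        # 2) Cover any remainder with brown energy in the cheapest allowed slot.
        if need > 0:
            t = min(allowed, key=lambda s: brown_price[s])
            brown_cost += need * brown_price[t]
            plan[job.name].append((t, "brown", need))
    return plan, brown_cost


# Hypothetical inputs: four slots, solar output peaking in slot 2,
# and brown energy that is cheaper in the first two slots.
green_forecast = [0.0, 2.0, 5.0, 1.0]
brown_price = [0.08, 0.08, 0.13, 0.13]
jobs = [Job("index", 4.0, 3), Job("report", 3.0, 1)]

plan, cost = schedule(jobs, green_forecast, brown_price)
```

With these inputs, the "report" job must finish before the solar peak, so it takes the 2 kWh of green energy available in slot 1 and buys the remaining 1 kWh of brown energy in a cheap slot. The "index" job has a looser bound and runs entirely on green energy in slot 2.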
Index Terms
- GreenHadoop: leveraging green energy in data-processing frameworks