ABSTRACT
Modern systems often have complex configuration spaces, yet research has shown that users often simply keep the default settings. This practice leaves significant performance potential unrealized. In this work, we propose an approach that uses metaheuristic search algorithms to explore the configuration space of Hadoop for high-performing configurations. We present the results of a set of experiments showing that our approach can find configurations that perform significantly better than the defaults. We tested two metaheuristic search algorithms (coordinate descent and genetic algorithms) on three common MapReduce programs (Wordcount, Sort, and Terasort), for a total of six experiments. Our results suggest that metaheuristic search can cost-effectively find configurations that perform significantly better than the baseline default configurations.
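As an illustration only, the coordinate descent idea named in the abstract can be sketched as a search over a small discrete configuration space. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the parameter names, value ranges, and the synthetic cost function below are all hypothetical, and in a real setting the cost would come from running a MapReduce job under the candidate configuration and measuring it.

```python
from typing import Callable, Dict, List

Config = Dict[str, int]


def coordinate_descent(
    space: Dict[str, List[int]],
    start: Config,
    cost: Callable[[Config], float],
    max_sweeps: int = 10,
) -> Config:
    """Vary one parameter at a time, keeping any change that lowers the cost."""
    best = dict(start)
    best_cost = cost(best)
    for _ in range(max_sweeps):
        improved = False
        for name, values in space.items():
            for value in values:
                if value == best[name]:
                    continue
                candidate = dict(best)
                candidate[name] = value
                candidate_cost = cost(candidate)
                if candidate_cost < best_cost:
                    best, best_cost = candidate, candidate_cost
                    improved = True
        if not improved:
            # A full sweep found no better neighbor: local optimum reached.
            break
    return best


# Hypothetical parameters and values, chosen only for illustration.
SPACE = {
    "sort_buffer_mb": [50, 100, 200, 400],
    "reduce_tasks": [1, 2, 4, 8, 16],
}
DEFAULT = {"sort_buffer_mb": 100, "reduce_tasks": 1}


def synthetic_cost(cfg: Config) -> float:
    """Stand-in for a measured job run time (assumption, not real data)."""
    return abs(cfg["sort_buffer_mb"] - 200) / 100 + abs(cfg["reduce_tasks"] - 8)


tuned = coordinate_descent(SPACE, DEFAULT, synthetic_cost)
print(tuned, synthetic_cost(DEFAULT), synthetic_cost(tuned))
```

Each cost evaluation would correspond to one job run in practice, so the number of evaluations per sweep (here, the sum of the value-list lengths) is what drives the cost of the search.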
Index Terms
- Searching for high-performing software configurations with metaheuristic algorithms