Abstract
Recent work has shown that machine learning can automate, and in some cases outperform, hand-crafted compiler optimisations. Central to such an approach is that machine learning techniques typically rely upon summaries, or features, of the program. The quality of these features is critical to the accuracy of the resulting machine-learned algorithm; no machine learning method will work well with poorly chosen features. However, due to the size and complexity of programs, there are, in theory, infinitely many potential features to choose from, and the compiler writer must expend effort choosing the best features from this space. This article develops a novel mechanism that automatically finds the features that most improve the quality of the machine-learned heuristic. The feature space is described by a grammar, which is then searched with genetic programming and predictive modelling. We apply this technique to loop unrolling in GCC 4.3.1 and evaluate our approach on a Pentium 6. On a benchmark suite of 57 programs, GCC's hard-coded heuristic achieves only 3% of the maximum performance available, whereas a state-of-the-art machine learning approach with hand-coded features obtains 59%. Our feature-generation technique achieves 76% of the maximum available speedup, outperforming existing approaches.
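The approach the abstract describes can be sketched in miniature: a grammar defines a space of feature expressions over program properties, random derivations from the grammar produce candidate features, and a search loop keeps the candidates that best predict the optimisation target. The sketch below is a toy illustration only, assuming hypothetical loop properties (`trip_count`, `body_insns`, `mem_refs`), a made-up grammar, and a plain correlation score standing in for the paper's predictive model; it is not the grammar, benchmark data, or genetic-programming operators from the article.

```python
import random

random.seed(0)  # deterministic toy run

# Toy feature grammar over hypothetical loop properties. The property
# names and productions are illustrative assumptions, not those of the paper.
GRAMMAR = {
    "<expr>": [["<num>"], ["(", "<expr>", "<op>", "<expr>", ")"]],
    "<op>":   [["+"], ["-"], ["*"]],
    "<num>":  [["trip_count"], ["body_insns"], ["mem_refs"], ["1"], ["2"]],
}

def derive(symbol="<expr>", depth=0, max_depth=3):
    """Randomly expand a grammar symbol into a feature-expression string."""
    if symbol == "<expr>" and depth >= max_depth:
        rules = [["<num>"]]            # bound recursion with a terminal
    else:
        rules = GRAMMAR.get(symbol)
    if rules is None:                  # terminal token: emit as-is
        return symbol
    return "".join(derive(t, depth + 1, max_depth)
                   for t in random.choice(rules))

# Synthetic training data: loop property vectors plus an assumed
# "best unroll factor" that a good feature should help predict.
LOOPS = [{"trip_count": t,
          "body_insns": random.randint(1, 20),
          "mem_refs": random.randint(0, 8)} for t in range(1, 21)]
TARGET = [min(8, lp["trip_count"]) for lp in LOOPS]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def fitness(feature):
    """Stand-in for the predictive model: |correlation| with the target."""
    values = [eval(feature, {}, dict(lp)) for lp in LOOPS]
    return abs(pearson(values, TARGET))

# Evolutionary search: keep the fittest features, refill with fresh random
# derivations (a crude substitute for genetic-programming crossover/mutation).
population = [derive() for _ in range(50)]
for _ in range(5):
    population.sort(key=fitness, reverse=True)
    population = population[:10] + [derive() for _ in range(40)]

best = max(population, key=fitness)
print(best, fitness(best))
```

In the article's setting, the fitness of a feature would instead be measured by training a predictive model for unroll factors with that feature included and observing the resulting speedup; the grammar-driven derivation and survival-of-the-fittest loop shown here capture only the shape of that search.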
Index Terms
- Automatic feature generation for machine learning-based optimising compilation