research-article

Free Access

Automatic feature generation for machine learning--based optimising compilation

Authors:
Hugh Leather

University of Edinburgh, St, Edinburgh, Midlothian

University of Edinburgh, St, Edinburgh, Midlothian
View Profile

,
Edwin Bonilla

NICTA and Australian National University, Australia

NICTA and Australian National University, Australia
View Profile

,
Michael O'boyle

University of Edinburgh, St, Edinburgh, Midlothian

University of Edinburgh, St, Edinburgh, Midlothian
View Profile

ACM Transactions on Architecture and Code Optimization Volume 11 Issue 1Article No.: 14pp 1–32https://doi.org/10.1145/2536688

Published:01 February 2014Publication History

ACM Transactions on Architecture and Code Optimization

Abstract

Recent work has shown that machine learning can automate and in some cases outperform handcrafted compiler optimisations. Central to such an approach is that machine learning techniques typically rely upon summaries or features of the program. The quality of these features is critical to the accuracy of the resulting machine learned algorithm; no machine learning method will work well with poorly chosen features. However, due to the size and complexity of programs, theoretically there are an infinite number of potential features to choose from. The compiler writer now has to expend effort in choosing the best features from this space. This article develops a novel mechanism to automatically find those features that most improve the quality of the machine learned heuristic. The feature space is described by a grammar and is then searched with genetic programming and predictive modelling. We apply this technique to loop unrolling in GCC 4.3.1 and evaluate our approach on a Pentium 6. On a benchmark suite of 57 programs, GCCs hard-coded heuristic achieves only 3% of the maximum performance available, whereas a state-of-the-art machine learning approach with hand-coded features obtains 59%. Our feature generation technique is able to achieve 76% of the maximum available speedup, outperforming existing approaches.

References

Felix Agakov, Edwin Bonilla, John Cavazos, Bjorn Franke, Grigori Fursin, Michael F. P. O'Boyle, John Thomson, Marc Toussaint, and Christopher K. I. Williams. 2006. Using machine learning to focus iterative optimization. In CGO'06: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, DC, 295--305. http://dx.doi.org/10.1109/CGO.2006.37 Google ScholarDigital Library
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers: Principles, Techniques, and Tools. Boston, MA: Addison-Wesley Longman. Google ScholarDigital Library
Thomas Back. 1996. Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford, UK. Google ScholarDigital Library
Steven J. Beaty. 1991. Genetic algorithms and instruction scheduling. In MICRO 24: Proceedings of the 24th Annual International Symposium on Microarchitecture. ACM Press, New York, NY, 206--211. http://dx.doi.org/10.1145/123465.123507 Google ScholarDigital Library
John Cavazos and J. Eliot B. Moss. 2004. Inducing heuristics to decide whether to schedule. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI'04). ACM, New York, NY, 183--194. http://dx.doi.org/10.1145/996841.996864 Google ScholarDigital Library
John Cavazos and Michael F. P. O'Boyle. 2006. Method-specific dynamic compilation using logistic regression. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'06). ACM, New York, NY, 229--240. http://dx.doi.org/10.1145/1167473.1167492 Google ScholarDigital Library
Keith D. Cooper, Alexander Grosul, Timothy J. Harvey, Steven Reeves, Devika Subramanian, Linda Torczon, and Todd Waterman. 2005. ACME: Adaptive compilation made efficient. In Proceedings of the 2005 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'05). ACM, New York, NY, 69--77. http://dx.doi.org/10.1145/1065910.1065921 Google ScholarDigital Library
Keith D. Cooper, Philip J. Schielke, and Devika Subramanian. 1999. Optimizing for reduced code space using genetic algorithms. In Proceedings of the ACM SIGPLAN 1999 Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES'99). ACM, New York, NY, 1--9. Google ScholarDigital Library
Keith D. Cooper, Devika Subramanian, and Linda Torczon. 2002. Adaptive optimizing compilers for the 21st century. Journal of Supercomputing 23, 1, 7--22. Google ScholarDigital Library
Christophe Dubach, Timothy M. Jones, Edwin V. Bonilla, Grigori Fursin, and Michael F. P. O'Boyle. 2009. Portable compiler optimisation across embedded programs and microarchitectures using machine learning. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (Micro-42). ACM, 78--88. http://dx.doi.org/10.1145/1669112.1669124 Google ScholarDigital Library
Ron Kohavi and George H. John. 1997. Wrappers for feature subset selection. Artificial Intelligence 97, 1--2, 273--324. citeseer.ist.psu.edu/article/kohavi97wrappers.html. Google ScholarDigital Library
John R. Koza. 1990. Evolution and co-evolution of computer programs to control independent-acting agents. In From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, 24-28, September 1990, Jean-Arcady Meyer and Stewart W. Wilson (Eds.). MIT Press, Paris, France, 366--375. http://citeseer.ist.psu.edu/koza91evolution.html. Google ScholarDigital Library
Prasad Kulkarni, Stephen Hines, Jason Hiser, David Whalley, Jack Davidson, and Douglas Jones. 2004. Fast searches for effective optimization phase sequences. In PLDI'04: Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation. ACM, New York, NY, 171--182. http://dx.doi.org/10.1145/996841.996863 Google ScholarDigital Library
Hugh Leather, Edwin Bonilla, and Michael O'Boyle. 2009. Automatic feature generation for machine learning based optimizing compilation. In CGO'09: Proceedings of the International Symposium on Code Generation and Optimization. Google ScholarDigital Library
Amy McGovern and Eliot Moss. 1999. Scheduling straight-line code using reinforcement learning and rollouts. In Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems II. MIT Press, Cambridge, MA, 903--909. http://dl.acm.org/citation.cfm&quest;id=340534.340836 Google ScholarDigital Library
Antoine Monsifrot, Francois Bodin, and Rene Quiniou. 2002. A machine learning approach to automatic production of compiler heuristics. In Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications (AIMSA'02). Springer-Verlag, London, UK, 41--50. http://dl.acm.org/citation.cfm&quest;id=646053.677574 Google ScholarDigital Library
Zhelong Pan and Rudolf Eigenmann. 2006. Fast and effective orchestration of compiler optimizations for automatic performance tuning. In CGO'06: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, DC, 319--332. http://dx.doi.org/10.1109/CGO.2006.38 Google ScholarDigital Library
J. Ross Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco, CA. Google ScholarDigital Library
Conor Ryan, J. J. Collins, and Michael O Neill. 1998. Grammatical evolution: Evolving programs for an arbitrary language. In Proceedings of the 1st European Workshop on Genetic Programming, Wolfgang Banzhaf, Riccardo Poli, Marc Schoenauer, and Terence C. Fogarty (Eds.), Vol. 1391. Springer-Verlag, Paris, 83--95. http://citeseer.ist.psu.edu/ryan98grammatical.html. Google ScholarDigital Library
Bernhard Schölkopf and Alexandra J. Smola. 2001. Learning with Kernels: Support Vector Machines Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA. Google ScholarDigital Library
Mark Stephenson and Saman Amarasinghe. 2005. Predicting unroll factors using supervised classification. In CGO'05: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, DC, 123--134. http://dx.doi.org/10.1109/CGO.2005.29 Google ScholarDigital Library
Mark Stephenson, Saman Amarasinghe, Martin Martin, and Una-May O'Reilly. 2003. Meta optimization: Improving compiler heuristics with machine learning. In Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation (PLDI'03). ACM, New York, NY, 77--90. http://dx.doi.org/10.1145/781131.781141 Google ScholarDigital Library
Spyridon Triantafyllis, Manish Vachharajani, Neil Vachharajani, and David I. August. 2003. Compiler optimization-space exploration. - IEEE Journal of Instruction-Level Parallelism 7, 204--215. Google ScholarDigital Library
Zheng Wang and Michael F. P. O'Boyle. 2009. Mapping parallelism to multi-cores: A machine learning based approach. In Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'09). ACM, New York, NY, 75--84. http://dx.doi.org/10.1145/1504176.1504189 Google ScholarDigital Library
Kamen Yotov, Xiaoming Li, Gang Ren, Michael Cibulskis, Gerald DeJong, Maria Garzaran, David Padua, Keshav Pingali, Paul Stodghill, and Peng Wu. 2003. A comparison of empirical and model-driven optimization. In PLDI'03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation. ACM, New York, NY, 63--76. http://dx.doi.org/10.1145/781131.781140 Google ScholarDigital Library

Index Terms

Automatic feature generation for machine learning--based optimising compilation
1. Software and its engineering
  1. Software notations and tools
    1. Compilers

Recommendations

Automatic Feature Generation for Machine Learning Based Optimizing Compilation
CGO '09: Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization

Recent work has shown that machine learning can automate and in some cases outperform hand crafted compiler optimizations. Central to such an approach is that machine learning techniques typically rely upon summaries or features of the program. The ...
Read More
Classifier design with feature selection and feature extraction using layered genetic programming

This paper proposes a novel method called FLGP to construct a classifier device of capability in feature selection and feature extraction. FLGP is developed with layered genetic programming that is a kind of the multiple-population genetic programming. ...
Read More
Automatic Feature Engineering by Deep Reinforcement Learning
AAMAS '19: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems

We present a framework calledLearning Automatic Feature Engineering Machine (LAFEM), which formalizes theFeature Engineering (FE) problem as an optimization problem over aHeterogeneous Transformation Graph (HTG). We propose a Deep Q-learning on HTG to ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Architecture and Code Optimization Volume 11, Issue 1
February 2014
373 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2591460
Issue’s Table of Contents

Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 February 2014
- Accepted: 1 June 2009
- Revised: 1 March 2009
- Received: 1 February 2007
Published in taco Volume 11, Issue 1

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Feature generation
genetic programming
program optimisation
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 21
  Total Citations
  View Citations
- 1,227
  Total Downloads
- Downloads (Last 12 months)239
- Downloads (Last 6 weeks)51
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Automatic feature generation for machine learning--based optimising compilation

ACM Transactions on Architecture and Code Optimization

Abstract

References

Cited By

Index Terms

Recommendations

Automatic Feature Generation for Machine Learning Based Optimizing Compilation

Classifier design with feature selection and feature extraction using layered genetic programming

Automatic Feature Engineering by Deep Reinforcement Learning

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Automatic feature generation for machine learning--based optimising compilation

ACM Transactions on Architecture and Code Optimization

Abstract

References

Cited By

Index Terms

Recommendations

Automatic Feature Generation for Machine Learning Based Optimizing Compilation

Classifier design with feature selection and feature extraction using layered genetic programming

Automatic Feature Engineering by Deep Reinforcement Learning

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media