skip to main content
research-article
Free Access

Automatic feature generation for machine learning--based optimising compilation

Published:01 February 2014Publication History
Skip Abstract Section

Abstract

Recent work has shown that machine learning can automate and in some cases outperform handcrafted compiler optimisations. Central to such an approach is that machine learning techniques typically rely upon summaries or features of the program. The quality of these features is critical to the accuracy of the resulting machine learned algorithm; no machine learning method will work well with poorly chosen features. However, due to the size and complexity of programs, theoretically there are an infinite number of potential features to choose from. The compiler writer now has to expend effort in choosing the best features from this space. This article develops a novel mechanism to automatically find those features that most improve the quality of the machine learned heuristic. The feature space is described by a grammar and is then searched with genetic programming and predictive modelling. We apply this technique to loop unrolling in GCC 4.3.1 and evaluate our approach on a Pentium 6. On a benchmark suite of 57 programs, GCCs hard-coded heuristic achieves only 3% of the maximum performance available, whereas a state-of-the-art machine learning approach with hand-coded features obtains 59%. Our feature generation technique is able to achieve 76% of the maximum available speedup, outperforming existing approaches.

References

  1. Felix Agakov, Edwin Bonilla, John Cavazos, Bjorn Franke, Grigori Fursin, Michael F. P. O'Boyle, John Thomson, Marc Toussaint, and Christopher K. I. Williams. 2006. Using machine learning to focus iterative optimization. In CGO'06: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, DC, 295--305. http://dx.doi.org/10.1109/CGO.2006.37 Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers: Principles, Techniques, and Tools. Boston, MA: Addison-Wesley Longman. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Thomas Back. 1996. Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford, UK. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Steven J. Beaty. 1991. Genetic algorithms and instruction scheduling. In MICRO 24: Proceedings of the 24th Annual International Symposium on Microarchitecture. ACM Press, New York, NY, 206--211. http://dx.doi.org/10.1145/123465.123507 Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. John Cavazos and J. Eliot B. Moss. 2004. Inducing heuristics to decide whether to schedule. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI'04). ACM, New York, NY, 183--194. http://dx.doi.org/10.1145/996841.996864 Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. John Cavazos and Michael F. P. O'Boyle. 2006. Method-specific dynamic compilation using logistic regression. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'06). ACM, New York, NY, 229--240. http://dx.doi.org/10.1145/1167473.1167492 Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Keith D. Cooper, Alexander Grosul, Timothy J. Harvey, Steven Reeves, Devika Subramanian, Linda Torczon, and Todd Waterman. 2005. ACME: Adaptive compilation made efficient. In Proceedings of the 2005 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'05). ACM, New York, NY, 69--77. http://dx.doi.org/10.1145/1065910.1065921 Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Keith D. Cooper, Philip J. Schielke, and Devika Subramanian. 1999. Optimizing for reduced code space using genetic algorithms. In Proceedings of the ACM SIGPLAN 1999 Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES'99). ACM, New York, NY, 1--9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Keith D. Cooper, Devika Subramanian, and Linda Torczon. 2002. Adaptive optimizing compilers for the 21st century. Journal of Supercomputing 23, 1, 7--22. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Christophe Dubach, Timothy M. Jones, Edwin V. Bonilla, Grigori Fursin, and Michael F. P. O'Boyle. 2009. Portable compiler optimisation across embedded programs and microarchitectures using machine learning. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (Micro-42). ACM, 78--88. http://dx.doi.org/10.1145/1669112.1669124 Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Ron Kohavi and George H. John. 1997. Wrappers for feature subset selection. Artificial Intelligence 97, 1--2, 273--324. citeseer.ist.psu.edu/article/kohavi97wrappers.html. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. John R. Koza. 1990. Evolution and co-evolution of computer programs to control independent-acting agents. In From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, 24-28, September 1990, Jean-Arcady Meyer and Stewart W. Wilson (Eds.). MIT Press, Paris, France, 366--375. http://citeseer.ist.psu.edu/koza91evolution.html. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Prasad Kulkarni, Stephen Hines, Jason Hiser, David Whalley, Jack Davidson, and Douglas Jones. 2004. Fast searches for effective optimization phase sequences. In PLDI'04: Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation. ACM, New York, NY, 171--182. http://dx.doi.org/10.1145/996841.996863 Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Hugh Leather, Edwin Bonilla, and Michael O'Boyle. 2009. Automatic feature generation for machine learning based optimizing compilation. In CGO'09: Proceedings of the International Symposium on Code Generation and Optimization. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Amy McGovern and Eliot Moss. 1999. Scheduling straight-line code using reinforcement learning and rollouts. In Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems II. MIT Press, Cambridge, MA, 903--909. http://dl.acm.org/citation.cfm?id=340534.340836 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Antoine Monsifrot, Francois Bodin, and Rene Quiniou. 2002. A machine learning approach to automatic production of compiler heuristics. In Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications (AIMSA'02). Springer-Verlag, London, UK, 41--50. http://dl.acm.org/citation.cfm?id=646053.677574 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Zhelong Pan and Rudolf Eigenmann. 2006. Fast and effective orchestration of compiler optimizations for automatic performance tuning. In CGO'06: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, DC, 319--332. http://dx.doi.org/10.1109/CGO.2006.38 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Ross Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Conor Ryan, J. J. Collins, and Michael O Neill. 1998. Grammatical evolution: Evolving programs for an arbitrary language. In Proceedings of the 1st European Workshop on Genetic Programming, Wolfgang Banzhaf, Riccardo Poli, Marc Schoenauer, and Terence C. Fogarty (Eds.), Vol. 1391. Springer-Verlag, Paris, 83--95. http://citeseer.ist.psu.edu/ryan98grammatical.html. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Bernhard Schölkopf and Alexandra J. Smola. 2001. Learning with Kernels: Support Vector Machines Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Mark Stephenson and Saman Amarasinghe. 2005. Predicting unroll factors using supervised classification. In CGO'05: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, DC, 123--134. http://dx.doi.org/10.1109/CGO.2005.29 Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Mark Stephenson, Saman Amarasinghe, Martin Martin, and Una-May O'Reilly. 2003. Meta optimization: Improving compiler heuristics with machine learning. In Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation (PLDI'03). ACM, New York, NY, 77--90. http://dx.doi.org/10.1145/781131.781141 Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Spyridon Triantafyllis, Manish Vachharajani, Neil Vachharajani, and David I. August. 2003. Compiler optimization-space exploration. - IEEE Journal of Instruction-Level Parallelism 7, 204--215. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Zheng Wang and Michael F. P. O'Boyle. 2009. Mapping parallelism to multi-cores: A machine learning based approach. In Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'09). ACM, New York, NY, 75--84. http://dx.doi.org/10.1145/1504176.1504189 Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Kamen Yotov, Xiaoming Li, Gang Ren, Michael Cibulskis, Gerald DeJong, Maria Garzaran, David Padua, Keshav Pingali, Paul Stodghill, and Peng Wu. 2003. A comparison of empirical and model-driven optimization. In PLDI'03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation. ACM, New York, NY, 63--76. http://dx.doi.org/10.1145/781131.781140 Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Automatic feature generation for machine learning--based optimising compilation

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM Transactions on Architecture and Code Optimization
      ACM Transactions on Architecture and Code Optimization  Volume 11, Issue 1
      February 2014
      373 pages
      ISSN:1544-3566
      EISSN:1544-3973
      DOI:10.1145/2591460
      Issue’s Table of Contents

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 February 2014
      • Accepted: 1 June 2009
      • Revised: 1 March 2009
      • Received: 1 February 2007
      Published in taco Volume 11, Issue 1

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article
      • Research
      • Refereed

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader