skip to main content
article

A framework for unrestricted whole-program optimization

Published: 11 June 2006 Publication History

Abstract

Procedures have long been the basic units of compilation in conventional optimization frameworks. However, procedures are typically formed to serve software engineering rather than optimization goals, arbitrarily constraining code transformations. Techniques, such as aggressive inlining and interprocedural optimization, have been developed to alleviate this problem, but, due to code growth and compile time issues, these can be applied only sparingly.This paper introduces the Procedure Boundary Elimination (PBE) compilation framework, which allows unrestricted whole-program optimization. PBE allows all intra-procedural optimizations and analyses to operate on arbitrary subgraphs of the program, regardless of the original procedure boundaries and without resorting to inlining. In order to control compilation time, PBE also introduces novel extensions of region formation and encapsulation. PBE enables targeted code specialization, which recovers the specialization benefits of inlining while keeping code growth in check. This paper shows that PBE attains better performance than inlining with half the code growth.

References

[1]
August, D. I., Connors, D. A., Mahlke, S. A., Sias, J. W., Crozier, K. M., Cheng, B., Eaton, P. R., Olaniran, Q. B., and Hwu, W. W. Integrated predication and speculative execution in the IMPACT EPIC architecture. In Proceedings of the 25th International Symposium on Computer Architecture (June 1998), pp. 227--237.
[2]
Ayers, A., Schooler, R., and Gottlieb, R. Aggressive inlining. In ACM SIGPLAN '97 Conference on Programming Language Design and Implementation (June 1997), pp. 134--145.
[3]
Bodik, R., Gupta, R., and Soffa, M. L. Complete removal of redundant computation. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation (June 1998), pp. 1--14.
[4]
Callahan, D., Cooper, K. D., Kennedy, K., and Torczon, L. Interprocedural constant propagation. In Proceedings of the ACM SIGPLAN'86 Symposium on Compiler Construction (July 1986), pp. 152--161.
[5]
Chang, P. P., Mahlke, S. A., Chen, W. Y., and Hwu, W. W. Profile-guided automatic inline expansion for C programs. Software Practice and Experience 22, 5 (May 1992), 349--370.
[6]
Eichenberger, A., Meleis, W., and Maradani, S. An integrated approach to accelerate data and predicate computations in hyperblocks. In Proceedings of the 33rd Annual ACM/IEEE International Symposium on Microarchitecture (November 2000), pp. 101--111.
[7]
Eranian, S. Perfmon: L inux performance monitoring for IA-64. http://www.hpl.hp.com/research/linux/perfmon/, 2003.
[8]
Goubault, J. Generalized boxings, congruences and partial inlining. In First International Static Analysis Symposium (Namur, Belgium, September 1994).
[9]
Hank, R. E., Hwu, W. W., and Rau, B. R. Region-based compilation: An introduction and motivation. In Proceedings of the 28th Annual International Symposium on Microarchitecture (December 1995), pp. 158--168.
[10]
Havanki, W. A. Treegion scheduling for VLIW processors. Master's thesis, Department of Computer Science, North Carolina State University, 1997.
[11]
Hwu, W. W., and Chang, P. P. Inline function expansion for compiling realistic C programs. In Proceedings of the ACM SIGPLAN 1989 Conference on Programming Language Design and Implementation (June 1989), pp. 246--257.
[12]
Hwu, W. W., Mahlke, S. A., Chen, W. Y., Chang, P. P., Warter, N. J., Bringmann, R. A., Ouellette, R. G., Hank, R. E., Kiyohara, T., Haab, G. E., Holm, J. G., and Lavery, D. M. The superblock: An effective technique for VLIW and superscalar compilation. The Journal of Supercomputing 7, 1 (January 1993), 229--248.
[13]
Knoop, J., and Steffen, B. The interprocedural coincidence theorem. In Proceedings of the 4th International Conference on Compiler Construction (Paderborn, Germany, October 1992), pp. 125--140.
[14]
Mahlke, S. A., Lin, D. C., Chen, W. Y., Hank, R. E., Bringmann, R. A., and Hwu, W. W. Effective compiler support for predicated execution using the hyperblock. In Proceedings of the 25th International Symposium on Microarchitecture (December 1992), pp. 45--54.
[15]
Myers, E. W. A precise inter-procedural data flow algorithm. In Proceedings of the 8th ACM symposium on Principles of programming languages (Jan. 1981), pp. 219--230.
[16]
Nystrom, E. M., Kim, H.-S., and Hwu, W.-M. Bottom-up and top-down context-sensitive summary-based pointer analysis. In Proceedings of the 11th Static Analysis Symposium (August 2004).
[17]
Reps, T., Horwitz, S., and Sagiv, M. Precise interprocedural dataflow analysis via graph reachability. In Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages (June 1995), pp. 49--61.
[18]
Santhanam, V., and Odnert, D. Register allocation across procedure and module boundaries. In Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation (June 1990), pp. 28--39.
[19]
Sharir, M., and Pnueli, A. Two approaches to interprocedural data flow analysis. In Program Flow Analysis: Theory and Applications, S. Muchnick and N. Jones, Eds. Prentice-Hall, Englewood Cliffs, NJ, 1981, pp. 189--233.
[20]
Suganuma, T., Yasue, T., and Nakatani, T. A region-based compilation technique for a java just-in-time compiler. In Proceedings of the ACM SIGPLAN 2003 conference on Programming Language Design and Implementation (June 2003), pp. 312--323.
[21]
Way, T., Breech, B., and Pollock, L. Region formation analysis with demand-driven inlining for region-based optimization. In Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (May 2000), p. 24.

Cited By

View all
  • (2021)Enhancing the Effectiveness of Inlining in Automatic ParallelizationInternational Journal of Parallel Programming10.1007/s10766-021-00722-1Online publication date: 6-Aug-2021
  • (2019)An optimization-driven incremental inline substitution algorithm for just-in-time compilersProceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization10.5555/3314872.3314893(164-179)Online publication date: 16-Feb-2019
  • (2019)An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)10.1109/CGO.2019.8661171(164-179)Online publication date: Feb-2019
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices
ACM SIGPLAN Notices  Volume 41, Issue 6
Proceedings of the 2006 PLDI Conference
June 2006
426 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1133255
Issue’s Table of Contents
  • cover image ACM Conferences
    PLDI '06: Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2006
    438 pages
    ISBN:1595933204
    DOI:10.1145/1133981
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 June 2006
Published in SIGPLAN Volume 41, Issue 6

Check for updates

Author Tags

  1. inlining
  2. interprocedural analysis
  3. interprocedural optimization
  4. path-sensitive analysis
  5. procedure unification
  6. region encapsulation
  7. region formation
  8. region-based compilation
  9. specialization
  10. superblock
  11. whole-program analysis
  12. whole-program optimization

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)1
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2021)Enhancing the Effectiveness of Inlining in Automatic ParallelizationInternational Journal of Parallel Programming10.1007/s10766-021-00722-1Online publication date: 6-Aug-2021
  • (2019)An optimization-driven incremental inline substitution algorithm for just-in-time compilersProceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization10.5555/3314872.3314893(164-179)Online publication date: 16-Feb-2019
  • (2019)An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)10.1109/CGO.2019.8661171(164-179)Online publication date: Feb-2019
  • (2018)HHVM JIT: a profile-guided, region-based compiler for PHP and HackACM SIGPLAN Notices10.1145/3296979.319237453:4(151-165)Online publication date: 11-Jun-2018
  • (2018)HHVM JIT: a profile-guided, region-based compiler for PHP and HackProceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3192366.3192374(151-165)Online publication date: 11-Jun-2018
  • (2017)Composite Software Diversification2017 IEEE International Conference on Software Maintenance and Evolution (ICSME)10.1109/ICSME.2017.61(284-294)Online publication date: Sep-2017
  • (2014)Region-Based Selective Flow-Sensitive Pointer AnalysisStatic Analysis10.1007/978-3-319-10936-7_20(319-336)Online publication date: 2014
  • (2012)Approximate graph clustering for program characterizationACM Transactions on Architecture and Code Optimization10.1145/2086696.20867008:4(1-21)Online publication date: 26-Jan-2012
  • (2011)Enhancing the Role of Inlining in Effective Interprocedural ParallelizationProceedings of the 2011 International Conference on Parallel Processing10.1109/ICPP.2011.68(265-274)Online publication date: 13-Sep-2011
  • (2010)An optimal linear-time algorithm for interprocedural register allocation in high level synthesis using SSA formIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2010.204906029:7(1096-1109)Online publication date: 1-Jul-2010
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media