A compiler framework for speculative optimizations

Published: 01 September 2004

Abstract

Speculative execution, such as control speculation or data speculation, is an effective way to improve program performance. Using edge/path profile information or simple heuristic rules, existing compiler frameworks can adequately incorporate and exploit control speculation. However, very little has been done so far to allow existing compiler frameworks to incorporate and exploit data speculation effectively in program transformations beyond instruction scheduling. This paper proposes a speculative static single assignment (SSA) form that incorporates information from alias profiling and/or heuristic rules for data speculation, thus allowing existing frameworks to be extended to support both control and data speculation. Such a general framework is very useful for EPIC architectures that provide run-time checking of data speculation (such as the advanced load address table) to guarantee the correctness of program execution. We use SSAPRE as one example to illustrate how to incorporate data speculation in partial redundancy elimination, register promotion, and strength reduction. Our extended framework allows both control and data speculation to be performed on top of SSAPRE and thus enables more aggressive speculative optimizations. The proposed framework has been implemented on Intel's Open Research Compiler. We present experimental data on some SPEC2000 benchmark programs to demonstrate the usefulness of this framework.
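
To make the idea of data speculation with run-time checking concrete, here is a minimal Python sketch of speculative register promotion. It is an illustration only and not the paper's algorithm or the SSAPRE implementation: the `Alat` class, the `baseline` and `speculative` functions, and the register name `"r1"` are all hypothetical names introduced here. The toy table only mimics the behavior the abstract describes, where a speculatively loaded value is kept in a register and a store to the same address invalidates it, forcing a reload.

```python
class Alat:
    """Toy stand-in for an advanced load address table (hypothetical model).

    Maps a register name to the address it was speculatively loaded from.
    """

    def __init__(self):
        self.entries = {}

    def advanced_load(self, reg, addr):
        # Record that `reg` holds the value loaded from `addr`.
        self.entries[reg] = addr

    def store(self, addr):
        # A store drops every entry whose address matches the stored address.
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def check(self, reg):
        # True if the speculative value in `reg` is still valid.
        return reg in self.entries


def baseline(mem, p, q, n):
    """Conservative code: mem[q] may alias mem[p], so reload mem[p] each time."""
    total, loads = 0, 0
    for i in range(n):
        mem[q] = i
        total += mem[p]
        loads += 1
    return total, loads


def speculative(mem, p, q, n):
    """Promote mem[p] to a register, assuming stores to mem[q] rarely alias it."""
    alat = Alat()
    reg = mem[p]                    # speculative load hoisted out of the loop
    loads = 1
    alat.advanced_load("r1", p)
    total = 0
    for i in range(n):
        mem[q] = i
        alat.store(q)               # the table observes every store
        if not alat.check("r1"):    # mis-speculation: recover by reloading
            reg = mem[p]
            loads += 1
            alat.advanced_load("r1", p)
        total += reg
    return total, loads
```

With `mem = [5, 0]`, `p = 0`, `q = 1`, and `n = 4`, both versions return a total of 20, but the speculative version performs 1 load instead of 4. When `p` and `q` are the same address, both still return the same total (6), and the speculative version pays for a reload on every iteration (5 loads instead of 4). That trade-off is why the decision to speculate is guided by alias profiles or heuristics: correctness is preserved either way, but the benefit depends on aliasing being rare.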

    Published In

    ACM Transactions on Architecture and Code Optimization, Volume 1, Issue 3
    September 2004
    121 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/1022969

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Data speculation
    2. partial redundancy elimination
    3. register promotion
    4. speculative SSA form
    5. speculative weak update
