research-article

Free Access

A decoupled non-SSA global register allocation using bipartite liveness graphs

Authors:
Rajkishore Barik

Intel Labs, CA, USA

Intel Labs, CA, USA
View Profile

,
Jisheng Zhao

Rice University, TX, USA

Rice University, TX, USA
View Profile

,
Vivek Sarkar

Rice University, TX, USA

Rice University, TX, USA
View Profile

ACM Transactions on Architecture and Code Optimization Volume 10 Issue 4Article No.: 63pp 1–24https://doi.org/10.1145/2544101

Published:01 December 2013Publication History

ACM Transactions on Architecture and Code Optimization

Abstract

Register allocation is an essential optimization for all compilers. A number of sophisticated register allocation algorithms have been developed over the years. The two fundamental classes of register allocation algorithms used in modern compilers are based on Graph Coloring (GC) and Linear Scan (LS). However, these two algorithms have fundamental limitations in terms of precision. For example, the key data structure used in GC-based algorithms, the interference graph, lacks information on the program points at which two variables may interfere. The LS-based algorithms make local decisions regarding spilling, and thereby trade off global optimization for reduced compile-time and space overheads. Recently, researchers have proposed Static Single Assignment (SSA)-based decoupled register allocation algorithms that exploit the live-range split points of the SSA representation to optimally solve the spilling problem. However, SSA-based register allocation often requires extra complexity in repairing register assignments during SSA elimination and in addressing architectural constraints such as aliasing and ABI encoding; this extra overhead can be prohibitively expensive in dynamic compilation contexts.

This article proposes a decoupled non-SSA--based global register allocation algorithm for dynamic compilation. It addresses the limitations in current algorithms by introducing a Bipartite Liveness Graph (BLG)-based register allocation algorithm that models the spilling phase as an optimization problem on the BLG itself and the assignment phase as a separate optimization problem. Advanced register allocation optimizations such as move coalescing, live-range splitting, and register class handling are also performed along with the spilling and assignment phases. In the presence of register classes, we propose a bucket-based greedy heuristic for assignment that strikes a balance between spill-cost and register class constraints. We present experimental evaluation of our BLG-based register allocation algorithm and compare it with production-quality register allocators in Jikes RVM and LLVM.

References

Appel, A. W. and George, L. 2001. Optimal spilling for CISC machines with few registers. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation (PLDI’01). ACM, New York, NY, 243--253. Google ScholarDigital Library
Bergner, P., Dahl, P., Engebretsen, D., and K., O’Keefe, M. 1997. Spill code minimization via interference region spilling. SIGPLAN Not. 32, 5, 287--295. Google ScholarDigital Library
Blackburn, S. M., Garner, R., Hoffman, C., Khan, A. M., McKinley, K. S., Bentzur, R., Diwan, A., Feinberg, D., Frampton, D., Guyer, S. Z., Hirzel, M., Hosking, A., Jump, M., Lee, H., Moss, J. E. B., Phansalkar, A., Stefanović, D., VanDrunen, T., von Dincklage, D., and Wiedermann, B. 2006. The DaCapo benchmarks: Java benchmarking development and analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programing, Systems, Languages, and Applications (OOPSLA’06). ACM Press, New York, NY, 169--190. Google ScholarDigital Library
Boissinot, B., Darte, A., Rastello, F., de Dinechin, B. D., and Guillon, C. 2009. Revisiting out-of-SSA translation for correctness, code quality and efficiency. In Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’09). IEEE Computer Society, Washington, DC, 114--125. Google ScholarDigital Library
Bouchez, F. 2009. A Study of Spilling and Coalescing in Register Allocation as Two Separate Phases. PhD Dissertation.Google Scholar
Bouchez, F., Darte, A., and Rastello, F. 2007. On the complexity of register coalescing. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’07). IEEE Computer Society, Washington, DC, 102--114. Google ScholarDigital Library
Braun, M., Mallon, C., and Hack, S. 2010. Preference-guided register assignment. In Compiler Construction 2010. Lecture Notes in Computer Science, vol. 6011. Springer, 205--223. Google ScholarDigital Library
Briggs, P., Cooper, K. D., Kennedy, K., and Torczon, L. 1989. Coloring heuristics for register allocation. In Proceedings of the 1989 SIGPLAN Conference on Programming Language Design and Implementation. 24, 7, 275--284. Google ScholarDigital Library
Briggs, P., Cooper, K. D., and Torczon, L. 1992. Rematerialization, In Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation (PLDI’92). Google ScholarDigital Library
Briggs, P., Cooper, K. D., and Torczon, L. 1994. Improvements to graph coloring register allocation. ACM Transactions on Programming Languages and Systems 16, 3, 428--455. Google ScholarDigital Library
Brisk, P. 2006. Advances in Static Single Assignment Form and Register Allocation. PhD Dissertation. Los Angeles, CA. Google ScholarDigital Library
Brisk, P., Dabiri, F., Macbeth, J., and Sarrafzadeh, M. 2005. Polynomial time graph coloring register allocation. In Proceedings of the 14th International Workshop on Logic and Synthesis.Google Scholar
Budimlic, Z., Cooper, K. D., Harvey, T. J., Kennedy, K., Oberg, T. S., and Reeves, S. W. 2002. Fast copy coalescing and live-range identification. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI’02). ACM, New York, NY, 25--32. Google ScholarDigital Library
Callahan, D. and Koblenz, B. 1991. Register allocation via hierarchical graph coloring. In Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation (PLDI’91). ACM Press, New York, NY, 192--203. Google ScholarDigital Library
Chaitin, G. J., Auslander, M., Chandra, A., Cocke, J., Hopkins, M., and Markstein, P. 1981. Register allocation via coloring. Computer Languages 6, 47--57. Google ScholarCross Ref
Colombet, Q., Boissinot, B., Brisk, P., Hack, S., and Rastello, F. 2011. Graph-coloring and treescan register allocation using repairing. In Proceedings of the 14th International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES’11). ACM, New York, NY, 45--54. Google ScholarDigital Library
Cooper, K. D. and Dasgupta, A. 2006. Tailoring graph-coloring register allocation for runtime compilation. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’06). IEEE Computer Society, Washington, DC, 39--49. Google ScholarDigital Library
Eisenbeis, C., Lelait, S., and Marmol, B. 1995. The meeting graph: A new model for loop cyclic register allocation. In Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT’95). IFIP Working Group on Algol, Manchester, UK, 264--267. Google ScholarDigital Library
EPCC. 2001. The Java Grande Forum Benchmark Suite. Retrieved from http://www.epcc.ed.ac.uk/javagrande/javag.html.Google Scholar
Evlogimenos, A. 2004. Improvements to Linear Scan Register Allocation. Technical Report. University of California at Urbana-Champaign.Google Scholar
George, L. and Appel, A. W. 1996. Iterated register coalescing. ACM Transactions on Programming Languages and Systems 18, 3, 300--324. Google ScholarDigital Library
Georges, A., Buytaert, D., and Eeckhout, L. 2007. Statistically rigorous Java performance evaluation. In Proceedings of the 22nd Annual ACM SIGPLAN Conference on Object Oriented Programming Systems and Applications (OOPSLA’07). ACM, New York, NY, 57--76. Google ScholarDigital Library
Gupta, R., Soffa, M. L., and Ombres, D. 1994. Efficient register allocation via coloring using clique separators. ACM Trans. Program. Lang. Syst. 16, 3, 370--386. Google ScholarDigital Library
Hack, S. and Goos, G. 2006. Optimal register allocation for SSA-form programs in polynomial time. Inf. Process. Lett. 98, 4, 150--155. Google ScholarDigital Library
Hack, S. and Goos, G. 2008. Copy coalescing by graph recoloring. In Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI’08). ACM, New York, NY, 227--237. Google ScholarDigital Library
Jikes RVM. 2011. Homepage. Retrieved from http://jikesrvm.org/.Google Scholar
Jonathan Lee, K., Palsberg, J., and Pereira, F. M. 2007. Aliased register allocation for straight-line programs is NP-complete. In Automata, Languages and Programming. 680--691. Google ScholarDigital Library
Kotzmann, T., Wimmer, C., Mössenböck, H. E., Rodriguez, T., Russell, K., and Cox, D. 2008. Design of the Java HotSpot#8482;client compiler for Java 6. ACM Trans. Archit. Code Optim. 5, 1, 1--32. Google ScholarDigital Library
Lattner, C. 2009. The LLVM Compiler Infrastructure. Retrieved from http://llvm.org/.Google Scholar
Lueh, G.-Y., Gross, T., and Adl-Tabatabai, A.-R. 2000. Fusion-based register allocation. ACM Trans. Program. Lang. Syst. 22, 431--470. Issue 3. Google ScholarDigital Library
Park, J. and Moon, S.-M. 1998. Optimistic register coalescing. In Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques (PACT’98). J.-L. Gaudiot, Ed., 196--204. Google ScholarDigital Library
Pereira F. M. and Palsberg, J. 2005. Register allocation via coloring of chordal graphs. In Proceedings of the 3rd Asian Symposium on Programming Languages and Systems (APLAS’05). 315--329. Google ScholarDigital Library
Pereira F. M. and Palsberg, J. 2008. Register allocation by puzzle solving. In Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI’08). ACM, New York, NY, 216--226. Google ScholarDigital Library
Pereira F. M. and Palsberg, J. 2009. SSA Elimination after register allocation. In Proceedings of the 18th International Conference on Compiler Construction (CC’09). Springer-Verlag, Berlin, 158--173. Google ScholarDigital Library
Pereira F. M. Q. and Palsberg, J. 2010. Punctual coalescing. In Proceedings of the 19th Joint European Conference on Theory and Practice of Software, International Conference on Compiler Construction (CC’10/ETAPS’10). Springer-Verlag, Berlin, 165--184. Google ScholarDigital Library
Poletto, M. and Sarkar, V. 1999. Linear scan register allocation. ACM Transactions on Programming Languages and Systems 21, 5, 895--913. Google ScholarDigital Library
Ramsey, N. and Holloway, G. 2004. A generalized algorithm for graph-coloring register allocation. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI’04). ACM, New York, NY, 277--288. Google ScholarDigital Library
Rong, H. 2009. Tree register allocation. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’09). ACM, New York, NY, 67--77. Google ScholarDigital Library
Sarkar, V. and Barik, R. 2007. Extended linear scan: An alternate foundation for global register allocation. In Proceedings of the 16th International Conference on Compiler Construction (CC’07). Google ScholarDigital Library
Thammanur, S. and Pande, S. 2004. A fast, memory-efficient register allocation framework for embedded systems. ACM Trans. Program. Lang. Syst. 26, 6, 938--974. Google ScholarDigital Library
The Standard Performance Evaluation Corporation. 2006. SPEC CPU2006 Benchmarks. Retrieved from http://www.spec.org/cpu2006/.Google Scholar
Traub, O., Holloway, G. H., and Smith, M. D. 1998. Quality and speed in linear-scan register allocation. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation (PLDI’98). 142--151. Google ScholarDigital Library
Wimmer, C. and Franz, M. 2010. Linear scan register allocation on SSA form. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’10). ACM, New York, NY, 170--179. Google ScholarDigital Library
Wimmer, C. and Mössenböck, H. 2005. Optimized interval splitting in a linear scan register allocator. In Proceedings of the 1st ACM/USENIX International Conference on Virtual Execution Environments (VEE’05). ACM, New York, NY, 132--141. Google ScholarDigital Library
Zhang, K., Zhang, T., and Pande, S. 2004. Binary translation to improve energy efficiency through post-pass register re-allocation. In Proceedings of the 4th ACM International Conference On Embedded Software (EMSOFT’04). ACM, New York, NY, 74--85. Google ScholarDigital Library

Index Terms

A decoupled non-SSA global register allocation using bipartite liveness graphs
1. Software and its engineering
  1. Software notations and tools
    1. Compilers

Recommendations

Differential register allocation
PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation

Micro-architecture designers are very cautious about expanding the number of architected registers (also the register field), because increasing the register field adds to the code size, raises I-cache and memory pressure, complicates processor ...
Read More
Differential register allocation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation

Micro-architecture designers are very cautious about expanding the number of architected registers (also the register field), because increasing the register field adds to the code size, raises I-cache and memory pressure, complicates processor ...
Read More
SSA-based register allocation with PBQP
CC'11/ETAPS'11: Proceedings of the 20th international conference on Compiler construction: part of the joint European conferences on theory and practice of software

Recent research shows that maintaining SSA form allows to split register allocation into separate phases: spilling, register assignment and copy coalescing. After spilling, register assignment can be done in polynomial time, but copy coalescing is NP-...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Architecture and Code Optimization Volume 10, Issue 4
December 2013
1046 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2541228
Issue’s Table of Contents

Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 December 2013
- Accepted: 1 November 2013
- Revised: 1 September 2013
- Received: 1 July 2013
Published in taco Volume 10, Issue 4

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Decoupled register allocation
basic and compound intervals
bipartite graph
coalesce graph
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 2
  Total Citations
  View Citations
- 448
  Total Downloads
- Downloads (Last 12 months)56
- Downloads (Last 6 weeks)8
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

A decoupled non-SSA global register allocation using bipartite liveness graphs

ACM Transactions on Architecture and Code Optimization

Abstract

References

Cited By

Index Terms

Recommendations

Differential register allocation

Differential register allocation

SSA-based register allocation with PBQP

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

A decoupled non-SSA global register allocation using bipartite liveness graphs

ACM Transactions on Architecture and Code Optimization

Abstract

References

Cited By

Index Terms

Recommendations

Differential register allocation

Differential register allocation

SSA-based register allocation with PBQP

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media