Abstract
Register allocation is an essential optimization for all compilers. A number of sophisticated register allocation algorithms have been developed over the years. The two fundamental classes of register allocation algorithms used in modern compilers are based on Graph Coloring (GC) and Linear Scan (LS). However, these two algorithms have fundamental limitations in terms of precision. For example, the key data structure used in GC-based algorithms, the interference graph, lacks information on the program points at which two variables may interfere. The LS-based algorithms make local decisions regarding spilling, and thereby trade off global optimization for reduced compile-time and space overheads. Recently, researchers have proposed Static Single Assignment (SSA)-based decoupled register allocation algorithms that exploit the live-range split points of the SSA representation to optimally solve the spilling problem. However, SSA-based register allocation often requires extra complexity in repairing register assignments during SSA elimination and in addressing architectural constraints such as aliasing and ABI encoding; this extra overhead can be prohibitively expensive in dynamic compilation contexts.
This article proposes a decoupled non-SSA--based global register allocation algorithm for dynamic compilation. It addresses the limitations in current algorithms by introducing a Bipartite Liveness Graph (BLG)-based register allocation algorithm that models the spilling phase as an optimization problem on the BLG itself and the assignment phase as a separate optimization problem. Advanced register allocation optimizations such as move coalescing, live-range splitting, and register class handling are also performed along with the spilling and assignment phases. In the presence of register classes, we propose a bucket-based greedy heuristic for assignment that strikes a balance between spill-cost and register class constraints. We present experimental evaluation of our BLG-based register allocation algorithm and compare it with production-quality register allocators in Jikes RVM and LLVM.
- Appel, A. W. and George, L. 2001. Optimal spilling for CISC machines with few registers. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation (PLDI’01). ACM, New York, NY, 243--253. Google ScholarDigital Library
- Bergner, P., Dahl, P., Engebretsen, D., and K., O’Keefe, M. 1997. Spill code minimization via interference region spilling. SIGPLAN Not. 32, 5, 287--295. Google ScholarDigital Library
- Blackburn, S. M., Garner, R., Hoffman, C., Khan, A. M., McKinley, K. S., Bentzur, R., Diwan, A., Feinberg, D., Frampton, D., Guyer, S. Z., Hirzel, M., Hosking, A., Jump, M., Lee, H., Moss, J. E. B., Phansalkar, A., Stefanović, D., VanDrunen, T., von Dincklage, D., and Wiedermann, B. 2006. The DaCapo benchmarks: Java benchmarking development and analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programing, Systems, Languages, and Applications (OOPSLA’06). ACM Press, New York, NY, 169--190. Google ScholarDigital Library
- Boissinot, B., Darte, A., Rastello, F., de Dinechin, B. D., and Guillon, C. 2009. Revisiting out-of-SSA translation for correctness, code quality and efficiency. In Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’09). IEEE Computer Society, Washington, DC, 114--125. Google ScholarDigital Library
- Bouchez, F. 2009. A Study of Spilling and Coalescing in Register Allocation as Two Separate Phases. PhD Dissertation.Google Scholar
- Bouchez, F., Darte, A., and Rastello, F. 2007. On the complexity of register coalescing. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’07). IEEE Computer Society, Washington, DC, 102--114. Google ScholarDigital Library
- Braun, M., Mallon, C., and Hack, S. 2010. Preference-guided register assignment. In Compiler Construction 2010. Lecture Notes in Computer Science, vol. 6011. Springer, 205--223. Google ScholarDigital Library
- Briggs, P., Cooper, K. D., Kennedy, K., and Torczon, L. 1989. Coloring heuristics for register allocation. In Proceedings of the 1989 SIGPLAN Conference on Programming Language Design and Implementation. 24, 7, 275--284. Google ScholarDigital Library
- Briggs, P., Cooper, K. D., and Torczon, L. 1992. Rematerialization, In Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation (PLDI’92). Google ScholarDigital Library
- Briggs, P., Cooper, K. D., and Torczon, L. 1994. Improvements to graph coloring register allocation. ACM Transactions on Programming Languages and Systems 16, 3, 428--455. Google ScholarDigital Library
- Brisk, P. 2006. Advances in Static Single Assignment Form and Register Allocation. PhD Dissertation. Los Angeles, CA. Google ScholarDigital Library
- Brisk, P., Dabiri, F., Macbeth, J., and Sarrafzadeh, M. 2005. Polynomial time graph coloring register allocation. In Proceedings of the 14th International Workshop on Logic and Synthesis.Google Scholar
- Budimlic, Z., Cooper, K. D., Harvey, T. J., Kennedy, K., Oberg, T. S., and Reeves, S. W. 2002. Fast copy coalescing and live-range identification. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI’02). ACM, New York, NY, 25--32. Google ScholarDigital Library
- Callahan, D. and Koblenz, B. 1991. Register allocation via hierarchical graph coloring. In Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation (PLDI’91). ACM Press, New York, NY, 192--203. Google ScholarDigital Library
- Chaitin, G. J., Auslander, M., Chandra, A., Cocke, J., Hopkins, M., and Markstein, P. 1981. Register allocation via coloring. Computer Languages 6, 47--57. Google ScholarCross Ref
- Colombet, Q., Boissinot, B., Brisk, P., Hack, S., and Rastello, F. 2011. Graph-coloring and treescan register allocation using repairing. In Proceedings of the 14th International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES’11). ACM, New York, NY, 45--54. Google ScholarDigital Library
- Cooper, K. D. and Dasgupta, A. 2006. Tailoring graph-coloring register allocation for runtime compilation. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’06). IEEE Computer Society, Washington, DC, 39--49. Google ScholarDigital Library
- Eisenbeis, C., Lelait, S., and Marmol, B. 1995. The meeting graph: A new model for loop cyclic register allocation. In Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT’95). IFIP Working Group on Algol, Manchester, UK, 264--267. Google ScholarDigital Library
- EPCC. 2001. The Java Grande Forum Benchmark Suite. Retrieved from http://www.epcc.ed.ac.uk/javagrande/javag.html.Google Scholar
- Evlogimenos, A. 2004. Improvements to Linear Scan Register Allocation. Technical Report. University of California at Urbana-Champaign.Google Scholar
- George, L. and Appel, A. W. 1996. Iterated register coalescing. ACM Transactions on Programming Languages and Systems 18, 3, 300--324. Google ScholarDigital Library
- Georges, A., Buytaert, D., and Eeckhout, L. 2007. Statistically rigorous Java performance evaluation. In Proceedings of the 22nd Annual ACM SIGPLAN Conference on Object Oriented Programming Systems and Applications (OOPSLA’07). ACM, New York, NY, 57--76. Google ScholarDigital Library
- Gupta, R., Soffa, M. L., and Ombres, D. 1994. Efficient register allocation via coloring using clique separators. ACM Trans. Program. Lang. Syst. 16, 3, 370--386. Google ScholarDigital Library
- Hack, S. and Goos, G. 2006. Optimal register allocation for SSA-form programs in polynomial time. Inf. Process. Lett. 98, 4, 150--155. Google ScholarDigital Library
- Hack, S. and Goos, G. 2008. Copy coalescing by graph recoloring. In Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI’08). ACM, New York, NY, 227--237. Google ScholarDigital Library
- Jikes RVM. 2011. Homepage. Retrieved from http://jikesrvm.org/.Google Scholar
- Jonathan Lee, K., Palsberg, J., and Pereira, F. M. 2007. Aliased register allocation for straight-line programs is NP-complete. In Automata, Languages and Programming. 680--691. Google ScholarDigital Library
- Kotzmann, T., Wimmer, C., Mössenböck, H. E., Rodriguez, T., Russell, K., and Cox, D. 2008. Design of the Java HotSpot#8482;client compiler for Java 6. ACM Trans. Archit. Code Optim. 5, 1, 1--32. Google ScholarDigital Library
- Lattner, C. 2009. The LLVM Compiler Infrastructure. Retrieved from http://llvm.org/.Google Scholar
- Lueh, G.-Y., Gross, T., and Adl-Tabatabai, A.-R. 2000. Fusion-based register allocation. ACM Trans. Program. Lang. Syst. 22, 431--470. Issue 3. Google ScholarDigital Library
- Park, J. and Moon, S.-M. 1998. Optimistic register coalescing. In Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques (PACT’98). J.-L. Gaudiot, Ed., 196--204. Google ScholarDigital Library
- Pereira F. M. and Palsberg, J. 2005. Register allocation via coloring of chordal graphs. In Proceedings of the 3rd Asian Symposium on Programming Languages and Systems (APLAS’05). 315--329. Google ScholarDigital Library
- Pereira F. M. and Palsberg, J. 2008. Register allocation by puzzle solving. In Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI’08). ACM, New York, NY, 216--226. Google ScholarDigital Library
- Pereira F. M. and Palsberg, J. 2009. SSA Elimination after register allocation. In Proceedings of the 18th International Conference on Compiler Construction (CC’09). Springer-Verlag, Berlin, 158--173. Google ScholarDigital Library
- Pereira F. M. Q. and Palsberg, J. 2010. Punctual coalescing. In Proceedings of the 19th Joint European Conference on Theory and Practice of Software, International Conference on Compiler Construction (CC’10/ETAPS’10). Springer-Verlag, Berlin, 165--184. Google ScholarDigital Library
- Poletto, M. and Sarkar, V. 1999. Linear scan register allocation. ACM Transactions on Programming Languages and Systems 21, 5, 895--913. Google ScholarDigital Library
- Ramsey, N. and Holloway, G. 2004. A generalized algorithm for graph-coloring register allocation. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI’04). ACM, New York, NY, 277--288. Google ScholarDigital Library
- Rong, H. 2009. Tree register allocation. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’09). ACM, New York, NY, 67--77. Google ScholarDigital Library
- Sarkar, V. and Barik, R. 2007. Extended linear scan: An alternate foundation for global register allocation. In Proceedings of the 16th International Conference on Compiler Construction (CC’07). Google ScholarDigital Library
- Thammanur, S. and Pande, S. 2004. A fast, memory-efficient register allocation framework for embedded systems. ACM Trans. Program. Lang. Syst. 26, 6, 938--974. Google ScholarDigital Library
- The Standard Performance Evaluation Corporation. 2006. SPEC CPU2006 Benchmarks. Retrieved from http://www.spec.org/cpu2006/.Google Scholar
- Traub, O., Holloway, G. H., and Smith, M. D. 1998. Quality and speed in linear-scan register allocation. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation (PLDI’98). 142--151. Google ScholarDigital Library
- Wimmer, C. and Franz, M. 2010. Linear scan register allocation on SSA form. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO’10). ACM, New York, NY, 170--179. Google ScholarDigital Library
- Wimmer, C. and Mössenböck, H. 2005. Optimized interval splitting in a linear scan register allocator. In Proceedings of the 1st ACM/USENIX International Conference on Virtual Execution Environments (VEE’05). ACM, New York, NY, 132--141. Google ScholarDigital Library
- Zhang, K., Zhang, T., and Pande, S. 2004. Binary translation to improve energy efficiency through post-pass register re-allocation. In Proceedings of the 4th ACM International Conference On Embedded Software (EMSOFT’04). ACM, New York, NY, 74--85. Google ScholarDigital Library
Index Terms
- A decoupled non-SSA global register allocation using bipartite liveness graphs
Recommendations
Differential register allocation
PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementationMicro-architecture designers are very cautious about expanding the number of architected registers (also the register field), because increasing the register field adds to the code size, raises I-cache and memory pressure, complicates processor ...
Differential register allocation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementationMicro-architecture designers are very cautious about expanding the number of architected registers (also the register field), because increasing the register field adds to the code size, raises I-cache and memory pressure, complicates processor ...
SSA-based register allocation with PBQP
CC'11/ETAPS'11: Proceedings of the 20th international conference on Compiler construction: part of the joint European conferences on theory and practice of softwareRecent research shows that maintaining SSA form allows to split register allocation into separate phases: spilling, register assignment and copy coalescing. After spilling, register assignment can be done in polynomial time, but copy coalescing is NP-...
Comments