ABSTRACT
Today's microprocessors often provide several different types of registers, e.g., general-purpose registers and floating-point registers. A given program may use one type of register much more frequently than the others. This creates an opportunity to employ the infrequently used registers as spill destinations for the more frequently used register types. In this paper, we present a code optimization method named idle register exploitation (IRE) to exploit such opportunities. We developed a model, called the IRE model, or IREM, to determine the static performance gains of IRE versus spilling to the stack. On a microprocessor with fast data paths between different types of registers, we find that the IRE method speeds up execution of the SPECint benchmark suite by 1.7% to 10%. In contrast, on microprocessors with less efficient data transfer paths, the performance gain is limited, and in some cases performance may even degrade. This result argues strongly for the adoption of fast data paths between different types of registers for the purpose of reducing register spills, which is important in view of the increased significance of memory bottlenecks on future microprocessors.
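The trade-off described in the abstract can be illustrated with a minimal sketch of a static cost comparison. It is not the IREM itself, whose details the abstract does not give. The names `SpillCosts` and `static_gain`, and all latency values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpillCosts:
    """Hypothetical per-access latencies in cycles (not taken from the paper)."""
    stack_store: float   # store a spilled value to the stack
    stack_load: float    # reload a spilled value from the stack
    move_to_idle: float  # move from the busy register class to the idle class
    move_from_idle: float  # move back from the idle class to the busy class


def static_gain(stores: int, loads: int, costs: SpillCosts) -> float:
    """Estimated cycles saved by spilling one value to an idle register
    instead of the stack. A negative result means the idle-register
    spill is slower than the stack spill."""
    stack_cycles = stores * costs.stack_store + loads * costs.stack_load
    idle_cycles = stores * costs.move_to_idle + loads * costs.move_from_idle
    return stack_cycles - idle_cycles


# Assumed machine with a fast path between register classes.
fast_path = SpillCosts(stack_store=3, stack_load=3, move_to_idle=1, move_from_idle=1)
# Assumed machine with a slow path between register classes.
slow_path = SpillCosts(stack_store=3, stack_load=3, move_to_idle=6, move_from_idle=6)

gain_fast = static_gain(stores=2, loads=5, costs=fast_path)
gain_slow = static_gain(stores=2, loads=5, costs=slow_path)
```

With these assumed latencies, the estimate is positive on the fast-path machine and negative on the slow-path one, mirroring the abstract's observation that the benefit depends on the speed of the data paths between register types.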
Index Terms
- Exploiting idle register classes for fast spill destination