ABSTRACT
In the embedded domain, memory usage and energy consumption are critical constraints. Dual-width instruction set embedded processors such as the ARM provide a 16-bit instruction set (Thumb) in addition to the 32-bit instruction set to address these concerns. Using 16-bit instructions, one can achieve code size reduction and I-cache energy savings at the cost of performance. We have observed that throughout 16-bit Thumb code there exist pairs of Thumb instructions that are equivalent to a single ARM instruction. We have developed an approach that uses a combination of compiler and architectural support to exploit this property and improve the performance of 16-bit code. We enhance the Thumb instruction set by incorporating Augmenting eXtensions (AX). The task of the compiler is to identify pairs of Thumb instructions that can be safely combined and executed as a single ARM instruction. The compiler replaces each such pair with an AX+Thumb instruction pair. At decode time, the AX instruction is coalesced with the immediately following Thumb instruction to generate a single ARM instruction. Thus, using AX instructions, the compiler can both generate compact 16-bit code and provide the hardware with the information needed to produce better-performing 32-bit code.
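The compiler and decoder roles described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the encoding or algorithm used in the paper: the `Ins` record, the `ax_shift` opcode name, and the functions `insert_ax` and `decode` are hypothetical. The single pattern handled (a left shift whose result feeds only the next add or subtract, which ARM can express as a shifted second operand) is chosen for illustration, and the model ignores condition flags and real instruction encodings.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Ins:
    """Hypothetical symbolic instruction; not a real Thumb or ARM encoding."""
    op: str
    dst: str
    srcs: Tuple[str, ...]
    imm: Optional[int] = None


def is_dead_after(code: List[Ins], start: int, reg: str, live_out) -> bool:
    """True if `reg` is overwritten before being read in code[start:]."""
    for ins in code[start:]:
        if reg in ins.srcs:
            return False
        if ins.dst == reg:
            return True
    return reg not in live_out


def insert_ax(code: List[Ins], live_out=frozenset()) -> List[Ins]:
    """Compiler side (assumed pattern): replace 'lsl tmp, rm, #n' followed by
    'add/sub rd, rn, tmp' with an AX+Thumb pair when tmp is dead afterwards."""
    out, i = [], 0
    while i < len(code):
        a = code[i]
        b = code[i + 1] if i + 1 < len(code) else None
        safe = (
            b is not None
            and a.op == "lsl" and a.imm is not None
            and b.op in ("add", "sub") and len(b.srcs) == 2
            and b.srcs[1] == a.dst and b.srcs[0] != a.dst
            and (b.dst == a.dst or is_dead_after(code, i + 2, a.dst, live_out))
        )
        if safe:
            # The AX instruction carries the shift source and amount;
            # the instruction count (and hence the 16-bit code size) is unchanged.
            out.append(Ins("ax_shift", "", (a.srcs[0],), a.imm))
            out.append(b)
            i += 2
        else:
            out.append(a)
            i += 1
    return out


def decode(code: List[Ins]) -> List[str]:
    """Decoder side: coalesce each AX with the next instruction into one
    ARM-style instruction, written here as assembly-like text."""
    arm, i = [], 0
    while i < len(code):
        ins = code[i]
        if ins.op == "ax_shift":
            nxt = code[i + 1]
            arm.append(f"{nxt.op} {nxt.dst}, {nxt.srcs[0]}, "
                       f"{ins.srcs[0]}, lsl #{ins.imm}")
            i += 2
        else:
            text = f"{ins.op} {ins.dst}, " + ", ".join(ins.srcs)
            if ins.imm is not None:
                text += f", #{ins.imm}"
            arm.append(text)
            i += 1
    return arm


thumb = [
    Ins("lsl", "r3", ("r2",), 2),
    Ins("add", "r1", ("r1", "r3")),
    Ins("mov", "r3", ("r0",)),
]
augmented = insert_ax(thumb)
print(decode(augmented))  # ['add r1, r1, r2, lsl #2', 'mov r3, r0']
```

In this toy run the three 16-bit instructions remain three 16-bit instructions after `insert_ax`, but the decoder issues only two ARM-style instructions. The liveness check is what makes the replacement safe: if the shifted temporary were read again later, the pair would be left untouched.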
Enhancing the performance of 16-bit code using augmenting instructions
Special Issue: Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool support for embedded systems (San Diego, CA).