Abstract
A simple technique is presented which allows an optimizing compiler to more precisely compare the performance of alternative instruction sequences on a complex RISC architecture so that the better sequence can be chosen. This technique may be faster than current techniques, and has the advantage that minor modifications to the hardware do not require any changes to the compiler (not even recompilation), and yet have an immediate effect on instruction scheduling decisions.
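The abstract does not spell out the mechanism, so the following is only an illustrative sketch under one assumption: that candidate sequences are compared by measuring them directly rather than by consulting a hand-written machine model. The names `measure_seconds` and `choose_sequence` are hypothetical and do not come from the paper, and plain Python callables stand in for instruction sequences.

```python
import time
from typing import Callable, Dict, Tuple


def measure_seconds(run: Callable[[], object], repeats: int = 5) -> float:
    """Return the best-of-`repeats` wall-clock time for one call of `run`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return best


def choose_sequence(
    candidates: Dict[str, Callable[[], object]],
    measure: Callable[[Callable[[], object]], float] = measure_seconds,
) -> Tuple[str, Dict[str, float]]:
    """Pick the candidate with the lowest measured cost.

    ASSUMPTION: cost comes from measurement, not from a machine model
    built into the chooser. `measure` is injectable so a different
    measurement source can be substituted without editing this function.
    """
    if not candidates:
        raise ValueError("at least one candidate sequence is required")
    costs = {name: measure(run) for name, run in candidates.items()}
    best_name = min(costs, key=costs.get)
    return best_name, costs


if __name__ == "__main__":
    # Two stand-in "sequences" that compute the same result differently.
    candidates = {
        "multiply": lambda: [x * 8 for x in range(10_000)],
        "shift": lambda: [x << 3 for x in range(10_000)],
    }
    winner, costs = choose_sequence(candidates)
    print(winner, costs)
```

Because the chooser holds no timing tables of its own, a change in the measured behavior changes the choice without any edit to `choose_sequence`, which is the property the abstract claims for the compiler.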
Index Terms
- Precise instruction scheduling without a precise machine model