Abstract
This paper investigates the limitations on designing a processor that can sustain an execution rate greater than one instruction per cycle on highly optimized, non-scientific applications. We used trace-driven simulations to determine that these applications contain enough instruction independence to sustain a rate of about two instructions per cycle. In a straightforward implementation, cost considerations argue strongly against decoding more than two instructions in one cycle. Given this constraint, the efficiency of instruction fetching, rather than the complexity of the execution hardware, limits the concurrency attainable at the instruction level.
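To make the idea of a trace-driven measurement concrete, the following is a minimal sketch, not the simulator used in the paper. It assumes a toy machine with unit-latency instructions, only true (read-after-write) register dependences, unlimited execution hardware, and an in-order fetch limit of `fetch_width` instructions per cycle. The record format `Instr` and the function name `sustained_ipc` are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical trace record: (destination register or None, source registers).
Instr = Tuple[Optional[str], Sequence[str]]


def sustained_ipc(trace: List[Instr], fetch_width: int) -> float:
    """Instructions per cycle on a toy machine (all assumptions, see lead-in):
    unit latency, read-after-write dependences only, unlimited execution
    units, and at most `fetch_width` instructions fetched per cycle."""
    if fetch_width < 1:
        raise ValueError("fetch_width must be at least 1")
    if not trace:
        return 0.0

    # register -> first cycle in which its most recent value can be read
    ready: Dict[str, int] = {}
    last_issue_cycle = 0

    for index, (dest, sources) in enumerate(trace):
        # In-order fetch: instruction `index` arrives in this cycle at the earliest.
        fetch_cycle = index // fetch_width
        # It issues once it has been fetched and all of its operands are ready.
        issue_cycle = max([fetch_cycle] + [ready.get(r, 0) for r in sources])
        if dest is not None:
            ready[dest] = issue_cycle + 1  # unit latency
        last_issue_cycle = max(last_issue_cycle, issue_cycle)

    total_cycles = last_issue_cycle + 1
    return len(trace) / total_cycles


# Four mutually independent instructions.
independent: List[Instr] = [("r1", []), ("r2", []), ("r3", []), ("r4", [])]
# A serial chain in which each instruction reads the previous result.
chain: List[Instr] = [("r1", []), ("r2", ["r1"]), ("r3", ["r2"]), ("r4", ["r3"])]

ipc_independent_w1 = sustained_ipc(independent, 1)
ipc_independent_w2 = sustained_ipc(independent, 2)
ipc_chain_w4 = sustained_ipc(chain, 4)
```

In this toy model, widening fetch from one to two instructions per cycle doubles the rate on the independent trace, while the serial chain stays at one instruction per cycle regardless of fetch width. A real study would also need to model branches, memory dependences, and finite hardware, none of which this sketch attempts.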
Limits on multiple instruction issue
ASPLOS III: Proceedings of the Third International Conference on Architectural Support for Programming Languages and Operating Systems