DOI: 10.1145/800046.801649 (ACM conferences, ISCA conference proceedings)

Very Long Instruction Word architectures and the ELI-512

Published: 13 June 1983

ABSTRACT

By compiling ordinary scientific applications programs with a radical technique called trace scheduling, we are generating code for a parallel machine that will run these programs faster than an equivalent sequential machine—we expect 10 to 30 times faster.

Trace scheduling generates code for machines called Very Long Instruction Word architectures. In Very Long Instruction Word machines, many statically scheduled, tightly coupled, fine-grained operations execute in parallel within a single instruction stream. VLIWs are more parallel extensions of several current architectures.
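To make "statically scheduled" concrete, here is a minimal sketch of how a compiler might pack independent operations into long instruction words ahead of time. It is a simple greedy list scheduler over straight-line code, not trace scheduling itself, which also moves operations across branches. The function name `pack_into_long_words`, the unit-latency assumption, and the example operations are all hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set


def pack_into_long_words(deps: Dict[str, Set[str]], width: int) -> List[List[str]]:
    """Greedily pack operations into instruction words of at most `width` ops.

    `deps` maps each operation name to the set of operations whose results it
    needs. An operation may be issued only in a word strictly after all of its
    dependencies (unit latency is assumed for this sketch).
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    done: Set[str] = set()
    remaining = list(deps)  # keeps the original program order
    words: List[List[str]] = []
    while remaining:
        # Operations whose inputs were all produced by earlier words.
        ready = [op for op in remaining if deps[op] <= done]
        if not ready:
            raise ValueError("cyclic or unknown dependency")
        word = ready[:width]  # fill the word up to its width
        words.append(word)
        done.update(word)
        remaining = [op for op in remaining if op not in word]
    return words


# Hypothetical straight-line code:
#   a = x + y; b = x * z; c = a - b; d = c * 2; e = z + 1
example = {
    "a": set(),
    "b": set(),
    "c": {"a", "b"},
    "d": {"c"},
    "e": set(),
}

schedule = pack_into_long_words(example, width=4)
for cycle, word in enumerate(schedule):
    print(cycle, word)
```

With a width of 4, the five operations fit into three instruction words, because `c` must wait for `a` and `b`, and `d` must wait for `c`. With a width of 1 the same code takes five words. The dependency chain, not the word width, limits the gain inside a single block of straight-line code.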

These current architectures have never cracked a fundamental barrier. The speedup they get from parallelism is never more than a factor of 2 to 3. Not that we couldn't build more parallel machines of this type; but until trace scheduling we didn't know how to generate code for them. Trace scheduling finds sufficient parallelism in ordinary code to justify thinking about a highly parallel VLIW.

At Yale we are actually building one. Our machine, the ELI-512, has a horizontal instruction word of over 500 bits and will do 10 to 30 RISC-level operations per cycle [Patterson 82]. ELI stands for Enormously Longword Instructions; 512 is the size of the instruction word we hope to achieve. (The current design has a 1200-bit instruction word.)

Once it became clear that we could actually compile code for a VLIW machine, some new questions appeared, and answers are presented in this paper. How do we put enough tests in each cycle without making the machine too big? How do we put enough memory references in each cycle without making the machine too slow?

References

  1. A. V. Aho and J. D. Ullman. Principles of Compiler Design. Addison-Wesley, 1977.
  2. S. Dasgupta. The Organization of Microprogram Stores. ACM Comp. Surv. 11(1):39-65, March 1979.
  3. J. A. Fisher. An effective packing method for use with 2^n-way jump instruction hardware. In 13th annual microprogramming workshop, pages 64-75. ACM Special Interest Group on Microprogramming, November 1980.
  4. J. A. Fisher. Trace scheduling: A technique for global microcode compaction. IEEE Transactions on Computers C-30(7):478-490, July 1981.
  5. C. C. Foster and E. M. Riseman. Percolation of code to enhance parallel dispatching and execution. IEEE Transactions on Computers 21(12):1411-1415, December 1972.
  6. T. R. Gross and J. L. Hennessy. Optimizing Delayed Branches. In 15th annual workshop on microprogramming, pages 114-120. ACM Special Interest Group on Microprogramming, October 1982.
  7. J. Hennessy, N. Jouppi, S. Przybylski, C. Rowen, T. Gross, F. Baskett, and J. Gill. MIPS: A Microprocessor Architecture. In 15th annual workshop on microprogramming, pages 17-22. ACM Special Interest Group on Microprogramming, October 1982.
  8. D. Jacobs, J. Prins, P. Siegel, and K. Wilson. Monte Carlo techniques in code optimization. In 15th annual workshop on microprogramming, pages 143-148. ACM Special Interest Group on Microprogramming, October 1982.
  9. A. Nicolau and J. A. Fisher. Using an oracle to measure parallelism in single instruction stream programs. In 14th annual microprogramming workshop, pages 171-182. ACM Special Interest Group on Microprogramming, October 1981.
  10. D. A. Padua, D. J. Kuck, and D. H. Lawrie. High speed multiprocessors and compilation techniques. IEEE Transactions on Computers 29(9):763-776, September 1980.
  11. D. A. Patterson, K. Lew, and R. Tuck. Towards an efficient machine-independent language for microprogramming. In 12th annual microprogramming workshop, pages 22-35. ACM Special Interest Group on Microprogramming, 1979.
  12. D. A. Patterson and C. H. Sequin. A VLSI RISC. Computer 15(9):8-21, September 1982.
  13. E. M. Riseman and C. C. Foster. The inhibition of potential parallelism by conditional jumps. IEEE Transactions on Computers 21(12):1405-1411, December 1972.
  14. G. S. Tjaden and M. J. Flynn. Detection and parallel execution of independent instructions. IEEE Transactions on Computers 19(10):889-895, October 1970.
  15. M. Tokoro, T. Takizuka, E. Tamura, and I. Yamaura. Towards an Efficient Machine-Independent Language for Microprogramming. In 11th Annual Microprogramming Workshop, pages 41-50. SIGMICRO, 1978.
