DOI: 10.1145/77726.255153

Loop optimization for horizontal microcoded machines

Published: 01 June 1990

ABSTRACT

Long Instruction Word (LIW) architectures exploit parallelism among their functional units. To produce efficient code for such an architecture, a microcode compiler must expose a relatively large degree of fine-grain parallelism and take the low-level characteristics of the architecture into account. This paper describes a microcode compiler developed at IRISA for such architectures. After a brief overview of the compilation process, we focus on loop scheduling techniques. We first describe the software pipelining algorithm, then introduce a new unrolling-based optimization algorithm and compare it with classical software pipelining. The new algorithm differs from traditional loop unrolling in that the loop is unrolled only to find a cyclic schedule of the loop; that schedule is then used to construct a software pipeline.
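
To make the term "software pipelining" concrete, here is a minimal Python sketch of the general idea only. It is not the compiler or the unrolling-based algorithm the paper describes. The three-stage loop body (load, multiply, store) and the function names `run_sequential` and `run_pipelined` are assumptions chosen for illustration. The pipelined version overlaps stages of consecutive iterations, with a prologue that fills the pipeline, a steady-state kernel, and an epilogue that drains it.

```python
def run_sequential(a, k):
    """Reference loop: each iteration performs load, multiply, store in order."""
    out = [0] * len(a)
    for i in range(len(a)):
        x = a[i]        # stage 1: load
        y = x * k       # stage 2: multiply
        out[i] = y      # stage 3: store
    return out


def run_pipelined(a, k):
    """Same computation, hand-pipelined (illustrative assumption, not the
    paper's algorithm). Each kernel step stores iteration i, multiplies
    iteration i+1, and loads iteration i+2."""
    n = len(a)
    if n < 2:
        # Too few iterations to fill the pipeline; use the plain loop.
        return run_sequential(a, k)
    out = [0] * n
    # Prologue: start iterations 0 and 1.
    x = a[0]            # load 0
    y = x * k           # multiply 0
    x = a[1]            # load 1
    # Kernel: the three operations belong to three different iterations,
    # so a wide-instruction machine could in principle issue them together.
    for i in range(n - 2):
        out[i] = y      # store i
        y = x * k       # multiply i+1
        x = a[i + 2]    # load i+2
    # Epilogue: finish the last two iterations.
    out[n - 2] = y      # store n-2
    y = x * k           # multiply n-1
    out[n - 1] = y      # store n-1
    return out
```

Both functions return the same result for any input list; the difference is only in how the work of successive iterations is interleaved, which is the property a scheduler for an LIW machine would try to exploit.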

Published in

ICS '90: Proceedings of the 4th international conference on Supercomputing
June 1990
492 pages
ISBN: 0897913698
DOI: 10.1145/77726

ACM SIGARCH Computer Architecture News, Volume 18, Issue 3b
Special Issue: Proceedings of the 4th international conference on Supercomputing
Sept. 1990
489 pages
ISSN: 0163-5964
DOI: 10.1145/255129

                    Copyright © 1990 ACM

                    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                    Publisher

                    Association for Computing Machinery

                    New York, NY, United States

                    Qualifiers

                    • Article

                    Acceptance Rates

                    Overall Acceptance Rate: 584 of 2,055 submissions, 28%
