skip to main content
research-article
Free Access

Optimal trace scheduling using enumeration

Published:23 March 2009Publication History
Skip Abstract Section

Abstract

This article presents the first optimal algorithm for trace scheduling. The trace is a global scheduling region used by compilers to exploit instruction-level parallelism across basic block boundaries. Several heuristic techniques have been proposed for trace scheduling, but the precision of these techniques has not been studied relative to optimality. This article describes a technique for finding provably optimal trace schedules, where optimality is defined in terms of a weighted sum of schedule lengths across all code paths in a trace. The optimal algorithm uses branch-and-bound enumeration to efficiently explore the entire solution space. Experimental evaluation of the algorithm shows that, with a time limit of 1 s per problem, 91% of the hard trace scheduling problems in the SPEC CPU 2006 Integer Benchmarks are solved optimally. For 58% of these hard problems, the optimal schedule is improved compared to that produced by a heuristic scheduler with a geometric mean improvement of 3.2% in weighted schedule length and 18% in compensation code size.

References

  1. Bernstein, D. and Rodeh, M. 1991. Global scheduling for super-scalar machines. In Proceedings of the Conference on Programming Language Design and Implementation. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Bharadwaj, J., Menezes, K., and Mckinsey, C. 2000. Wavefront scheduling: path based data representation and scheduling of subgraphs. J. Instruc.-Level Paral., 2, 7.Google ScholarGoogle Scholar
  3. Chekuri, C., Johnson, R., Motwani, R., Natarajan, B., Rau, B., and Schlansker, M. 1996. Profile-driven instruction-level parallel scheduling with applications to superblocks. In Proceedings of the 29th International Symposium on Microarchitecture. IEEE, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Chou, H. and Chung, C. 1995. An optimal instruction scheduler for superscalar processor. IEEE Trans. Parall. Distrib. Syst. 6, 303--313. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Fisher, J. 1981. Trace Scheduling: a technique for global micro-code compaction. IEEE Trans. Comput. 30, 478--490. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Freudenberger, S., Gross, T., and Lowney, P. 1994. Avoidance and suppression of compensation code in a trace scheduling compiler. ACM Trans. Program. Lang. Syst. 16, 1156--1214. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Havanki, W., Banerjia, S., and Conte, T., 1998. Treegion scheduling for wide-issue processors. In Proceedings of the 4th International Symposium on High-Performance Computer Architecture (HPCA-4), IEEE, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Hennessy, J. and Gross, T. 1983. Post-pass code optimization of pipeline constraints. ACM Trans. Program. Lang. Syst. 5, 422--448. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Hwu, W., Mahlke, S., Chen, W., Chang, P., Warter, N., Bringmann, R., Ouellette, R., Hank, R., Kiyohara, T., Haab, G., Holm J., and Lavery, D. 1993. The superblock: an effective technique for VLIW and superscalar compilation. J. Supercomput. 7, 229--248. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Lah, J. and Atkin, D. 1983. Tree compaction of micro-programs. In Proceedings of the 16th Annual Workshop on Micro-Programming. ACM, New York.Google ScholarGoogle Scholar
  11. Langevin, M. and Cerny, E. 1996. A recursive technique for computing lower-bound performance of schedules. ACM Trans. Des. Autom. Electron. Syst. 1, 443--456. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Linn, J. 1983. SRDAG Compaction—a generalization of trace scheduling to increase the use of global context information. In Proceedings of the 16th Annual Workshop on Micro-Programming. ACM, New York.Google ScholarGoogle Scholar
  13. Mahlke, S., Lin, D., Chen, W., Hank, R., and Bringmann, R. 1992. Effective compiler support for predicated execution using the hyperblock. In Proceedings of the 25th Annual International Symposium on Microarchitecture. IEEE, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Muchnick, S. 1997. Advanced Compiler Design and Implementation. Morgan Kaufmann, San Francisco, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Narasimhan, M. and Ramanujam, J. 2001. A fast approach to computing exact solutions to the resource-constrained scheduling problem. ACM Trans. Des. Autom. Electron. Syst. 6, 490--500. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Rim, M. and Jain, R. 1994. Lower-bound performance estimation for the high-level synthesis scheduling problem. IEEE Trans. Comput. Aid. Des. Integr. Circ. Syst. 13, 451--458.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Shobaki, G. and Wilken, K. 2004. Optimal superblock scheduling using enumeraion. In Proceedings of the 37th International Symposium on Microarchitecture. IEEE, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Shobaki, G. 2006. Optimal global instruction scheduling using enumeration. PhD dissertation, Dept. of Computer Science, University of California, Davis. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Smith, M., Horowitz, M., and Lam, M. 1992. Efficient superscalar performance through boosting. In Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Su, B., Ding, S., and Jin, L. 1984. An improvement of trace scheduling for global microcode compaction. In Proceedings of the 17th Annual Workshop on Microprogramming. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Sun Microsystems, 2004. UltraSPARC Processors Documentation, www.sun.com/processors/documentation.html.Google ScholarGoogle Scholar
  22. Wilken, K., Liu, J., and Heffernan, M. 2000. Optimal instruction scheduling using integer programming. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Winkel S. 2007. Optimal versus heuristic global code scheduling. In Proceedings of the 40th International Symposium on Microarchitecture. IEEE, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Wolsey, L. 1998. Integer Programming. John Wiley and Sons, New York.Google ScholarGoogle Scholar

Index Terms

  1. Optimal trace scheduling using enumeration

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image ACM Transactions on Architecture and Code Optimization
              ACM Transactions on Architecture and Code Optimization  Volume 5, Issue 4
              March 2009
              122 pages
              ISSN:1544-3566
              EISSN:1544-3973
              DOI:10.1145/1498690
              Issue’s Table of Contents

              Copyright © 2009 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 23 March 2009
              • Accepted: 1 July 2008
              • Revised: 1 March 2008
              • Received: 1 March 2006
              Published in taco Volume 5, Issue 4

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • research-article
              • Research
              • Refereed

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader