Reducing branch predictor leakage energy by exploiting loops

Published: 01 May 2007

Abstract

As technology scales, leakage energy is expected to become the dominant source of energy consumption. Along with cache memories, branch predictors are among the largest on-chip array structures and consume nontrivial leakage energy. This paper proposes two cost-effective loop-based strategies that reduce branch predictor leakage without affecting prediction accuracy or performance. The loop-based approaches exploit the fact that loops usually contain only a small number of instructions, and hence even fewer branch instructions, while accounting for a significant fraction of execution time. Consequently, all inactive branch predictor entries can be placed in a low-leakage mode during loop execution to reduce leakage energy. The compiler and circuit support needed to implement the proposed leakage-reduction strategies is discussed. Compared to the recently proposed decay-based approach, our experimental results show that the loop-based approach can extract 16.2% more branch predictor dead time on average, leading to greater leakage energy savings without affecting branch prediction accuracy or performance.
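The dead-time idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: the class name `LoopAwarePredictor`, the direct `pc % size` indexing, and the cycle-counting interface are all assumptions made for illustration. The abstract also does not say whether the low-leakage mode preserves entry state, so the model tracks only how long entries sit inactive.

```python
class LoopAwarePredictor:
    """Toy dead-time accounting for a branch predictor table (hypothetical model)."""

    def __init__(self, size=16):
        self.size = size
        self.active = [True] * size      # True = normal mode, False = low-leakage mode
        self.dead_entry_cycles = 0       # sum over entries of cycles spent in low-leakage mode

    def index(self, pc):
        # Assumed indexing scheme: low bits of the branch address.
        return pc % self.size

    def enter_loop(self, loop_branch_pcs):
        # Keep only the entries used by branches inside the loop body active.
        keep = {self.index(pc) for pc in loop_branch_pcs}
        self.active = [i in keep for i in range(self.size)]

    def exit_loop(self):
        # Wake every entry again once the loop finishes.
        self.active = [True] * self.size

    def tick(self, cycles=1):
        # Each inactive entry accumulates dead time for every elapsed cycle.
        self.dead_entry_cycles += cycles * self.active.count(False)

    def dead_fraction(self, total_cycles):
        # Share of all entry-cycles spent in low-leakage mode.
        return self.dead_entry_cycles / (total_cycles * self.size)


bp = LoopAwarePredictor(size=16)
bp.tick(10)                      # 10 cycles outside any loop: all entries active
bp.enter_loop([0x40, 0x48])      # loop with two branches -> 2 entries stay active
bp.tick(90)                      # 90 cycles inside the loop: 14 entries are dead
bp.exit_loop()
print(bp.dead_fraction(100))     # 0.7875
```

In this made-up run the loop occupies 90 of 100 cycles and touches 2 of 16 entries, so 14 × 90 of the 1600 entry-cycles are dead. This mirrors the abstract's argument that small loops with few branches dominate execution time.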

            Reviews

            Ronaldo A. L. Goncalves

            Energy efficiency is an ongoing goal in computer system design. This paper proposes compiler-directed techniques that mark the branches inside a loop body and put the prediction table entries for all other branches into an energy-saving mode during loop execution, thus reducing the leakage energy of branch predictors while maintaining accuracy and performance. These techniques work on both innermost and outermost loops, and require some hardware support to turn the mechanism on or off. The target architecture is very long instruction word (VLIW). This is a simple and well-defined idea for controlling part of the energy consumption. The proposed approach is well explained, and it is clear where the benefits come from. However, the experiments were carried out on a prediction table of fixed size, which limits the significance of the results, since the advantages of the proposal are directly proportional to that size. Also, the results are reported only in terms of dead time rather than energy reduction or instructions per cycle, making it impossible to see the practical advantages or the real impact on performance. An unfavorable aspect is that the approach requires recompiling existing applications to include special on/off instructions. In any case, energy that is usually wasted can be recovered through this approach.

            Online Computing Reviews Service

            Published in

              ACM Transactions on Embedded Computing Systems, Volume 6, Issue 2
              SPECIAL ISSUE SCOPES 2005
              May 2007
              119 pages
              ISSN:1539-9087
              EISSN:1558-3465
              DOI:10.1145/1234675
              Copyright © 2007 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States
