ABSTRACT
We propose an integrated architectural/physical-planning approach, named priority assignment optimization, to minimize the current surge in high-performance, power-efficient clock-gated microprocessors. The proposed approach balances current demand across the floorplan by assigning optimized priorities to the functional units (FUs). We also discuss two complementary methods that enhance the approach for various applications: physical planning with soft modules, and issue pattern management. Experimental results show that the proposed approach reduces peak noise by 11.75% and, consequently, the decoupling capacitance (Decap) requirement by 24.22%, without any degradation in IPC (Instructions Per Cycle). We also show that our approach does not increase the clock period for 0.18 μm technology and beyond.
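To make the idea of priority assignment concrete, the following is a minimal toy sketch, not the paper's formulation. It assumes a hypothetical four-FU floorplan split into two regions, a fixed current per active FU, and zero current for clock-gated (idle) FUs; the names `FUS`, `peak_region_current`, and `best_priority` are illustrative only. In each cycle the n highest-priority FUs are used, so the priority order determines which regions of the floorplan draw current together.

```python
from itertools import permutations

# Hypothetical toy floorplan (an assumption for illustration):
# FU name -> (floorplan region, current drawn when active).
# Clock-gated (unused) FUs are assumed to draw no current.
FUS = {
    "alu0": ("west", 1.0),
    "alu1": ("west", 1.0),
    "alu2": ("east", 1.0),
    "alu3": ("east", 1.0),
}


def peak_region_current(priority, issue_widths):
    """Return the worst per-region current over all cycles.

    priority     -- FU names, highest priority first
    issue_widths -- number of instructions issued in each cycle
    """
    peak = 0.0
    for n in issue_widths:
        per_region = {}
        # The n highest-priority FUs are the ones activated this cycle.
        for fu in priority[:n]:
            region, current = FUS[fu]
            per_region[region] = per_region.get(region, 0.0) + current
        peak = max(peak, max(per_region.values(), default=0.0))
    return peak


def best_priority(issue_widths):
    """Exhaustively search priority orders; feasible only for a few FUs."""
    return min(
        permutations(FUS),
        key=lambda order: peak_region_current(order, issue_widths),
    )


if __name__ == "__main__":
    widths = [1, 2, 2, 1, 2]  # hypothetical issue pattern
    clustered = ["alu0", "alu1", "alu2", "alu3"]
    interleaved = ["alu0", "alu2", "alu1", "alu3"]
    print(peak_region_current(clustered, widths))
    print(peak_region_current(interleaved, widths))
    print(best_priority(widths))
```

In this toy model, a priority order that alternates between regions spreads the active FUs across the floorplan, whereas an order that fills one region first concentrates the demand there. A real flow would use measured current profiles and a power-grid noise model rather than this simple per-region sum.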
- Priority assignment optimization for minimization of current surge in high performance power efficient clock-gated microprocessor