DOI:10.1145/1013235.1013318
Article

Application adaptive energy efficient clustered architectures

Published: 09 August 2004

Abstract

As clock frequency and die area increase, achieving energy efficiency while distributing a low-skew global clock signal becomes increasingly difficult. Challenges imposed by deep-submicron technologies can be alleviated by using a multiple voltage/multiple frequency island design style, also known as the globally asynchronous, locally synchronous (GALS) design paradigm. This paper proposes a clustered architecture that enables application-adaptive energy efficiency through the use of dynamic voltage scaling (DVS) for application code that is rendered non-critical for overall performance at run time. As opposed to DVS-based task scheduling, which exploits workload variations across applications, our approach targets workload variations within the same application, classifying code on the fly as critical or non-critical and adapting to changes in the criticality of such code portions. Our results show that application-adaptive variable voltage/variable frequency clustered architectures are up to 22% better in energy and 11% better in energy-delay product than their non-adaptive counterparts, while providing up to 31% more energy savings than DVS applied globally.
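To make the energy and energy-delay product (EDP) metrics in the abstract concrete, here is a minimal sketch of a first-order model. It is not the paper's evaluation method. It assumes that dynamic energy per operation scales with the square of the supply voltage, and every input (the non-critical fraction, the scaled voltage, the slowdown) is a hypothetical value chosen for illustration. The function names `relative_energy` and `relative_edp` are likewise invented here.

```python
def relative_energy(noncritical_fraction, v_low, v_nominal=1.0):
    """Energy relative to running all code at v_nominal.

    Assumption: energy per operation scales as V^2, and only the
    non-critical fraction of the work runs at the lower voltage v_low.
    """
    if not 0.0 <= noncritical_fraction <= 1.0:
        raise ValueError("noncritical_fraction must be in [0, 1]")
    if v_low <= 0.0 or v_nominal <= 0.0:
        raise ValueError("voltages must be positive")
    scale = (v_low / v_nominal) ** 2
    critical_part = 1.0 - noncritical_fraction
    return critical_part + noncritical_fraction * scale


def relative_edp(rel_energy, rel_delay):
    """Energy-delay product relative to the baseline (baseline EDP = 1.0)."""
    return rel_energy * rel_delay


if __name__ == "__main__":
    # Hypothetical inputs: 40% of the work is non-critical and runs at
    # 0.8x nominal voltage, costing a 5% increase in execution time.
    energy = relative_energy(noncritical_fraction=0.4, v_low=0.8)
    edp = relative_edp(energy, rel_delay=1.05)
    print(f"relative energy: {energy:.3f}")
    print(f"relative EDP:    {edp:.4f}")
```

Under these assumed inputs, the model shows why slowing only non-critical code can help both metrics: energy falls quadratically with voltage on the scaled portion, while total delay grows only slightly as long as the slowed code stays off the critical path.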

Published In

ISLPED '04: Proceedings of the 2004 international symposium on Low power electronics and design
August 2004
414 pages
ISBN:1581139292
DOI:10.1145/1013235

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustered architectures
  2. dynamic voltage scaling

Qualifiers

  • Article

Conference

ISLPED04: International Symposium on Low Power Electronics and Design
August 9-11, 2004
Newport Beach, California, USA

Acceptance Rates

Overall Acceptance Rate 326 of 951 submissions, 34%

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 1
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 07 Mar 2025

Cited By

  • (2009) Criticality-based optimizations for efficient load processing. 2009 IEEE 15th International Symposium on High Performance Computer Architecture, pp. 419-430. DOI: 10.1109/HPCA.2009.4798280. Online publication date: Feb-2009
  • (2008) Power optimization of embedded real-time systems and their adaptability. Automatic Control and Computer Sciences, 42:3, pp. 153-162. DOI: 10.3103/S0146411608030073. Online publication date: 27-Jul-2008
  • (2007) Design principles for a virtual multiprocessor. Proceedings of the 2007 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries, pp. 76-82. DOI: 10.1145/1292491.1292500. Online publication date: 2-Oct-2007
  • (2006) Tile size selection for low-power tile-based architectures. Proceedings of the 3rd conference on Computing frontiers, pp. 83-94. DOI: 10.1145/1128022.1128036. Online publication date: 3-May-2006
  • (2006) Energy-aware dynamic resource allocation heuristics for clustered processors. Canadian Journal of Electrical and Computer Engineering, 31:3, pp. 117-125. DOI: 10.1109/CJECE.2006.259197. Online publication date: 2006
  • (2005) Dynamic instruction cascading on GALS microprocessors. Proceedings of the 15th international conference on Integrated Circuit and System Design: Power and Timing Modeling, Optimization and Simulation, pp. 30-39. DOI: 10.1007/11556930_4. Online publication date: 21-Sep-2005
