Article

Latency and energy aware value prediction for high-frequency processors

Authors:
Ravi Bhargava, The University of Texas at Austin, Austin, Texas
Lizy K. John, The University of Texas at Austin, Austin, Texas

ICS '02: Proceedings of the 16th international conference on Supercomputing, June 2002, Pages 45–56
https://doi.org/10.1145/514191.514201

Published: 22 June 2002

ABSTRACT

This work addresses the issues of access latency and energy consumption in value predictor design for high-frequency, wide-issue microprocessors. Previous value prediction research allows for generous assumptions regarding table configurations and access conditions, while ignoring prediction latencies and energy issues. However, the latency of a high-performance value predictor cannot always be completely hidden by the early stages of the instruction pipeline as previously assumed, and it causes noticeable performance degradation versus unconstrained value prediction. This paper describes and compares several variations of basic value prediction methods: at-fetch, post-decode, and decoupled. The performance of at-fetch and post-decode value predictors is limited by the high access latency of accurate predictor configurations. Decoupled value prediction excels at overcoming the high-frequency table access constraints by placing completion-time predictions into a separate and easily accessible storage. However, it has high energy requirements. We study a value prediction approach that combines the latency-friendly approach of decoupled value prediction with a more energy-efficient implementation. The traditional PC-indexed prediction tables are removed and replaced by a queue of prediction traces. This latency and energy aware method of maintaining and distributing speculated values leads to a 58%-95% reduction in value predictor energy consumption versus known value prediction techniques while still maintaining high performance.
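
The contrast between a PC-indexed table and a queue of prediction traces can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the paper's design: the class names `PCIndexedTable` and `PredictionTraceQueue`, the last-value prediction policy, the four-instruction trace, and the simplification that the next fetched trace always matches the head of the queue. Counting structure accesses is only a crude stand-in for latency and energy cost; it is not an energy model and says nothing about the 58%-95% figure reported above.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class PCIndexedTable:
    """Toy last-value predictor: one table access per instruction."""

    def __init__(self) -> None:
        self.last_value: Dict[int, int] = {}
        self.accesses = 0  # crude proxy for table activity

    def train(self, pc: int, value: int) -> None:
        self.accesses += 1
        self.last_value[pc] = value

    def predict(self, pc: int) -> Optional[int]:
        self.accesses += 1
        return self.last_value.get(pc)


class PredictionTraceQueue:
    """Hypothetical queue holding one list of predictions per trace."""

    def __init__(self) -> None:
        self.traces: Deque[List[int]] = deque()
        self.accesses = 0  # one access per whole trace, not per instruction

    def push_at_completion(self, results: List[int]) -> None:
        # Assumption: predictions for the next instance of a trace are
        # produced when the current instance completes (last-value policy).
        self.accesses += 1
        self.traces.append(list(results))

    def pop_at_fetch(self) -> Optional[List[int]]:
        # Assumption: the fetched trace matches the head of the queue.
        self.accesses += 1
        return self.traces.popleft() if self.traces else None


def run_demo() -> Tuple[List[Optional[int]], int, Optional[List[int]], int]:
    # Made-up (pc, result) pairs for a single four-instruction trace.
    trace = [(0x400, 7), (0x404, 12), (0x408, 7), (0x40C, 3)]

    table = PCIndexedTable()
    for pc, value in trace:
        table.train(pc, value)
    table_preds = [table.predict(pc) for pc, _ in trace]

    queue = PredictionTraceQueue()
    queue.push_at_completion([value for _, value in trace])
    queue_preds = queue.pop_at_fetch()

    return table_preds, table.accesses, queue_preds, queue.accesses


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting both structures supply the same predictions, but the table is touched once per instruction for training and once for lookup, while the queue is touched once per trace in each direction.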

Index Terms

1. Computer systems organization
  1. Architectures
    1. Serial architectures

Published in
ICS '02: Proceedings of the 16th international conference on Supercomputing
June 2002
338 pages
ISBN: 1581134835
DOI: 10.1145/514191
General Chair: Kemal Ebcioglu (IBM T.J. Watson Research Center)
Program Chairs: Keshav Pingali (Cornell University), Alex Nicolau (University of California)
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
complexity-effective design
data speculation
low power
trace cache processors
Qualifiers
- Article
Conference

Acceptance Rates
ICS '02 Paper Acceptance Rate: 31 of 144 submissions, 22%
Overall Acceptance Rate: 584 of 2,055 submissions, 28%