| Instruction fetch deferral using static slack |
| Full text |
Publisher Site
,
Pdf
(1.17 MB)
|
| Source
|
International Symposium on Microarchitecture
archive
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
table of contents
Istanbul, Turkey
SESSION: Multithreading I
table of contents
Pages: 51 - 61
Year of Publication: 2002
ISBN ~ ISSN:1072-4451 , 0-7695-1859-1
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
IEEE Computer Society Press
Los Alamitos, CA, USA
|
| Bibliometrics |
Downloads (6 Weeks): 2, Downloads (12 Months): 6, Citation Count: 3
|
|
|
ABSTRACT
In this paper we present an approach to boosting performance and tolerating latency by deferring non-critical instructions into a deferred queue for later processing. As such, instruction deferral allows more critical instructions to be fetched, dispatched, and possibly executed, earlier.We present methods for identifying deferrable instructions using previously investigated notions of instruction slack. In particular we use static slack to determine if an instruction is deferrable. The static slack of an instruction corresponds to the number of cycles an instruction can be delayed without impacting overall execution time when considering all dynamic paths from that instruction. A significant fraction of the dynamic instruction stream has enough static slack to be deferred by 10 or more cycles on an aggressive execution model. Futhermore, the small amount of register-based communication from deferred instructions to non-deferred instructions makes a deferral-based approach to fetch and execution very attractive.We use a trace cache based microarchitecture to overcome some significant implementation challenges associated with instruction deferral. Overall, instruction deferral boosts the performance of a 4-wide processor by approximately 11% and an 8-wide processor by 6% on eight of the SPEC2000 integer benchmarks.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
 |
3
|
|
 |
4
|
|
| |
5
|
|
 |
6
|
Sanjay J. Patel , Tony Tung , Satarupa Bose , Matthew M. Crum, Increasing the size of atomic instruction blocks using control flow assertions, Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, p.303-313, December 2000, Monterey, California, United States
[doi> 10.1145/360128.360160]
|
 |
7
|
Glenn Reinman , Todd Austin , Brad Calder, A scalable front-end architecture for fast instruction delivery, Proceedings of the 26th annual international symposium on Computer architecture, p.234-245, May 01-04, 1999, Atlanta, Georgia, United States
|
| |
8
|
Jared Stark , Paul Racunas , Yale N. Patt, Reducing the performance impact of instruction cache misses by writing instructions into the reservation stations out-of-order, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.34-43, December 01-03, 1997, Research Triangle Park, North Carolina, United States
|
 |
9
|
|
Peer to Peer - Readers of this Article have also read:
-
Data structures for quadtree approximation and compression
Communications of the ACM
28, 9
Hanan Samet
-
A hierarchical single-key-lock access control using the Chinese remainder theorem
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing
Kim S. Lee
, Huizhu Lu
, D. D. Fisher
-
An intelligent component database for behavioral synthesis
Proceedings of the 27th ACM/IEEE conference on Design automation
Gwo-Dong Chen
, Daniel D. Gajski
-
The GemStone object database management system
Communications of the ACM
34, 10
Paul Butterworth
, Allen Otis
, Jacob Stein
-
Putting innovation to work: adoption strategies for multimedia communication systems
Communications of the ACM
34, 12
Ellen Francik
, Susan Ehrlich Rudman
, Donna Cooper
, Stephen Levine
|