ABSTRACT
Load-reuse analysis finds instructions that repeatedly access the same memory location. Such a location can be promoted to a register, eliminating redundant loads by reusing the results of prior memory accesses. This paper develops a load-reuse analysis and designs a method for evaluating its precision.

In designing the analysis, we aim for completeness: the goal of exposing all reuse that can be harvested by a subsequent program transformation. For register promotion, a suitable transformation is partial redundancy elimination (PRE). To approach the ideal goal of PRE-completeness, the load-reuse analysis is phrased as a data-flow problem on a program representation that is path-sensitive, in that it detects reuse even when the reuse originates in a different instruction along each control-flow path. Furthermore, the analysis is comprehensive, as it treats scalar, array, and pointer-based loads uniformly.

In evaluating the analysis, we compare it with an ideal analysis. By observing the run-time stream of memory references, we collect all PRE-exploitable reuse and treat it as the performance of the ideal analysis. To compare the (static) load-reuse analysis with the (dynamic) ideal reuse, we use an estimator algorithm that computes, given a data-flow solution and a program profile, the dynamic amount of reuse detected by the analysis. We developed a family of estimators that differ in how well they bound the profiling error inherent in the edge profile. By bounding the error, the estimators offer a precise and practical method for determining the run-time optimization benefit.

Our experiments show that about 55% of loads executed in Spec95 exhibit reuse. Of those, our analysis exposes about 80%.
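The estimator idea can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, purely for illustration, that the data-flow solution reports for each load whether its value is already available along each CFG edge entering the load's block, and that an edge profile gives an execution count per edge. The names `Load`, `available_on_edge`, and `estimate_dynamic_reuse` are hypothetical, and the sketch ignores the profiling-error bounds that distinguish the paper's family of estimators.

```python
from dataclasses import dataclass


@dataclass
class Load:
    """A load instruction plus a hypothetical per-edge data-flow result."""
    name: str
    # Maps each CFG edge (pred, succ) entering the load's block to True
    # if the loaded value is already available along that edge.
    available_on_edge: dict


def estimate_dynamic_reuse(loads, edge_counts):
    """Return (reused, total) dynamic load executions.

    Assumption: an edge's profile count is credited as reuse only when
    the analysis marks the value available on that edge.
    """
    reused = 0
    total = 0
    for load in loads:
        for edge, available in load.available_on_edge.items():
            count = edge_counts.get(edge, 0)
            total += count
            if available:
                reused += count
    return reused, total


# Hypothetical edge profile: execution counts per CFG edge.
edge_counts = {
    ("B1", "B3"): 70,
    ("B2", "B3"): 30,
    ("B3", "B5"): 60,
    ("B4", "B5"): 40,
}

loads = [
    # Value available on both incoming edges, possibly produced by a
    # different instruction on each path: fully redundant.
    Load("load_in_B3", {("B1", "B3"): True, ("B2", "B3"): True}),
    # Value available on only one incoming edge: partially redundant.
    Load("load_in_B5", {("B3", "B5"): False, ("B4", "B5"): True}),
]

reused, total = estimate_dynamic_reuse(loads, edge_counts)
print(f"{reused} of {total} dynamic loads exhibit detected reuse")
```

In this toy profile the first load is reusable on every execution and the second on 40 of its 100 executions, so the sketch credits 140 of 200 dynamic loads. An edge profile cannot say which paths were actually taken through a sequence of edges, which is the source of the profiling error that the abstract says the estimators bound.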