ABSTRACT
With the rapid improvement of processor speed, performance of the memory hierarchy has become the principal bottleneck for most applications. A number of compiler transformations have been developed to improve data reuse in cache and registers, thus reducing the total number of direct memory accesses in a program. Until now, however, most data reuse transformations have been static, applied only at compile time. As a result, these transformations cannot be used to optimize irregular and dynamic applications, in which the data layout and data access patterns remain unknown until run time and may even change during the computation.

In this paper, we explore ways to achieve better data reuse in irregular and dynamic applications by building on the inspector-executor method used by Saltz for run-time parallelization. In particular, we present and evaluate a dynamic approach for improving both computation and data locality in irregular programs. Our results demonstrate that run-time program transformations can substantially improve computation and data locality and, despite the complexity and cost involved, a compiler can automate such transformations, eliminating much of the associated run-time overhead.
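The abstract does not spell out the transformations themselves, so the following is only a minimal sketch of the general inspector-executor idea applied to data locality. It assumes a first-touch packing policy, and the names `inspect`, `pack`, and `execute` are hypothetical, not taken from the paper. The inspector scans the run-time index array once, the data is repacked so that elements are stored in the order they are first accessed, and the executor then runs the unchanged loop on the packed data.

```python
def inspect(accesses):
    """Inspector: scan the index array once and give each touched
    element a new position in first-touch order (assumed policy)."""
    new_pos = {}
    for idx in accesses:
        if idx not in new_pos:
            new_pos[idx] = len(new_pos)
    return new_pos


def pack(data, accesses):
    """Reorder data by first touch and remap the index array to match."""
    new_pos = inspect(accesses)
    # Elements that are never accessed go after the touched ones,
    # in their original relative order.
    for i in range(len(data)):
        if i not in new_pos:
            new_pos[i] = len(new_pos)
    packed = [None] * len(data)
    for old, new in new_pos.items():
        packed[new] = data[old]
    remapped = [new_pos[i] for i in accesses]
    return packed, remapped


def execute(data, accesses):
    """Executor: the original irregular loop, here a simple indirect sum."""
    return sum(data[i] for i in accesses)


data = [10, 20, 30, 40, 50]
accesses = [3, 0, 3, 4, 0]  # access pattern known only at run time
packed, remapped = pack(data, accesses)

# The packed layout gives the same result as the original layout.
print(packed, remapped)
print(execute(data, accesses), execute(packed, remapped))
```

In this sketch the cost of the inspector and the packing step is paid once and is only worthwhile if the executor loop runs many times over the same access pattern; if the pattern changes during the computation, the inspection would have to be repeated.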