|
ABSTRACT
Growing on-chip wire delays are motivating architectural features that expose on-chip communication to the compiler. EDGE architectures are one example of communication-exposed microarchitectures in which the compiler forms dataflow graphs that specify how the microarchitecture maps instructions onto a distributed execution substrate. This paper describes a compiler scheduling algorithm called spatial path scheduling that factors in previously fixed locations - called anchor points - for each placement. This algorithm extends easily to different spatial topologies. We augment this basic algorithm with three heuristics: (1) local and global ALU and network link contention modeling, (2) global critical path estimates, and (3) dependence chain path reservation. We use simulated annealing to explore possible performance improvements and to motivate the augmented heuristics and their weighting functions. We show that the spatial path scheduling algorithm augmented with these three heuristics achieves a 21% average performance improvement over the best prior algorithm and comes within an average of 5% of the annealed performance for our benchmarks.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
S.J. Beaty and P.H. Sweany. Instruction scheduling using simulated annealing. In International Conference on Massively Parallel Computing Systems, Colorado Springs, CO, Apr. 1998.
|
| |
3
|
|
| |
4
|
Doug Burger , Stephen W. Keckler , Kathryn S. McKinley , Mike Dahlin , Lizy K. John , Calvin Lin , Charles R. Moore , James Burrill , Robert G. McDonald , William Yoder , the TRIPS Team, Scaling to the End of Silicon with EDGE Architectures, Computer, v.37 n.7, p.44-55, July 2004
[doi> 10.1109/MC.2004.65]
|
 |
5
|
|
| |
6
|
|
 |
7
|
|
 |
8
|
Joseph A. Fisher , John R. Ellis , John C. Ruttenberg , Alexandru Nicolau, Parallel processing: a smart compiler and a dumb machine, Proceedings of the 1984 SIGPLAN symposium on Compiler construction, p.37-47, June 17-22, 1984, Montreal, Canada
|
| |
9
|
|
| |
10
|
|
 |
11
|
|
| |
12
|
S. Kirkpatrick, C.D. Gelatt Jr., and M.P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671--680, 1983.
|
| |
13
|
|
| |
14
|
|
 |
15
|
Martha Mercaldi , Steven Swanson , Andrew Petersen , Andrew Putnam , Andrew Schwerin , Mark Oskin , Susan J. Eggers, Modeling instruction placement on a spatial architecture, Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, July 30-August 02, 2006, Cambridge, Massachusetts, USA
[doi> 10.1145/1148109.1148137]
|
| |
16
|
Eliot Moss , Paul Utgoff , John Cavazos , Carla Brodley , David Scheeff , Doina Precup , Darko Stefanović, Learning to schedule straight-line code, Proceedings of the 1997 conference on Advances in neural information processing systems 10, p.929-935, July 1998, Denver, Colorado, United States
|
| |
17
|
Ramadass Nagarajan , Sundeep K. Kushwaha , Doug Burger , Kathryn S. McKinley , Calvin Lin , Stephen W. Keckler, Static Placement, Dynamic Issue (SPDI) Scheduling for EDGE Architectures, Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, p.74-84, September 29-October 03, 2004
[doi> 10.1109/PACT.2004.26]
|
| |
18
|
R. Nagarajan, X. Chen, R.G. McDonald, D. Burger, and S.W. Keckler. Critical path analysis of the TRIPS architecture. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2006.
|
| |
19
|
|
 |
20
|
|
| |
21
|
|
| |
22
|
Aaron Smith , Jon Gibson , Bertrand Maher , Nick Nethercote , Bill Yoder , Doug Burger , Kathryn S. McKinle , Jim Burrill, Compiling for EDGE Architectures, Proceedings of the International Symposium on Code Generation and Optimization, p.185-195, March 26-29, 2006
[doi> 10.1109/CGO.2006.10]
|
| |
23
|
|
| |
24
|
S. Swanson, K. Michelson, and M. Oskin. Configuration by combustion: Online simulated annealing for dynamic hardware configuration. In ASPLOS X Wild and Crazy Idea Session, 2002.
|
| |
25
|
Elliot Waingold , Michael Taylor , Devabhaktuni Srikrishna , Vivek Sarkar , Walter Lee , Victor Lee , Jang Kim , Matthew Frank , Peter Finch , Rajeev Barua , Jonathan Babb , Saman Amarasinghe , Anant Agarwal, Baring It All to Software: Raw Machines, Computer, v.30 n.9, p.86-93, September 1997
[doi> 10.1109/2.612254
]
|
| |
26
|
J. Zalamea, J. Llosa, E. Ayguade, and M. Valero. Software and hardware techniques to optimize register file utilization in VLIW architectures. In Proceedings of the International Workshop on Advanced Compiler Technology for High Performance and Embedded Systems (IWACT), July 2001.
|
CITED BY 4
|
Martha Mercaldi , Steven Swanson , Andrew Petersen , Andrew Putnam , Andrew Schwerin , Mark Oskin , Susan J. Eggers, Instruction scheduling for a tiled dataflow architecture, ACM SIGOPS Operating Systems Review, v.40 n.5, December 2006
|
|
|
|
|
|
Aaron Smith , Ramadass Nagarajan , Karthikeyan Sankaralingam , Robert McDonald , Doug Burger , Stephen W. Keckler , Kathryn S. McKinley, Dataflow Predication, Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, p.89-102, December 09-13, 2006
|
|
|
|