Identifying potential parallelism via loop-centric profiling
Conference On Computing Frontiers
Proceedings of the 4th international conference on Computing frontiers
Ischia, Italy
SESSION: Software for high-performance systems
Pages: 143 - 152
Year of Publication: 2007
ISBN: 978-1-59593-683-7
ABSTRACT
The transition to multithreaded, multi-core designs places a greater responsibility on programmers and software for improving performance; thread-level parallelism (TLP) will be increasingly relied upon in addition to instruction-level parallelism (ILP) and increased clock frequency. Deciding where to try to parallelize code is difficult, especially for large, complex applications or those where the original developers have moved on. Outer loops are relatively easy targets for parallelization, but traditional profilers focus primarily on functions and hot inner loops. To aid in programmers' parallelization efforts, we introduce the concept of loop-centric profiling to provide a hierarchical view of how much time is spent in a loop and the loops nested within it.

This paper introduces two techniques for loop profiling. First, we describe an instrumentation-based approach that gathers highly detailed and accurate information about loop behavior. Second, we present a sampling approach that achieves similar results with negligible overhead. The paper concludes with a case study evaluating the tool on several SPEC 2000 benchmarks.
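To make the idea of a hierarchical loop view concrete, the following is a minimal sketch of how sampled loop nests could be aggregated into per-loop inclusive and self counts. It is an illustration only and not the paper's tool: the sample format (a tuple of loop names, outermost first), the function names `build_loop_profile` and `format_profile`, and the loop names are all assumptions introduced here.

```python
from collections import defaultdict


def build_loop_profile(samples):
    """Aggregate sampled loop nests into inclusive and self sample counts.

    Each sample is a tuple of loop identifiers, outermost first, naming the
    loops that were active when the sample was taken (hypothetical format).
    """
    inclusive = defaultdict(int)   # samples in a loop or any loop nested in it
    self_count = defaultdict(int)  # samples where the loop is the innermost one
    for nest in samples:
        # Credit every enclosing loop, identified by its full nesting path.
        for depth in range(1, len(nest) + 1):
            inclusive[nest[:depth]] += 1
        if nest:
            self_count[nest] += 1
    return dict(inclusive), dict(self_count)


def format_profile(inclusive, total):
    """Render the profile as an indented tree, one line per loop."""
    lines = []
    # Sorting the path tuples lists each loop right before its nested loops.
    for path in sorted(inclusive):
        share = 100.0 * inclusive[path] / total
        indent = "  " * (len(path) - 1)
        lines.append(f"{indent}{path[-1]}: {share:.1f}% inclusive")
    return lines


# Hypothetical samples; the empty tuple is a sample taken outside any loop.
samples = [
    ("main_loop", "row_loop", "col_loop"),
    ("main_loop", "row_loop", "col_loop"),
    ("main_loop", "row_loop"),
    ("main_loop",),
    (),
]
inclusive, self_count = build_loop_profile(samples)
for line in format_profile(inclusive, total=len(samples)):
    print(line)
```

In this toy input the outermost loop accounts for four of the five samples even though only one sample lands in it directly, which is the kind of information a function-level or inner-loop-only profile would tend to hide.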