DOI: 10.1145/582034.582068

On using SCALEA for performance analysis of distributed and parallel programs

Published: 10 November 2001

ABSTRACT

In this paper we give an overview of SCALEA, a new performance analysis tool for OpenMP, MPI, HPF, and mixed parallel/distributed programs. SCALEA instruments, executes, and measures programs and computes a variety of performance overheads based on a novel overhead classification. Source-code instrumentation and hardware profiling are combined in a single system, which significantly extends the scope of overheads that can be measured and examined, ranging from hardware counters, such as the number of cache misses or floating-point operations, to more complex performance metrics, such as control or loss of parallelism. Moreover, SCALEA uses a new representation of code regions, called the dynamic code region call graph, which enables detailed overhead analysis for arbitrary code regions. An instrumentation description file is used to relate performance information to code regions of the input program and to reduce instrumentation overhead. Several experiments with realistic codes covering MPI, OpenMP, HPF, and mixed OpenMP/MPI programming demonstrate the usefulness of SCALEA.
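
For context, the term "overhead" here follows the standard textbook notion of total parallel overhead; the following is a generic formulation, not the paper's own notation. With T_s the execution time of a sequential version of the program and T_n its execution time on n processors, the total overhead is

    T_o(n) = n * T_n - T_s

and an overhead classification partitions T_o(n) into components, e.g. T_o(n) = T_comm + T_sync + T_ctrl + T_lop + ..., covering costs such as communication, synchronization, control of parallelism, and loss of parallelism.

The abstract's hardware-counter examples (cache misses, floating-point operations) can be read on most platforms through the PAPI counter library. The following is a minimal, self-contained C sketch of counter-based measurement around a code region; it illustrates the general technique only and is not SCALEA's own instrumentation interface (the chosen events and the toy loop are illustrative assumptions).

    #include <stdio.h>
    #include <stdlib.h>
    #include <papi.h>

    int main(void)
    {
        int eventset = PAPI_NULL;
        long long values[2];   /* counter readings after the region */

        /* Initialize the PAPI library. */
        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
            fprintf(stderr, "PAPI_library_init failed\n");
            return EXIT_FAILURE;
        }

        /* Count L2 cache misses and floating-point operations.
           Preset-event availability is platform-dependent. */
        if (PAPI_create_eventset(&eventset) != PAPI_OK ||
            PAPI_add_event(eventset, PAPI_L2_TCM) != PAPI_OK ||
            PAPI_add_event(eventset, PAPI_FP_OPS) != PAPI_OK) {
            fprintf(stderr, "event set setup failed\n");
            return EXIT_FAILURE;
        }

        if (PAPI_start(eventset) != PAPI_OK) {
            fprintf(stderr, "PAPI_start failed\n");
            return EXIT_FAILURE;
        }

        /* Instrumented code region: an illustrative computation. */
        double sum = 0.0;
        for (long i = 0; i < 10000000; i++)
            sum += (double)i * 0.5;

        PAPI_stop(eventset, values);

        printf("L2 cache misses: %lld\n", values[0]);
        printf("FP operations:   %lld (sum = %g)\n", values[1], sum);
        return EXIT_SUCCESS;
    }

Compile with, e.g., cc papi_region.c -lpapi. A source-level tool inserts such start/stop calls around code regions automatically and relates the readings back to the source, which is the role the instrumentation description file plays in connecting performance data to code regions.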

Published in
SC '01: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing
November 2001, 756 pages
ISBN: 1-58113-293-X
DOI: 10.1145/582034
Copyright © 2001 ACM

Publisher
Association for Computing Machinery, New York, NY, United States

                  Acceptance Rates

SC '01 paper acceptance rate: 60 of 240 submissions (25%). Overall acceptance rate: 1,516 of 6,373 submissions (24%).
