Article

Extracting SMP parallelism for dense linear algebra algorithms from high-level specifications

Authors:

Tze Meng Low,

Robert A. van de Geijn,

Field G. Van Zee

PPoPP '05: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming

Pages 153 - 163

https://doi.org/10.1145/1065944.1065965

Published: 15 June 2005


Abstract

We show how to exploit high-level information, available as part of the derivation of provably correct algorithms, so that SMP parallelism can be systematically identified. Recent research has shown that loop-based dense linear algebra algorithms can be systematically derived from the mathematical specification of the operation. Fundamental to the methodology is the determination of loop invariants (in the sense of Dijkstra and Hoare), from which correct loops can be systematically derived. We show how the high-level specification of the operation, together with these loop invariants, can be exploited to detect the independence of loop iterations. This in turn allows a workqueuing model to be used to implement and parallelize the algorithms using task queues, a feature proposed for OpenMP 3.0. Although performance is not the main focus of this paper, performance is reported on a 4-CPU Itanium2 server for a concrete example, the symmetric rank-k update operation.
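
To make the idea of independent loop iterations concrete, here is a minimal sketch. It is not the authors' code. It assumes the lower-triangular form of the symmetric rank-k update, C := C + A A^T, with the rows of C split into panels. Each loop iteration writes only its own row panel of C and only reads A, so the iterations can be placed on a work queue and run in any order. The paper targets OpenMP task queues; in this sketch a Python thread pool stands in for the work queue, and the names `syrk_lower_panel`, `syrk_lower_tasks`, `block`, and `workers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def syrk_lower_panel(C, A, start, stop):
    """Update rows start..stop-1 of the lower triangle of C.

    For each i in the panel and each j <= i: C[i][j] += sum_p A[i][p] * A[j][p].
    Only rows start..stop-1 of C are written; A is read-only.
    """
    k = len(A[0])
    for i in range(start, stop):
        for j in range(i + 1):
            C[i][j] += sum(A[i][p] * A[j][p] for p in range(k))


def syrk_lower_tasks(C, A, block=2, workers=4):
    """Lower-triangular C := C + A A^T, one queued task per row panel.

    Assumption: the thread pool plays the role of the work queue.
    The tasks write disjoint rows of C, so no locking is needed.
    """
    n = len(A)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(syrk_lower_panel, C, A, s, min(s + block, n))
            for s in range(0, n, block)
        ]
        for f in futures:
            f.result()  # wait for the task and re-raise any exception
    return C
```

The sketch shows only the structure of the parallelization: the loop body becomes a task, and independence of the iterations is what makes queuing safe. It makes no performance claim, since CPython threads do not run this arithmetic in parallel.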



Index Terms

Extracting SMP parallelism for dense linear algebra algorithms from high-level specifications
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
  2. Software organization and properties
    1. Software system structures
      1. Software architectures


Published In

PPoPP '05: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming

June 2005

310 pages

ISBN:1595930809

DOI:10.1145/1065944

General Chair: Keshav Pingali (Cornell University)

Program Chairs: Katherine Yelick (University of California, Berkeley and LBNL), Andrew Grimshaw (University of Virginia)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

PPoPP05

Sponsor:

PPoPP05: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2005

June 15 - 17, 2005

Chicago, IL, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%


