research-article

Performance and productivity of parallel python programming: a study with a CFD test case

Authors:
Achim Basermann, Melven Röhrig-Zöllner, Joachim Illmer

German Aerospace Center (DLR), Simulation and Software Technology, Linder Höhe, Cologne, Germany

PyHPC '15: Proceedings of the 5th Workshop on Python for High-Performance and Scientific Computing, November 2015, Article No.: 2, Pages 1–10. https://doi.org/10.1145/2835857.2835859

Published: 15 November 2015

ABSTRACT

The programming language Python is widely used to rapidly create compact software. However, its low performance compared to low-level programming languages such as C or Fortran prevents its use for HPC applications. Efficient parallel programming of multi-core systems and graphics cards is generally a complex task. Python with add-ons might provide a simple approach to programming those systems. This paper evaluates the performance of Python implementations with different libraries and compares it to implementations in C or Fortran. As a test case from the field of computational fluid dynamics (CFD), a part of a rotor simulation code was selected. Fortran versions of this code were available for use on single-core, multi-core and graphics-card systems. For all these computer systems, multiple compact versions of the code were implemented in Python with different libraries. For performance analysis of the rotor simulation kernel, a performance model was developed. This model was then employed to assess the performance reached with the different implementations. Performance tests showed that an implementation with Python syntax is six times slower than Fortran on single-core systems. The performance on multi-core systems and graphics cards is about a tenth of that of the Fortran implementations. A higher performance was achieved by a hybrid implementation in C and Python using Cython. The latter reached about half of the performance of the Fortran implementation.
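
The abstract does not say what form the performance model takes. As a rough illustration only, the sketch below assumes a simple roofline-style bound, in which attainable performance is limited either by peak floating-point throughput or by memory bandwidth times arithmetic intensity. The function names and all hardware numbers are hypothetical placeholders, not values from the paper.

```python
def roofline_bound(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Upper bound on attainable performance in GFLOP/s.

    Assumption: a plain roofline-style model. The kernel is limited either
    by the compute peak or by memory traffic (bandwidth * intensity).
    """
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def fraction_of_bound(measured_gflops, peak_gflops, bandwidth_gbs,
                      intensity_flop_per_byte):
    """Share of the modelled bound that a measured run actually reaches."""
    bound = roofline_bound(peak_gflops, bandwidth_gbs, intensity_flop_per_byte)
    return measured_gflops / bound


if __name__ == "__main__":
    # Hypothetical machine: 100 GFLOP/s peak, 25 GB/s memory bandwidth.
    peak, bandwidth = 100.0, 25.0

    # Low intensity: memory-bound, the bandwidth term is the limit.
    print(roofline_bound(peak, bandwidth, 2.0))

    # High intensity: compute-bound, the peak is the limit.
    print(roofline_bound(peak, bandwidth, 8.0))

    # A hypothetical run measured at 25 GFLOP/s on the memory-bound kernel.
    print(fraction_of_bound(25.0, peak, bandwidth, 2.0))
```

Under such a model, ratios like "six times slower" or "about half" can be read against a hardware-derived ceiling as well as against the Fortran baseline, which helps separate language overhead from limits of the machine itself.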

Published in

PyHPC '15: Proceedings of the 5th Workshop on Python for High-Performance and Scientific Computing
November 2015
59 pages
ISBN:9781450340106
DOI:10.1145/2835857

Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
PyHPC '15 paper acceptance rate: 7 of 7 submissions (100%). Overall acceptance rate: 7 of 7 submissions (100%).