DOI: 10.1145/2832105.2832110
research-article

Accelerating the multi-zone scalar pentadiagonal CFD algorithm with OpenACC

Published: 15 November 2015

ABSTRACT

The multi-zone scalar pentadiagonal (SP-MZ) benchmark, part of the multi-zone NAS Parallel Benchmark suite, is ported to graphics processing units (GPUs) using OpenACC compiler directives. The sequence of optimizations necessary to transform the SP-MZ algorithm from CPU-oriented to GPU-oriented is presented. The performance of the OpenACC implementation on GPUs is measured using predefined mesh sizes. We observe a 30% speed-up using the OpenACC implementation on an NVIDIA Kepler K40 GPU compared to an eight-core Intel Xeon E5-2670 CPU with the small Class-A mesh (256 thousand points). Setting inter-zone boundary conditions directly on the device reduced run time by 22%, owing to the high cost of host-device communication. Multi-device benchmarks with the larger Class-C mesh (4.3 million points) were scaled to 32 GPU nodes and matched or outperformed the CPU baseline with ten cores per node. Combining both CPU and GPU computing power improved the throughput on the Class-C mesh by 75%. We define a larger zone size with one million points per node to better reflect modern usage with codes similar to SP-MZ. The OpenACC GPU implementation outperformed the baseline multi-core CPU by 29% on this real-world mesh size.
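The abstract does not include source code, so the following is only a minimal sketch of the general pattern it describes: many independent scalar pentadiagonal systems (one per grid line) distributed across GPU threads with OpenACC directives. It is not the authors' implementation. The function names `penta_solve` and `solve_lines`, the band array names, and the line-contiguous memory layout are all assumptions made for illustration, and the elimination uses no pivoting, so it assumes diagonally dominant systems.

```c
#include <assert.h>

/* Hypothetical helper: solve one pentadiagonal system in place.
   Row i reads: l2[i]*x[i-2] + l1[i]*x[i-1] + d[i]*x[i]
              + u1[i]*x[i+1] + u2[i]*x[i+2] = r[i].
   No pivoting is used, so the matrix is assumed diagonally dominant.
   On return, r holds the solution; l1, d and u1 are overwritten. */
#pragma acc routine seq
static void penta_solve(int n, const double *l2, double *l1, double *d,
                        double *u1, const double *u2, double *r)
{
    /* Forward elimination of the two sub-diagonals. */
    for (int i = 0; i < n - 1; ++i) {
        double m = l1[i + 1] / d[i];
        d[i + 1] -= m * u1[i];
        r[i + 1] -= m * r[i];
        if (i + 2 < n) {
            u1[i + 1] -= m * u2[i];
            double m2 = l2[i + 2] / d[i];
            l1[i + 2] -= m2 * u1[i];
            d[i + 2]  -= m2 * u2[i];
            r[i + 2]  -= m2 * r[i];
        }
    }
    /* Back substitution through the two super-diagonals. */
    r[n - 1] /= d[n - 1];
    r[n - 2] = (r[n - 2] - u1[n - 2] * r[n - 1]) / d[n - 2];
    for (int i = n - 3; i >= 0; --i)
        r[i] = (r[i] - u1[i] * r[i + 1] - u2[i] * r[i + 2]) / d[i];
}

/* Hypothetical driver: nlines independent systems of size n, with line j
   stored contiguously at offset j*n in every array. Each loop iteration
   (one GPU thread under OpenACC) solves one whole line. Without an
   OpenACC compiler the pragma is ignored and the loop runs serially. */
void solve_lines(int nlines, int n, const double *l2, double *l1,
                 double *d, double *u1, const double *u2, double *r)
{
    assert(nlines > 0 && n >= 3);
    #pragma acc parallel loop \
        copyin(l2[0:nlines*n], u2[0:nlines*n]) \
        copy(l1[0:nlines*n], d[0:nlines*n], u1[0:nlines*n], r[0:nlines*n])
    for (int j = 0; j < nlines; ++j) {
        int o = j * n;
        penta_solve(n, l2 + o, l1 + o, d + o, u1 + o, u2 + o, r + o);
    }
}
```

In this sketch the `copy` and `copyin` clauses move every array between host and device on each call. A code that keeps its data resident on the device across time steps would avoid that traffic, which is the kind of host-device communication cost the abstract attributes the 22% run-time reduction to. A line-contiguous layout is also chosen here only for readability; an interleaved layout may suit GPU memory access better.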

      • Published in

        WACCPD '15: Proceedings of the Second Workshop on Accelerator Programming using Directives
        November 2015
        68 pages
        ISBN:9781450340144
        DOI:10.1145/2832105

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        WACCPD '15 Paper Acceptance Rate: 7 of 14 submissions, 50%. Overall Acceptance Rate: 7 of 14 submissions, 50%.
