research-article

Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors

Authors:
Martin Kronbichler

Technical University of Munich, Garching, Germany

Technical University of Munich, Garching, Germany
View Profile

,
Karl Ljungkvist

Uppsala University, Uppsala, Sweden

Uppsala University, Uppsala, Sweden
View Profile

Authors Info & Claims

ACM Transactions on Parallel Computing Volume 6 Issue 1Article No.: 2pp 1–32https://doi.org/10.1145/3322813

Published:07 May 2019Publication History

ACM Transactions on Parallel Computing

Abstract

This article presents matrix-free finite-element techniques for efficiently solving partial differential equations on modern many-core processors, such as graphics cards. We develop a GPU parallelization of a matrix-free geometric multigrid iterative solver targeting moderate and high polynomial degrees, with support for general curved and adaptively refined hexahedral meshes with hanging nodes. The central algorithmic component is the matrix-free operator evaluation with sum factorization. We compare the node-level performance of our implementation running on an Nvidia Pascal P100 GPU to a highly optimized multicore implementation running on comparable Intel Broadwell CPUs and an Intel Xeon Phi. Our experiments show that the GPU implementation is approximately 1.5 to 2 times faster across four different scenarios of the Poisson equation and a variety of element degrees in 2D and 3D. The lowest time to solution per degree of freedom is recorded for moderate polynomial degrees between 3 and 5. A detailed performance analysis highlights the capabilities of the GPU architecture and the chosen execution model with threading within the element, particularly with respect to the evaluation of the matrix-vector product. Atomic intrinsics are shown to provide a fast way for avoiding the possible race conditions in summing the elemental residuals into the global vector associated to shared vertices, edges, and surfaces. In addition, the solver infrastructure allows for using mixed-precision arithmetic that performs the multigrid V-cycle in single precision with an outer correction in double precision, increasing throughput by up to 83%.

References

Mark Adams, Marian Brezina, Jonathan Hu, and Ray Tuminaro. 2003. Parallel multigrid smoothing: Polynomial versus Gauss--Seidel. Journal of Computational Physics 188 (2003), 593--610. Google ScholarDigital Library
Mark Adams, Jed Brown, John Shalf, Brian Van Straalen, Erich Strohmaier, and Samuel Williams. 2019. High-performance geometric multigrid. Retrieved January 18, 2019 from https://hpgmg.org/.Google Scholar
Mark Adams, Phillip Colella, Daniel T. Graves, Jeff N. Johnson, Hans S. Johansen, Noel D. Keen, Terry J. Ligocki, et al. 2015. Chombo Software Package for AMR Applications Design Document. Technical Report. Lawrence Berkeley National Laboratory. https://crd.lbl.gov/assets/pubs_presos/chomboDesign.pdf.Google Scholar
Mark F. Adams, Ravi Samtaney, and Achi Brandt. 2010. Toward textbook multigrid efficiency for fully implicit resistive magnetohydrodynamics. Journal of Computational Physics 229, 18 (2010), 6208--6219. Google ScholarDigital Library
Giovanni Alzetta, Daniel Arndt, Wolfgang Bangerth, Vishal Boddu, Benjamin Brands, Denis Davydov, Rene Gassmöller, et al. 2018. The deal.II library, version 9.0. Journal of Numerical Mathematics 26, 4 (2018), 173--184.Google ScholarCross Ref
Wolfgang Bangerth, Carsten Burstedde, Timo Heister, and Martin Kronbichler. 2011. Algorithms and data structures for massively parallel generic finite element codes. ACM Transactions on Mathematical Software 38, 2 (2011), Article 14. Google ScholarDigital Library
Marsha J. Berger, Michael J. Aftosmis, and Gedas Adomavicius. 2001. Parallel multigrid on Cartesian meshes with complex geometry. In Parallel Computational Fluid Dynamics. North Holland, Amsterdam, 283--290.Google Scholar
Achi Brandt. 1977. Multi-level adaptive solutions to boundary-value problems. Mathematics of Computation 31 (1977), 333--390.Google ScholarCross Ref
Jed Brown. 2010. Efficient nonlinear solvers for nodal high-order finite elements in 3D. Journal of Scientific Computing 45, 1--3 (2010), 48--63. Google ScholarDigital Library
Jed Brown, Veselin Dobrev, Som Dutta, Paul Fischer, Kazem Kamran, Tzanio Kolev, David Medina, et al. 2018. ECP Milestone Report: Propose High-Order Mesh/Data Format, WBS 2.2.6.06, Milestone CEED-MS18. Technical Report. U.S. Department of Energy. https://ceed.exascaleproject.org/docs/ceed-ms18-report.pdf.Google Scholar
Jed Brown, Barry Smith, and Aron Ahmadia. 2013. Achieving textbook multigrid efficiency for hydrostatic ice sheet flow. SIAM Journal on Scientific Computing 35, 2 (2013), B359--B375.Google ScholarCross Ref
CEED. 2018. CEED Bake-Off Problems (Benchmarks). Retrieved April 8, 2019 from http://ceed.exascaleproject.org/bps/.Google Scholar
Justin Chang, Maurice S. Fabien, Matthew G. Knepley, and Richard T. Mills. 2018. Comparative study of finite element methods using the time-accuracy-size (TAS) spectrum analysis. SIAM Journal on Scientific Computing 40, 6 (2018), C779--C802.Google ScholarCross Ref
Thomas C. Clevenger, Timo Heister, Guido Kanschat, and Martin Kronbichler. 2019. A flexible, parallel, adaptive geometric multigrid method for FEM. arXiv:1904.03317 preprint.Google Scholar
Michel O. Deville, Paul F. Fischer, and Ernest H. Mund. 2002. High-Order Methods for Incompressible Fluid Flow. Vol. 9. Cambridge University Press.Google Scholar
Maurice S. Fabien, Matthew G. Knepley, Richard T. Mills, and Beatrice M. Rivière. 2019. Manycore parallel computing for a hybridizable discontinuous Galerkin nested multigrid method. SIAM Journal on Scientific Computing 41, 2 (2019), C73-C96.Google ScholarCross Ref
Niklas Fehn, Wolfgang A. Wall, and Martin Kronbichler. 2018. Efficiency of high-performance discontinuous Galerkin spectral element methods for under-resolved turbulent incompressible flows. International Journal for Numerical Methods in Fluids 88, 1 (2018), 32--54.Google ScholarCross Ref
Niklas Fehn, Wolfgang A. Wall, and Martin Kronbichler. 2019. A matrix-free high-order discontinuous Galerkin compressible Navier--Stokes solver: A performance comparison of compressible and incompressible formulations for turbulent incompressible flows. International Journal for Numerical Methods in Fluids 89, 3 (2019), 71--102.Google ScholarCross Ref
Chunsheng Feng, Shi Shu, Jinchao Xu, and Chen-Song Zhang. 2014. Numerical study of geometric multigrid methods on CPU-GPU heterogeneous computers. Advances in Applied Mathematics and Mechanics 6, 1 (2014), 1--23.Google ScholarCross Ref
Krzysztof J. Fidkowski, Todd A. Oliver, James Lu, and David L. Darmofal. 2005. p-multigrid solver of high-order discontinuous Galerkin discretizations of the compressible Navier--Stokes equations. Journal of Computational Physics 207, 1 (2005), 92--113. Google ScholarDigital Library
Paul F. Fischer, Thilina Rathnayake, Som Dutta, Veselin Dobrev, Tzanio Kolev, Jean-Sylvain Camier, Martin Kronbichler, et al. 2019. Running faster in HPC applications. In Preparation.Google Scholar
Markus Geveler, Dirk Ribbrock, Dominik Göddeke, Peter Zajac, and Stefan Turek. 2013. Towards a complete FEM-based simulation toolkit on GPUs: Unstructured grid finite element geometric multigrid solvers with strong smoothers based on sparse approximate inverses. Computers 8 Fluids 80 (2013), 327--332.Google Scholar
Amir Gholami, Dhairya Malhotra, Hari Sundar, and George Biros. 2016. FFT, FMM, or multigrid? A comparative study of state-of-the-art Poisson solvers for uniform and nonuniform grids in the unit cube. SIAM Journal on Scientific Computing 38, 3 (2016), C280--C306.Google ScholarDigital Library
Bjorn Gmeiner, Ulrich Rüde, Holger Stengel, Christian Waluga, and Barbara Wohlmuth. 2015. Performance and scalability of hierarchical hybrid multigrid solvers for stokes systems. SIAM Journal on Scientific Computing 37, 2 (2015), C143--C168.Google ScholarCross Ref
Dominik Göddeke, Robert Strzodka, and Stefan Turek. 2007. Performance and accuracy of hardware-oriented native-, emulated-and mixed-precision solvers in FEM simulations. International Journal of Parallel Emergent and Distributed Systems 22, 4 (2007), 221--256. Google ScholarDigital Library
William D. Gropp, Dinesh K. Kaushik, David E. Keyes, and Barry F. Smith. 1999. Towards realistic performance bounds for implicit CFD codes. In Proceedings of Parallel CFD’99.Google Scholar
William D. Gropp, Dinesh K. Kaushik, David E. Keyes, and Barry F. Smith. 2000. Performance modeling and tuning of an unstructured mesh CFD application. In SC’00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing. 34. Google ScholarDigital Library
Mark Harris. 2007. Optimizing CUDA. In SC’07: High Performance Computing With CUDA. Nvidia.Google Scholar
Brian Helenbrook, Dimitri J. Mavriplis, and Harold L. Atkins. 2003. Analysis of p-multigrid for continuous and discontinuous finite element discretizations. In Proceedings of the 16th AIAA Computational Fluid Dynamics Conference.Google Scholar
Michael A. Heroux, Roscoe A. Bartlett, Vicki E. Howle, Robert J. Hoekstra, Jonathan J. Hu, Tamara G. Kolda, Richard B. Lehoucq, et al. 2005. An overview of the Trilinos project. ACM Transactions on Mathematical Software 31, 3 (2005), 397--423. Google ScholarDigital Library
Bärbel Janssen and Guido Kanschat. 2011. Adaptive multilevel methods with local smoothing for H1-and Hcurl-conforming high order finite element methods. SIAM Journal on Scientific Computing 33, 4 (2011), 2095--2114. Google ScholarDigital Library
James Jeffers, James Reinders, and Avinash Sodani. 2016. Intel Xeon Phi Processor High Performance Programming, Knights Landing Edition. Morgan Kaufmann, Cambridge, MA. Google ScholarDigital Library
Zhe Jia, Marco Maggioni, Benjamin Staiger, and Daniele P. Scarpazza. 2018. Dissecting the Volta GPU Architecture Through Microbenchmarking: GTC 2018. Retrieved April 8, 2019 from http://on-demand.gputechconf.com/gtc/2018/presentation/s8122-dissecting-the-volta-gpu-architecture-through-microbenchmarking.pdf.Google Scholar
Alison C. Jones and Peter K. Jimack. 2005. An adaptive multigrid tool for elliptic and parabolic systems. International Journal for Numerical Methods in Fluids 47 (2005), 1123--1128.Google ScholarCross Ref
Guido Kanschat. 2004. Multilevel methods for discontinuous Galerkin FEM on locally refined meshes. Computers 8 Structures 82, 28 (2004), 2437--2445.Google Scholar
George E. Karniadakis and Spencer J. Sherwin. 2013. Spectral/HP Element Methods for Computational Fluid Dynamics. Oxford University Press.Google Scholar
Donald W. Kelly, Jorge P. De Sousa Rodrigues Gago, Olgierd C. Zienkiewicz, and Ivo Babuška. 1983. A posteriori error analysis and adaptive processes in the finite element method: Part I—Error analysis. International Journal for Numerical Methods in Engineering 19 (1983), 1593--1619.Google ScholarCross Ref
Andreas Klöckner. 2014. Loo.py: Transformation-based code generation for GPUs and CPUs. In Proceedings of ARRAY’14: ACM SIGPLAN Workshop on Libraries, Languages, and Compilers for Array Programming. Google ScholarDigital Library
Andreas Klöckner, Tim Warburton, Jeffrey Bridge, and Jan S. Hesthaven. 2009. Nodal discontinuous Galerkin methods on graphics processors. Journal of Computational Physics 228, 21 (2009), 7863--7882. Google ScholarDigital Library
Matthew G. Knepley and Andy R. Terrel. 2013. Finite element integration on GPUs. ACM Transactions on Mathematical Software 39, 2 (Feb. 2013), Article 10, 13 pages. Google ScholarDigital Library
Dimitri Komatitsch, Gordon Erlebacher, Dominik Göddeke, and David Michéa. 2010. High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster. Journal of Computational Physics 229, 20 (2010), 7692--7714. Google ScholarDigital Library
David A. Kopriva. 2009. Implementing Spectral Methods for Partial Differential Equations: Algorithms for Scientists and Engineers. Springer. Google Scholar
Katharina Kormann and Martin Kronbichler. 2011. Parallel finite element operator application: Graph partitioning and coloring. In Proceedings of the 7th IEEE International Conference on eScience. 332--339. Google ScholarDigital Library
Moritz Kreutzer, Georg Hager, Gerhard Wellein, Holger Fehske, and Alan R. Bishop. 2014. A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing 36, 5 (2014), C401--C423.Google ScholarDigital Library
Martin Kronbichler and Katharina Kormann. 2012. A generic interface for parallel cell-based finite element operator application. Computers 8 Fluids 63 (2012), 135--147.Google Scholar
Martin Kronbichler and Katharina Kormann. 2019. Fast matrix-free evaluation of discontinuous Galerkin finite element operators. ACM Transactions on Mathematical Software, in press (2019).Google Scholar
Martin Kronbichler, Katharina Kormann, Igor Pasichnyk, and Momme Allalen. 2017. Fast matrix-free discontinuous Galerkin kernels on modern computer architectures. In ISC High Performance. Lecture Notes in Computer Science, Vol. 10266. Springer, 237--255.Google Scholar
Martin Kronbichler and Wolfgang A. Wall. 2018. A performance comparison of continuous and discontinuous Galerkin methods with fast multigrid solvers. SIAM Journal on Scientific Computing 40, 5 (2018), A3423--A3448.Google ScholarDigital Library
Karl Ljungkvist. 2014. Matrix-free finite-element operator application on graphics processing units. In Euro-Par 2014: Parallel Processing Workshops. Lecture Notes in Computer Science, Vol. 8806. Springer, 450--461.Google Scholar
Karl Ljungkvist. 2017. Matrix-free finite-element computations on graphics processors with adaptively refined unstructured meshes. In HPC’17: Proceedings of the 25th High Performance Computing Symposium. Google ScholarDigital Library
James W. Lottes and Paul F. Fischer. 2005. Hybrid multigrid-Schwarz algorithms for the spectral element method. Journal of Scientific Computing 24 (2005), 613--646.Google ScholarCross Ref
Hong Luo, Joseph D. Baum, and Rainald Löhner. 2006. A p-multigrid discontinuous Galerkin method for the Euler equations on unstructured grids. Journal of Computational Physics 211 (2006), 767--783. Google ScholarDigital Library
Robert E. Lynch, John R. Rice, and Donald H. Thomas. 1964. Direct solution of partial difference equations by tensor product methods. Numerische Mathematik 6 (1964), 185--199. Google ScholarDigital Library
Dave A. May, Jed Brown, and Laetitia Le Pourhiet. 2014. pTatin3D: High-performance methods for long-term lithospheric dynamics. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’14). 274--284. Google ScholarDigital Library
Stephen McCormick and James Thomas. 1986. The fast adaptive composite grid (FAC) method for elliptic equations. Mathematics of Computation 46, 174 (1986), 439--456. Google ScholarDigital Library
Eike Müller, Xu Guo, Robert Scheichl, and Sinan Shi. 2013. Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs. Computing and Visualization in Science 16, 2 (2013), 41--58. Google ScholarDigital Library
Nvidia Corporation. 2013. CUDA cuSPARSE Library. Version 8.0. Nvidia.Google Scholar
Nvidia Corporation. 2016. GP100 Pascal Whitepaper. Technical Report. Version 1.1. Nvidia.Google Scholar
Nvidia Corporation. 2016. Kepler Tuning Guide. Nvidia.Google Scholar
Steven A. Orszag. 1980. Spectral methods for problems in complex geometries. Journal of Computational Physics 37, 1 (1980), 70--92.Google ScholarCross Ref
Anthony T. Patera. 1984. A spectral element method for fluid dynamics: Laminar flow in a channel expansion. Journal of Computational Physics 54, 3 (1984), 468--488.Google ScholarCross Ref
Yehuda Pinchover and Jacob Rubinstein. 2005. An Introduction to Partial Differential Equations. Cambridge University Press. Google ScholarDigital Library
Jean-Francois Remacle, Rajesh Gandham, and Timothy Warburton. 2016. GPU accelerated spectral finite elements on all-hex meshes. Journal of Computational Physics 324 (2016), 246--257. Google ScholarDigital Library
Johann Rudi, A. Cristiano I. Malossi, Tobin Isaac, Georg Stadler, Michael Gurnis, Peter W. J. Staar, Yves Ineichen, et al. 2015. An extreme-scale implicit solver for complex PDEs: Highly heterogeneous flow in Earth’s mantle. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’15). 5:1--5:11. Google ScholarDigital Library
Jörg Stiller. 2017. Nonuniformly weighted Schwarz smoothers for spectral element multigrid. Journal of Scientific Computing 72 (2017), 81--96. Google ScholarDigital Library
Hari Sundar, George Biros, Carsten Burstedde, Johann Rudi, Omar Ghattas, and Georg Stadler. 2012. Parallel geometric-algebraic multigrid on unstructured forests of octrees. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’12). Google ScholarDigital Library
Hari Sundar, Georg Stadler, and George Biros. 2015. Comparison of multigrid algorithms for high-order continuous finite element discretizations. Numerical Linear Algebra With Applications 22 (2015), 664--680.Google ScholarCross Ref
Kasia Świrydowicz, Noel Chalmers, Ali Karakus, and Timothy Warburton. 2019. Acceleration of tensor-product operations for high-order finite element methods. International Journal for High Performance Computing Applications. In Press.Google Scholar
James L. Thomas, Boris Diskin, and Achi Brandt. 2003. Textbook multigrid efficiency for fluid simulations. Annual Reviews of Fluid Mechanics 35 (2003), 317--340.Google ScholarCross Ref
Jan Treibig, Georg Hager, and Gerhard Wellein. 2010. LIKWID: A lightweight performance-oriented tool suite for x86 multicore environments. In Proceedings of PSTI’10: The 1st International Workshop on Parallel Software Tools and Tool Infrastructures. https://github.com/RRZE-HPC/likwid. Google ScholarDigital Library
Ulrich Trottenberg, Cornelius Oosterlee, and Anton Schüller. 2001. Multigrid. Elsevier Academic Press, London, UK. Google ScholarDigital Library
Henry M. Tufo and Paul F. Fischer. 1999. Terascale spectral element algorithms and implementations. In Proceedings of the 1999 ACM/IEEE Conference on Supercomputing. ACM, New York, NY. Google ScholarDigital Library
Bruno Turcksin, Martin Kronbichler, and Wolfgang Bangerth. 2016. WorkStream—A design pattern for multicore-enabled finite element computations. ACM Transactions on Mathematical Software 43, 1 (2016), Article 2, 29 pages. Google ScholarDigital Library
Vasily Volkov. 2010. Better performance at lower occupancy. In Proceedings of the GPU Technology Conference (GTC’10), Vol. 10. 16.Google Scholar
Nicholas Wilt. 2013. The CUDA Handbook: A Comprehensive Guide to GPU Programming. Pearson Education.Google Scholar

Index Terms

Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors
1. Mathematics of computing
  1. Mathematical analysis
    1. Differential equations
      1. Partial differential equations
  2. Mathematical software
    1. Mathematical software performance

Recommendations

Optimization of geometric multigrid for emerging multi- and manycore processors
SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Multigrid methods are widely used to accelerate the convergence of iterative solvers for linear systems used in a number of different application areas. In this paper, we explore optimization techniques for geometric multigrid on existing and emerging ...
Read More
A performance study of general-purpose applications on graphics processors using CUDA

Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly programmable, offering the potential for dramatic speedups for a variety of ...
Read More
GPU-based matrix-free finite element solver exploiting symmetry of elemental matrices
Abstract
Matrix-free solvers for finite element method (FEM) avoid assembly of elemental matrices and replace sparse matrix-vector multiplication required in iterative solution method by an element level dense matrix-vector product. In this paper, a novel ...
Read More

Reviews

Reviewer: Amos O Olagunju

Discretization is a method for transforming continuous variables, equations, functions, and models into their discrete equivalents. A multigrid technique uses a hierarchy of discretization to solve elliptic partial differential equations (PDEs). Computing numerical solutions to the Poisson equation on a cube geometry or a nonlinear minimal surface equation requires fast computers and algorithms. So, how should efficient algorithms be designed for a matrix-free geometric multigrid solver on graphics processing units (GPUs) Kronbichler and Ljungkvist concisely review the proposed implementations of matrix-free finite element and multigrid algorithms on GPUs in the literature. Matrix-free finite element algorithms on GPUs have been used to solve specific "time stepping of hyperbolic problems," for example, seismic wave propagation and industrial generic assembly tools. Capitalizing on the well-established algorithms for computing the matrix-free higher-order finite elements on GPUs in the literature, the authors offer new insights into, and alternative algorithms for, the sum-factorization methods for solving the many matrix-free set up problems of alternative multigrid numerical problems. The authors offer unique research contributions to the design and implementation of effective algorithms for Laplace operator evaluation, multigrid transfer operations, and vector operations, as well as for handling constraints. The authors perform several experiments to investigate the effectiveness of multigrid computational ideas on GPUs for solving numerical problems. The experimental results examine impacts related to hardware, setup costs, and the nature of the required numerical solutions for solving alternative multigrid numerical problems. Compared to the existing (and reputable) matrix-free higher-order algorithms, results for the proposed algorithms appear promising, such as the higher-order polynomials on hexahedral meshes. The authors also clearly raise the unresolved issue of efficiency: "Thus, application codes relying on complex solvers need to put focus not solely on the matrix-vector product but also on the whole solver stack including the vector operations, where it is beneficial to optimize for data access." I strongly encourage all numerical analysts to read the paper's insightful ideas and recommend alternative solutions.

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Parallel Computing Volume 6, Issue 1
March 2019
118 pages
ISSN:2329-4949
EISSN:2329-4957
DOI:10.1145/3331062
Editor:
David Bader
Georgia Institute of Technology, USA
Issue’s Table of Contents
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 7 May 2019
- Revised: 1 February 2019
- Accepted: 1 February 2019
- Received: 1 January 2018
Published in topc Volume 6, Issue 1

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
CUDA
Finite element method
geometric multigrid
matrix-free method
sum factorization
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 42
  Total Citations
  View Citations
- 675
  Total Downloads
- Downloads (Last 12 months)107
- Downloads (Last 6 weeks)12
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors

ACM Transactions on Parallel Computing

Abstract

References

Cited By

Index Terms

Recommendations

Optimization of geometric multigrid for emerging multi- and manycore processors

A performance study of general-purpose applications on graphics processors using CUDA

GPU-based matrix-free finite element solver exploiting symmetry of elemental matrices

Reviews

Access critical reviews of Computing literature here