research-article

Algorithms and data structures for massively parallel generic adaptive finite element codes

Authors:

Wolfgang Bangerth,

Carsten Burstedde,

Martin KronbichlerAuthors Info & Claims

ACM Transactions on Mathematical Software (TOMS), Volume 38, Issue 2

Article No.: 14, Pages 1 - 28

https://doi.org/10.1145/2049673.2049678

Published: 05 January 2012 Publication History

Abstract

Today's largest supercomputers have 100,000s of processor cores and offer the potential to solve partial differential equations discretized by billions of unknowns. However, the complexity of scaling to such large machines and problem sizes has so far prevented the emergence of generic software libraries that support such computations, although these would lower the threshold of entry and enable many more applications to benefit from large-scale computing.

We are concerned with providing this functionality for mesh-adaptive finite element computations. We assume the existence of an “oracle” that implements the generation and modification of an adaptive mesh distributed across many processors, and that responds to queries about its structure. Based on querying the oracle, we develop scalable algorithms and data structures for generic finite element methods. Specifically, we consider the parallel distribution of mesh data, global enumeration of degrees of freedom, constraints, and postprocessing. Our algorithms remove the bottlenecks that typically limit large-scale adaptive finite element analyses.

We demonstrate scalability of complete finite element workflows on up to 16,384 processors. An implementation of the proposed algorithms, based on the open source software p4est as mesh oracle, is provided under an open source license through the widely used deal.II finite element software library.

References

[1]

Ainsworth, M. and Oden, J. T. 2000. A Posteriori Error Estimation in Finite Element Analysis. Wiley.

[2]

Balay, S., Buschelman, K., Eijkhout, V., Gropp, W. D., Kaushik, D., Knepley, M. G., McInnes, L. C., Smith, B. F., and Zhang, H. 2008. PETSc users manual. Tech. rep. ANL-95/11 - Revision 3.0.0, Argonne National Laboratory.

[3]

Balay, S., Buschelman, K., Gropp, W. D., Kaushik, D., Knepley, M. G., McInnes, L. C., Smith, B. F., and Zhang, H. 2010. PETSc Web page. http://www.mcs.anl.gov/petsc.

[4]

Bangerth, W., Hartmann, R., and Kanschat, G. 2007. deal.II -- a general purpose object oriented finite element library. ACM Trans. Math. Softw. 33, 4, 24/1--24/27.

Digital Library

[5]

Bangerth, W. and Kanschat, G. 2011. deal.II. Differential Equations Analysis Library, Tech. ref. http://www.dealii.org/.

[6]

Bangerth, W. and Kayser-Herold, O. 2009. Data structures and requirements for hp finite element software. ACM Trans. Math. Softw. 36, 1, 4/1--4/31.

Digital Library

[7]

Bangerth, W. and Rannacher, R. 2003. Adaptive Finite Element Methods for Differential Equations. Birkhäuser Verlag.

[8]

Bank, R. E. 1998. PLTMG: A software package for solving elliptic partial differential equations. Users' guide 8.0. SIAM, Philadelphia.

[9]

Bruaset, A. M. and Langtangen, H. P. 1997. A comprehensive set of tools for solving partial differential equations; DiffPack. In M. Dæhlen and A. Tveito Eds., Numerical Methods and Software Tools in Industrial Mathematics. Birkhäuser, Boston, 61--90.

[10]

Burri, A., Dedner, A., Klöfkorn, R., and Ohlberger, M. 2005. An efficient implementation of an adaptive and parallel grid in DUNE. In Proceedings of the 2nd Russian-German Advanced Research Workshop. Springer, 67--82.

[11]

Burstedde, C., Ghattas, O., Gurnis, M., Isaac, T., Stadler, G., Warburton, T., and Wilcox, L. C. 2010. Extreme-scale AMR. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.

Digital Library

[12]

Burstedde, C., Ghattas, O., Gurnis, M., Tan, E., Tu, T., Stadler, G., Wilcox, L. C., and Zhong, S. 2008a. Scalable adaptive mantle convection simulation on petascale supercomputers. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.

Digital Library

[13]

Burstedde, C., Ghattas, O., Stadler, G., Tu, T., and Wilcox, L. C. 2008b. Towards adaptive mesh PDE simulations on petascale computers. In Proceedings of Teragrid.

[14]

Burstedde, C., Wilcox, L. C., and Ghattas, O. 2011b. p4est: Scalable algorithms for parallel adaptive mesh refinement on forests of octrees. SIAM J. Sc. Comput. 33, 3, 1103--1133.

Digital Library

[15]

Carey, G. F. 1997. Computational Grids: Generation, Adaptation and Solution Strategies. Taylor & Francis.

[16]

Carrington, L., Komatitsch, D., Laurenzano, M., Tikir, M. M., Michéa, D., Goff, N. L., Snavely, A., and Tromp, J. 2008. High-frequency simulations of global seismic wave propagation using SPECFEM3D GLOBE on 62K processors. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.

Digital Library

[17]

Chand, K. K., Diachin, L. F., Li, X., Ollivier-Gooch, C., Seol, E. S., Shephard, M. S., Tautges, T., and Trease, H. 2008. Toward interoperable mesh, geometry and field components for PDE simulation development. Eng. Comput. 24, 165--182.

Digital Library

[18]

Devine, K. D., Flaherty, J. E., Wheat, S. R., and Maccabe, A. B. 1993. A massively parallel adaptive finite element method with dynamic load balancing. In Proceedings of the ACM/IEEE Conference on Supercomputing. ACM. 2--11.

Digital Library

[19]

Falgout, R. D., Jones, J. E., and Yang, U. M. 2005. Pursuing scalability for hypre's conceptual interfaces. ACM Trans. Math. Softw. 31, 326--350.

Digital Library

[20]

Falgout, R. D., Jones, J. E., and Yang, U. M. 2006. The design and implementation of hypre, a library of parallel high performance preconditioners. In T. J. Barth, M. Griebel, D. E. Keyes, R. M. Nieminen, D. Roose, T. Schlick, A. M. Bruaset, and A. Tveito Eds., vol. 51, Numerical Solution of Partial Differential Equations on Parallel Computers, Lecture Notes in Computational Science and Engineering. Springer, 267--294.

[21]

Gee, M. W., Siefert, C. M., Hu, J. J., Tuminaro, R. S., and Sala, M. G. 2006. ML 5.0 smoothed aggregation user's guide. Tech. rep. SAND2006-2649, Sandia National Laboratories.

[22]

Geenen, T., ur Rehman, M., MacLachlan, S. P., Segal, G., Vuik, C., van den Berg, A. P., and Spakman, W. 2009. Scalable robust solvers for unstructured fe geodynamic modeling applications: Solving the Stokes equation for models with large localized viscosity contrasts. Geoch. Geoph. Geosyst. 10, Q09002/1--12.

[23]

Henning, J. L. 2007. Performance counters and development of spec cpu2006. ACM SIGARCH Computer Architecture News 35, 118--121.

Digital Library

[24]

Heroux, M. A. et al. 2011. Trilinos web page. http://trilinos.sandia.gov.

[25]

Heroux, M. A., Bartlett, R. A., Howle, V. E., Hoekstra, R. J., Hu, J. J., Kolda, T. G., Lehoucq, R. B., Long, K. R., Pawlowski, R. P., Phipps, E. T., Salinger, A. G., Thornquist, H. K., Tuminaro, R. S., Willenbring, J. M., Williams, A., and Stanley, K. S. 2005. An overview of the Trilinos project. ACM Trans. Math. Softw. 31, 397--423.

Digital Library

[26]

Kanschat, G. and Rivière, B. 2010. A strongly conservative finite element method for the coupling of Stokes and Darcy flow. J. Comp. Phys. 229, 5933--5943.

Digital Library

[27]

Kirk, B., Peterson, J. W., Stogner, R. H., and Carey, G. F. 2006. libMesh: A C++ Library for Parallel Adaptive Mesh Refinement/Coarsening Simulations. Eng. Comput. 22, 3--4, 237--254.

Digital Library

[28]

Knepley, M. G. and Karpeev, D. A. 2009. Mesh algorithms for PDE with Sieve I: Mesh distribution. Sci. Prog. 17, 215--230.

Digital Library

[29]

Langtangen, H. P. 2003. Computational Partial Differential Equations: Numerical Methods and Diffpack Programming. Texts in Computational Science and Engineering. Springer Verlag.

Digital Library

[30]

Logg, A. 2007. Automating the finite element method. Arch. Comput. Meth. Eng. 14, 2, 93--138.

[31]

Logg, A. 2009. Efficient representation of computational meshes. Int. J. Comput. Sc. Engrg. 4, 4, 283--295.

Digital Library

[32]

Logg, A. and Wells, G. 2010. DOLFIN: Automated finite element computing. ACM Trans. Math. Softw. 37, 2, 1--28.

Digital Library

[33]

Mathur, K. K., Johan, Z., Johnsson, S. L., and Hughes, T. J. R. 1993. Massively parallel computing: Unstructures finite element simulations. Tech. rep. TR-08-93, Center for Research in Computing Technology, Harvard University.

[34]

McKenzie, D. P., Roberts, J. M., and Weiss, N. O. 1974. Convection in the Earth's mantle: Towards a numerical solution. J. Fluid Mech. 62, 465--538.

[35]

Message Passing Interface Forum 2009. MPI: A message-passing interface standard (version 2.2). Tech. rep., http://www.mpi-forum.org/.

[36]

Morton, G. M. 1966. A computer oriented geodetic data base; and a new technique in file sequencing. Tech. rep. IBM Ltd.

[37]

Ollivier-Gooch, C., Diachin, L., Shephard, M. S., Tautges, T., Kraftcheck, J., Leung, V., Luo, X., and Miller, M. 2010. An interoperable, data-structure-neutral component for mesh query and manipulation. ACM Trans. Math. Softw. 37, 29/1--28.

Digital Library

[38]

Patzák, B. and Bittnar, Z. 2001. Design of object oriented finite element code. Adv. Eng. Software 32, 10--11, 759--767.

Digital Library

[39]

Reinders, J. 2007. Intel Threading Building Blocks. O'Reilly.

Digital Library

[40]

Renard, Y. and Pommier, J. 2006. Getfem++. Technical rep., INSA Toulouse. http://www-gmm.insa-toulouse.fr/getfem/.

[41]

Rheinboldt, W. C. and Mesztenyi, C. K. 1980. On a data structure for adaptive finite element mesh refinements. ACM Trans. Math. Softw. 6, 166--187.

Digital Library

[42]

Schubert, G., Turcotte, D. L., and Olson, P. 2001. Mantle Convection in the Earth and Planets, Part 1. Cambridge University Press.

[43]

Silvester, D. and Wathen, A. 1994. Fast iterative solution of stabilised Stokes systems. Part II: Using general block preconditioners. SIAM J. Numer. Anal. 31, 1352--1367.

Digital Library

[44]

Šolín, P., Červený, J., and Doležel, I. 2008. Arbitrary-level hanging nodes and automatic adaptivity in the hp-FEM. Math. Comput. Sim. 77, 117--132.

Digital Library

[45]

Šolín, P., Segeth, K., and Doležel, I. 2003. Higher-Order Finite Element Methods. Chapman & Hall/CRC.

[46]

Stroustrup, B. 1997. The C++ Programming Language 3rd Ed. Addison-Wesley.

Digital Library

[47]

Sundar, H., Sampath, R., and Biros, G. 2008. Bottom-up construction and 2:1 balance refinement of linear octrees in parallel. SIAM J. Sci. Comput. 30, 5, 2675--2708.

Digital Library

[48]

Tan, E., Gurnis, M., Armendariz, L., Strand, L., and Kientz, S. 2008. Citcoms user manual version 3.0.1.

[49]

Tezduyar, T. E., Aliabadi, S. K., Behr, M., and Mittal, S. 1994. Massively parallel finite element simulation of compressible and incompressible flows. Comp. Meth. Appl. Mech. Eng. 119, 157--177.

[50]

Tikhonova, A., Tanase, G., Tkachyshyn, O., Amato, N. M., and Rauchwerger, L. 2005. Parallel algorithms in STAPL: Sorting and the selection problem. Tech. rep. TR05-005, Parasol Lab, Department of Computer Science, Texas A&M University.

[51]

Tu, T., O'Hallaron, D. R., and Ghattas, O. 2005. Scalable parallel octree meshing for terascale applications. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.

Digital Library

[52]

Verfürth, R. 1994. A posteriori error estimation and adaptive mesh-refinement techniques. J. Comput. Appl. Math. 50, 67--83.

Digital Library

Cited By

Shi LZhang CWen C(2025)Adaptive mesh refinement algorithm for CESE schemes on unstructured quadrilateral meshesComputer Physics Communications10.1016/j.cpc.2025.109565311(109565)Online publication date: Jun-2025
https://doi.org/10.1016/j.cpc.2025.109565
Africa PArndt DBangerth WBlais BFehling MGassmöller RHeister THeltai LKinnewig SKronbichler MMaier MMunch PSchreter-Fleischhacker MThiele JTurcksin BWells DYushutin V(2024)The deal.II library, Version 9.6Journal of Numerical Mathematics10.1515/jnma-2024-013732:4(369-380)Online publication date: 26-Nov-2024
https://doi.org/10.1515/jnma-2024-0137
Qin CLi HLi HZhao N(2024)High-order Finite Element Forward Modeling Algorithm for Three-dimensional Borehole-to-surface Electromagnetic Method using Octree MeshesGEOPHYSICS10.1190/geo2023-0365.1(1-65)Online publication date: 1-Apr-2024
https://doi.org/10.1190/geo2023-0365.1
Show More Cited By

Index Terms

Algorithms and data structures for massively parallel generic adaptive finite element codes
1. Mathematics of computing
  1. Mathematical analysis
    1. Numerical analysis
      1. Computations in finite fields
  2. Mathematical software

Recommendations

Data structures and requirements for hp finite element software

Finite element methods approximate solutions of partial differential equations by restricting the problem to a finite dimensional function space. In hp adaptive finite element methods, one defines these discrete spaces by choosing different polynomial ...
Massively parallel finite element programming
EuroMPI'10: Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface

Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in ...
Parallel, fully automatic hp-adaptive 3D finite element package

The paper presents a description of par3Dhp--a 3D, parallel, fully automatic hp-adaptive finite element code for elliptic and Maxwell problems. The parallel implementation is an extension of the sequential code 3Dhp90, which generates, in a fully ...

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Mathematical Software

ACM Transactions on Mathematical Software Volume 38, Issue 2

December 2011

136 pages

ISSN:0098-3500

EISSN:1557-7295

DOI:10.1145/2049673

Issue’s Table of Contents

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 January 2012

Accepted: 01 June 2011

Revised: 01 January 2011

Received: 01 September 2010

Published in TOMS Volume 38, Issue 2

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Funding Sources

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

143
Total Citations
View Citations
1,380
Total Downloads

Downloads (Last 12 months)77
Downloads (Last 6 weeks)9

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Shi LZhang CWen C(2025)Adaptive mesh refinement algorithm for CESE schemes on unstructured quadrilateral meshesComputer Physics Communications10.1016/j.cpc.2025.109565311(109565)Online publication date: Jun-2025
https://doi.org/10.1016/j.cpc.2025.109565
Africa PArndt DBangerth WBlais BFehling MGassmöller RHeister THeltai LKinnewig SKronbichler MMaier MMunch PSchreter-Fleischhacker MThiele JTurcksin BWells DYushutin V(2024)The deal.II library, Version 9.6Journal of Numerical Mathematics10.1515/jnma-2024-013732:4(369-380)Online publication date: 26-Nov-2024
https://doi.org/10.1515/jnma-2024-0137
Qin CLi HLi HZhao N(2024)High-order Finite Element Forward Modeling Algorithm for Three-dimensional Borehole-to-surface Electromagnetic Method using Octree MeshesGEOPHYSICS10.1190/geo2023-0365.1(1-65)Online publication date: 1-Apr-2024
https://doi.org/10.1190/geo2023-0365.1
Munch PKronbichler M(2024)Cache-optimized and low-overhead implementations of additive Schwarz methods for high-order FEM multigrid computationsInternational Journal of High Performance Computing Applications10.1177/1094342023121722138:3(192-209)Online publication date: 1-May-2024
https://dl.acm.org/doi/10.1177/10943420231217221
Kirilin MBurstedde C(2024)Alternative Quadrant Representations with Morton Index and AVX2 Vectorization for AMR Algorithms within the p4est Software Library2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW63119.2024.00071(301-310)Online publication date: 27-May-2024
https://doi.org/10.1109/IPDPSW63119.2024.00071
Samaniego CHouzeaux G(2024)A formalization of parallel data exchange algorithms used by numerical methods for solving partial differential equationsTheoretical Computer Science10.1016/j.tcs.2024.114912(114912)Online publication date: Oct-2024
https://doi.org/10.1016/j.tcs.2024.114912
Munch PIvannikov VCyron CKronbichler M(2024)On the construction of an efficient finite-element solver for phase-field simulations of many-particle solid-state-sintering processesComputational Materials Science10.1016/j.commatsci.2023.112589231(112589)Online publication date: Jan-2024
https://doi.org/10.1016/j.commatsci.2023.112589
Krysl P(2024)Parallel assembly of finite element matrices on multicore computersComputer Methods in Applied Mechanics and Engineering10.1016/j.cma.2024.117076428(117076)Online publication date: Aug-2024
https://doi.org/10.1016/j.cma.2024.117076
Proell SMunch PKronbichler MWall WMeier C(2024)A highly efficient computational approach for fast scan-resolved simulations of metal additive manufacturing processes on the scale of real partsAdditive Manufacturing10.1016/j.addma.2023.10392179(103921)Online publication date: Jan-2024
https://doi.org/10.1016/j.addma.2023.103921
Yang JYang WQi RTsai QLin SDong FLi KLi K(2024)Parallel algorithm design and optimization of geodynamic numerical simulation application on the Tianhe new-generation high-performance computerThe Journal of Supercomputing10.1007/s11227-023-05469-980:1(331-362)Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1007/s11227-023-05469-9
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Issue’s Table of Contents