skip to main content
research-article

Algorithms and data structures for massively parallel generic adaptive finite element codes

Published: 05 January 2012 Publication History

Abstract

Today's largest supercomputers have 100,000s of processor cores and offer the potential to solve partial differential equations discretized by billions of unknowns. However, the complexity of scaling to such large machines and problem sizes has so far prevented the emergence of generic software libraries that support such computations, although these would lower the threshold of entry and enable many more applications to benefit from large-scale computing.
We are concerned with providing this functionality for mesh-adaptive finite element computations. We assume the existence of an “oracle” that implements the generation and modification of an adaptive mesh distributed across many processors, and that responds to queries about its structure. Based on querying the oracle, we develop scalable algorithms and data structures for generic finite element methods. Specifically, we consider the parallel distribution of mesh data, global enumeration of degrees of freedom, constraints, and postprocessing. Our algorithms remove the bottlenecks that typically limit large-scale adaptive finite element analyses.
We demonstrate scalability of complete finite element workflows on up to 16,384 processors. An implementation of the proposed algorithms, based on the open source software p4est as mesh oracle, is provided under an open source license through the widely used deal.II finite element software library.

References

[1]
Ainsworth, M. and Oden, J. T. 2000. A Posteriori Error Estimation in Finite Element Analysis. Wiley.
[2]
Balay, S., Buschelman, K., Eijkhout, V., Gropp, W. D., Kaushik, D., Knepley, M. G., McInnes, L. C., Smith, B. F., and Zhang, H. 2008. PETSc users manual. Tech. rep. ANL-95/11 - Revision 3.0.0, Argonne National Laboratory.
[3]
Balay, S., Buschelman, K., Gropp, W. D., Kaushik, D., Knepley, M. G., McInnes, L. C., Smith, B. F., and Zhang, H. 2010. PETSc Web page. http://www.mcs.anl.gov/petsc.
[4]
Bangerth, W., Hartmann, R., and Kanschat, G. 2007. deal.II -- a general purpose object oriented finite element library. ACM Trans. Math. Softw. 33, 4, 24/1--24/27.
[5]
Bangerth, W. and Kanschat, G. 2011. deal.II. Differential Equations Analysis Library, Tech. ref. http://www.dealii.org/.
[6]
Bangerth, W. and Kayser-Herold, O. 2009. Data structures and requirements for hp finite element software. ACM Trans. Math. Softw. 36, 1, 4/1--4/31.
[7]
Bangerth, W. and Rannacher, R. 2003. Adaptive Finite Element Methods for Differential Equations. Birkhäuser Verlag.
[8]
Bank, R. E. 1998. PLTMG: A software package for solving elliptic partial differential equations. Users' guide 8.0. SIAM, Philadelphia.
[9]
Bruaset, A. M. and Langtangen, H. P. 1997. A comprehensive set of tools for solving partial differential equations; DiffPack. In M. Dæhlen and A. Tveito Eds., Numerical Methods and Software Tools in Industrial Mathematics. Birkhäuser, Boston, 61--90.
[10]
Burri, A., Dedner, A., Klöfkorn, R., and Ohlberger, M. 2005. An efficient implementation of an adaptive and parallel grid in DUNE. In Proceedings of the 2nd Russian-German Advanced Research Workshop. Springer, 67--82.
[11]
Burstedde, C., Ghattas, O., Gurnis, M., Isaac, T., Stadler, G., Warburton, T., and Wilcox, L. C. 2010. Extreme-scale AMR. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.
[12]
Burstedde, C., Ghattas, O., Gurnis, M., Tan, E., Tu, T., Stadler, G., Wilcox, L. C., and Zhong, S. 2008a. Scalable adaptive mantle convection simulation on petascale supercomputers. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.
[13]
Burstedde, C., Ghattas, O., Stadler, G., Tu, T., and Wilcox, L. C. 2008b. Towards adaptive mesh PDE simulations on petascale computers. In Proceedings of Teragrid.
[14]
Burstedde, C., Wilcox, L. C., and Ghattas, O. 2011b. p4est: Scalable algorithms for parallel adaptive mesh refinement on forests of octrees. SIAM J. Sc. Comput. 33, 3, 1103--1133.
[15]
Carey, G. F. 1997. Computational Grids: Generation, Adaptation and Solution Strategies. Taylor & Francis.
[16]
Carrington, L., Komatitsch, D., Laurenzano, M., Tikir, M. M., Michéa, D., Goff, N. L., Snavely, A., and Tromp, J. 2008. High-frequency simulations of global seismic wave propagation using SPECFEM3D GLOBE on 62K processors. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.
[17]
Chand, K. K., Diachin, L. F., Li, X., Ollivier-Gooch, C., Seol, E. S., Shephard, M. S., Tautges, T., and Trease, H. 2008. Toward interoperable mesh, geometry and field components for PDE simulation development. Eng. Comput. 24, 165--182.
[18]
Devine, K. D., Flaherty, J. E., Wheat, S. R., and Maccabe, A. B. 1993. A massively parallel adaptive finite element method with dynamic load balancing. In Proceedings of the ACM/IEEE Conference on Supercomputing. ACM. 2--11.
[19]
Falgout, R. D., Jones, J. E., and Yang, U. M. 2005. Pursuing scalability for hypre's conceptual interfaces. ACM Trans. Math. Softw. 31, 326--350.
[20]
Falgout, R. D., Jones, J. E., and Yang, U. M. 2006. The design and implementation of hypre, a library of parallel high performance preconditioners. In T. J. Barth, M. Griebel, D. E. Keyes, R. M. Nieminen, D. Roose, T. Schlick, A. M. Bruaset, and A. Tveito Eds., vol. 51, Numerical Solution of Partial Differential Equations on Parallel Computers, Lecture Notes in Computational Science and Engineering. Springer, 267--294.
[21]
Gee, M. W., Siefert, C. M., Hu, J. J., Tuminaro, R. S., and Sala, M. G. 2006. ML 5.0 smoothed aggregation user's guide. Tech. rep. SAND2006-2649, Sandia National Laboratories.
[22]
Geenen, T., ur Rehman, M., MacLachlan, S. P., Segal, G., Vuik, C., van den Berg, A. P., and Spakman, W. 2009. Scalable robust solvers for unstructured fe geodynamic modeling applications: Solving the Stokes equation for models with large localized viscosity contrasts. Geoch. Geoph. Geosyst. 10, Q09002/1--12.
[23]
Henning, J. L. 2007. Performance counters and development of spec cpu2006. ACM SIGARCH Computer Architecture News 35, 118--121.
[24]
Heroux, M. A. et al. 2011. Trilinos web page. http://trilinos.sandia.gov.
[25]
Heroux, M. A., Bartlett, R. A., Howle, V. E., Hoekstra, R. J., Hu, J. J., Kolda, T. G., Lehoucq, R. B., Long, K. R., Pawlowski, R. P., Phipps, E. T., Salinger, A. G., Thornquist, H. K., Tuminaro, R. S., Willenbring, J. M., Williams, A., and Stanley, K. S. 2005. An overview of the Trilinos project. ACM Trans. Math. Softw. 31, 397--423.
[26]
Kanschat, G. and Rivière, B. 2010. A strongly conservative finite element method for the coupling of Stokes and Darcy flow. J. Comp. Phys. 229, 5933--5943.
[27]
Kirk, B., Peterson, J. W., Stogner, R. H., and Carey, G. F. 2006. libMesh: A C++ Library for Parallel Adaptive Mesh Refinement/Coarsening Simulations. Eng. Comput. 22, 3--4, 237--254.
[28]
Knepley, M. G. and Karpeev, D. A. 2009. Mesh algorithms for PDE with Sieve I: Mesh distribution. Sci. Prog. 17, 215--230.
[29]
Langtangen, H. P. 2003. Computational Partial Differential Equations: Numerical Methods and Diffpack Programming. Texts in Computational Science and Engineering. Springer Verlag.
[30]
Logg, A. 2007. Automating the finite element method. Arch. Comput. Meth. Eng. 14, 2, 93--138.
[31]
Logg, A. 2009. Efficient representation of computational meshes. Int. J. Comput. Sc. Engrg. 4, 4, 283--295.
[32]
Logg, A. and Wells, G. 2010. DOLFIN: Automated finite element computing. ACM Trans. Math. Softw. 37, 2, 1--28.
[33]
Mathur, K. K., Johan, Z., Johnsson, S. L., and Hughes, T. J. R. 1993. Massively parallel computing: Unstructures finite element simulations. Tech. rep. TR-08-93, Center for Research in Computing Technology, Harvard University.
[34]
McKenzie, D. P., Roberts, J. M., and Weiss, N. O. 1974. Convection in the Earth's mantle: Towards a numerical solution. J. Fluid Mech. 62, 465--538.
[35]
Message Passing Interface Forum 2009. MPI: A message-passing interface standard (version 2.2). Tech. rep., http://www.mpi-forum.org/.
[36]
Morton, G. M. 1966. A computer oriented geodetic data base; and a new technique in file sequencing. Tech. rep. IBM Ltd.
[37]
Ollivier-Gooch, C., Diachin, L., Shephard, M. S., Tautges, T., Kraftcheck, J., Leung, V., Luo, X., and Miller, M. 2010. An interoperable, data-structure-neutral component for mesh query and manipulation. ACM Trans. Math. Softw. 37, 29/1--28.
[38]
Patzák, B. and Bittnar, Z. 2001. Design of object oriented finite element code. Adv. Eng. Software 32, 10--11, 759--767.
[39]
Reinders, J. 2007. Intel Threading Building Blocks. O'Reilly.
[40]
Renard, Y. and Pommier, J. 2006. Getfem++. Technical rep., INSA Toulouse. http://www-gmm.insa-toulouse.fr/getfem/.
[41]
Rheinboldt, W. C. and Mesztenyi, C. K. 1980. On a data structure for adaptive finite element mesh refinements. ACM Trans. Math. Softw. 6, 166--187.
[42]
Schubert, G., Turcotte, D. L., and Olson, P. 2001. Mantle Convection in the Earth and Planets, Part 1. Cambridge University Press.
[43]
Silvester, D. and Wathen, A. 1994. Fast iterative solution of stabilised Stokes systems. Part II: Using general block preconditioners. SIAM J. Numer. Anal. 31, 1352--1367.
[44]
Šolín, P., Červený, J., and Doležel, I. 2008. Arbitrary-level hanging nodes and automatic adaptivity in the hp-FEM. Math. Comput. Sim. 77, 117--132.
[45]
Šolín, P., Segeth, K., and Doležel, I. 2003. Higher-Order Finite Element Methods. Chapman & Hall/CRC.
[46]
Stroustrup, B. 1997. The C++ Programming Language 3rd Ed. Addison-Wesley.
[47]
Sundar, H., Sampath, R., and Biros, G. 2008. Bottom-up construction and 2:1 balance refinement of linear octrees in parallel. SIAM J. Sci. Comput. 30, 5, 2675--2708.
[48]
Tan, E., Gurnis, M., Armendariz, L., Strand, L., and Kientz, S. 2008. Citcoms user manual version 3.0.1.
[49]
Tezduyar, T. E., Aliabadi, S. K., Behr, M., and Mittal, S. 1994. Massively parallel finite element simulation of compressible and incompressible flows. Comp. Meth. Appl. Mech. Eng. 119, 157--177.
[50]
Tikhonova, A., Tanase, G., Tkachyshyn, O., Amato, N. M., and Rauchwerger, L. 2005. Parallel algorithms in STAPL: Sorting and the selection problem. Tech. rep. TR05-005, Parasol Lab, Department of Computer Science, Texas A&M University.
[51]
Tu, T., O'Hallaron, D. R., and Ghattas, O. 2005. Scalable parallel octree meshing for terascale applications. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM/IEEE.
[52]
Verfürth, R. 1994. A posteriori error estimation and adaptive mesh-refinement techniques. J. Comput. Appl. Math. 50, 67--83.

Cited By

View all
  • (2025)Adaptive mesh refinement algorithm for CESE schemes on unstructured quadrilateral meshesComputer Physics Communications10.1016/j.cpc.2025.109565311(109565)Online publication date: Jun-2025
  • (2024)The deal.II library, Version 9.6Journal of Numerical Mathematics10.1515/jnma-2024-013732:4(369-380)Online publication date: 26-Nov-2024
  • (2024)High-order Finite Element Forward Modeling Algorithm for Three-dimensional Borehole-to-surface Electromagnetic Method using Octree MeshesGEOPHYSICS10.1190/geo2023-0365.1(1-65)Online publication date: 1-Apr-2024
  • Show More Cited By

Index Terms

  1. Algorithms and data structures for massively parallel generic adaptive finite element codes

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Mathematical Software
      ACM Transactions on Mathematical Software  Volume 38, Issue 2
      December 2011
      136 pages
      ISSN:0098-3500
      EISSN:1557-7295
      DOI:10.1145/2049673
      Issue’s Table of Contents
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 05 January 2012
      Accepted: 01 June 2011
      Revised: 01 January 2011
      Received: 01 September 2010
      Published in TOMS Volume 38, Issue 2

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Adaptive mesh refinement
      2. object-orientation
      3. parallel algorithms
      4. software design

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)77
      • Downloads (Last 6 weeks)9
      Reflects downloads up to 05 Mar 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2025)Adaptive mesh refinement algorithm for CESE schemes on unstructured quadrilateral meshesComputer Physics Communications10.1016/j.cpc.2025.109565311(109565)Online publication date: Jun-2025
      • (2024)The deal.II library, Version 9.6Journal of Numerical Mathematics10.1515/jnma-2024-013732:4(369-380)Online publication date: 26-Nov-2024
      • (2024)High-order Finite Element Forward Modeling Algorithm for Three-dimensional Borehole-to-surface Electromagnetic Method using Octree MeshesGEOPHYSICS10.1190/geo2023-0365.1(1-65)Online publication date: 1-Apr-2024
      • (2024)Cache-optimized and low-overhead implementations of additive Schwarz methods for high-order FEM multigrid computationsInternational Journal of High Performance Computing Applications10.1177/1094342023121722138:3(192-209)Online publication date: 1-May-2024
      • (2024)Alternative Quadrant Representations with Morton Index and AVX2 Vectorization for AMR Algorithms within the p4est Software Library2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW63119.2024.00071(301-310)Online publication date: 27-May-2024
      • (2024)A formalization of parallel data exchange algorithms used by numerical methods for solving partial differential equationsTheoretical Computer Science10.1016/j.tcs.2024.114912(114912)Online publication date: Oct-2024
      • (2024)On the construction of an efficient finite-element solver for phase-field simulations of many-particle solid-state-sintering processesComputational Materials Science10.1016/j.commatsci.2023.112589231(112589)Online publication date: Jan-2024
      • (2024)Parallel assembly of finite element matrices on multicore computersComputer Methods in Applied Mechanics and Engineering10.1016/j.cma.2024.117076428(117076)Online publication date: Aug-2024
      • (2024)A highly efficient computational approach for fast scan-resolved simulations of metal additive manufacturing processes on the scale of real partsAdditive Manufacturing10.1016/j.addma.2023.10392179(103921)Online publication date: Jan-2024
      • (2024)Parallel algorithm design and optimization of geodynamic numerical simulation application on the Tianhe new-generation high-performance computerThe Journal of Supercomputing10.1007/s11227-023-05469-980:1(331-362)Online publication date: 1-Jan-2024
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media