skip to main content
article

Algorithm 842: A set of GMRES routines for real and complex arithmetics on high performance computers

Published: 01 June 2005 Publication History

Abstract

In this article we describe our implementations of the GMRES algorithm for both real and complex, single and double precision arithmetics suitable for serial, shared memory and distributed memory computers. For the sake of portability, simplicity, flexibility and efficiency the GMRES solvers have been implemented in Fortran 77 using the reverse communication mechanism for the matrix-vector product, the preconditioning and the dot product computations. For distributed memory computation, several orthogonalization procedures have been implemented to reduce the cost of the dot product calculation, which is a well-known bottleneck of efficiency for the Krylov methods. Either implicit or explicit calculation of the residual at restart are possible depending on the actual cost of the matrix-vector product. Finally the implemented stopping criterion is based on a normwise backward error.

Supplementary Material

ZIP File (842.zip)
Software for "A set of GMRES routines for real and complex arithmetics on high performance computers"

References

[1]
Balay, S., Buschelman, K., Eijkhout, V., Gropp, W. D., Kaushik, D., Knepley, M. G., McInnes, L. C., Smith, B. F., and Zhang, H. 2004. PETSc users manual. Tech. Rep. ANL-95/11 - Revision 2.1.5, Argonne National Laboratory.
[2]
Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., and der Vorst, H. V. 1994. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, Second ed. SIAM, Philadelphia, PA.
[3]
Bindel, D., Demmel, J., Kahan, W., and Marques, O. 2002. On computing givens rotations reliably and efficiently. ACM Transactions on Mathematical Software (TOMS) 28, 2 (June), 206--238.
[4]
Björck, Å. 1994. Numerics of Gram-Schmidt orthogonalization. Linear Algebra Appl. 197--198, 297--316.
[5]
Björck, Å. 1996. Numerical Methods for Least Squares Problems. SIAM, Philadelphia.
[6]
Blackford, L. S., Demmel, J., Dongarra, J., Duff, I., Hammarling, S., Henry, G., Heroux, M., Kaufman, L., Lumsdaine, A., Petitet, A., Pozo, R., Remington, K., and Whaley, R. C. 2002. An updated set of Basic Linear Algebra Subprograms (BLAS). ACM Trans. Math. Softw. (TOMS) 28, 2 (June), 135--151.
[7]
Buchau, A. and Rucker, W. M. 2002. Preconditioned fast adaptive multipole boundary element method. IEEE Trans. Magn. 38, 2, 461--464.
[8]
Chaitin-Chatelin, F. and Frayss&ecute;, V. 1996. Lectures on Finite Precision Computations. SIAM, Philadelphia.
[9]
Cross, J. T., Masters, I., and Lewis, R. W. 1999. Why you should consider object-oriented programming techniques for finite element methods. Int. J. Numer. Meth. Heat Fluid Flow 9, 333--347.
[10]
Daniel, W., Gragg, W. B., Kaufman, L., and Stewart, G. W. 1976. Reorthogonalization and stable algorithms for updating the Gram-Schmidt QR factorization. Math. Comp. 30, 772--795.
[11]
Frank, J. and Vuik, C. 1999. Parallel implementation of a multiblock method with approximate subdomain solution. Appl. Num. Math. 30, 403--423.
[12]
Frayss&ecute;, V., Giraud, L., Gratton, S., and Langou, J. 2003. A set of GMRES routines for real and complex arithmetics on high performance computers. Tech. Rep. TR/PA/03/03, CERFACS. Available on http://www.cerfacs.fr/algor.
[13]
Frayss&ecute;, V., Giraud, L., and Kharraz-Aroussi, H. 1998. On the influence of the orthogonalization scheme on the parallel performance of GMRES. Tech. Rep. TR/PA/98/07, CERFACS, Toulouse, France. Preliminary version of the paper published in the proceedings of EuroPar'98. Lecture Notes in Computer Science, Springer-Verlag, vol. 1470, pp. 751--762.
[14]
Higham, N. J. 2002. Accuracy and Stability of Numerical Algorithms, Second ed. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA.
[15]
Hoffmann, W. 1989. Iterative algorithms for Gram-Schmidt orthogonalization. Computing 41, 335--348.
[16]
Hustedt, B., Operto, S., and Virieux, J. 2003. A multi-level direct-iterative solver for seismic wave propagation modelling: space and wavelet approaches. Geophys. Int. J. 155, 953--980.
[17]
Lehoucq, R. B. and Salinger, A. G. 2001. Large-scale eigenvalue calculations for stability analysis of steady flows on massively parallel computers. Int. J. Numer. Meth. Fluids 36, 309--327.
[18]
Lewis, R. W., Masters, I., and Cross, J. T. 1997. Automatic timestep selection for the super-time-stepping acceleration on unstructured grids using object-oriented programming. Comm. Numer. Meth. Eng. 13, 4, 249--260.
[19]
Li, Z., Saad, Y., and Sosonkina, M. 2003. pARMS: a parallel version of the recursive multilevel solver. Numer. Linear Alg. Appl. 10, 485--509.
[20]
Masters, I., Usmani, A. S., Cross, J. T., and Lewis, R. W. 1997. Finite element analysis of solidification using object-oriented and parallel techniques. Int. J. Numer. Meth. Eng. 40, 15, 2891--2909.
[21]
Monga-Made, M. M. 2001. Incomplete factorization based preconditionings for solving the Helmholtz equation. Int. J. Numer. Meth. Eng. 50, 5, 1077--1101.
[22]
Monga-Made, M. M. and Beauwens, R. 2000. Imaginary diagonal relaxations for highly indefinite linear systems. In Proceedings of the 16th IMACS World Congress 2000. IMACS, Department of Computer Science, Rutgers University.
[23]
Monga-Made, M. M., Beauwens, R., and Warz´e, G. 2000. Preconditioning of discrete Helmholtz operators perturbed by a diagonal complex matrix. Comm. Numer. Meth. Eng. 16, 2, 801--817.
[24]
Rutishauser, H. 1967. Description of Algol 60. Handbook for Automatic Computation, vol. 1. Springer-Verlag, Berlin.
[25]
Saad, Y. and Schultz, M. 1986. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 7, 856--869.
[26]
Shadid, J. N. and Tuminaro, R. S. 1994. A comparison of preconditioned nonsymmetric Krylov methods on a large-scale MIMD machine. SIAM J. Sci. Comp. 14, 2, 440--459.
[27]
Sylvand, G. 2002. R´solution it´rative de formulation int´grale pour Helmholtz 3D: Applications de la m´thode multipôle à des problèmes de grande taille. Ph.D. thesis, Ecole Nationale des Ponts et Chauss´es.
[28]
Tuminaro, R. S., Heroux, M., Hutchinson, S. A., and Shadid, J. 1999. Official Aztec User's Guide - Version 2.1. Tech. Rep. 99-8801J, Sandia National Laboratories.
[29]
Warsa, J. S., Benzi, M., Wareing, T. A., and Morel, J. E. 2004. Preconditioning a mixed discontinuous finite element method for radiation diffusion. Numerical Linear Algebra with Applications 11, 795--811.
[30]
West, J. C. and Sturm, J. M. 1999. On iterative approaches for electromagnetic rough-surface scattering problems. IEEE Trans. Antennas and Propagation 47, 8, 1281--1288.
[31]
Wilkinson, J. H. 1963. Rounding errors in algebraic processes. Vol. 32. Her Majesty's stationery office, London.
[32]
Yoshida, K., Nishimura, N., and Kobayashi, S. 2000. Analysis of three dimensional scattering of elastic waves by cracks with fast multipole boundary integral equation method. J. Appl. Mech. JSCE 3, 145--150.
[33]
Yoshida, K., Nishimura, N., and Kobayashi, S. 2001. Analysis of three dimensional scattering of scalar waves by cracks with fast multipole boundary integral equation method. Trans JSME A, 16--22.

Cited By

View all
  • (2024)Frequency Domain Thermal Analysis of Future Spacecraft and SystemsJournal of Thermophysics and Heat Transfer10.2514/1.T6999(1-11)Online publication date: 25-Sep-2024
  • (2024)Valleytronics and negative differential resistance in cubic boron nitride: A first-principles studyPhysical Review Materials10.1103/PhysRevMaterials.8.1046028:10Online publication date: 8-Oct-2024
  • (2024)Hot electron diffusion, microwave noise, and piezoresistivity in Si from first principlesPhysical Review B10.1103/PhysRevB.109.235201109:23Online publication date: 3-Jun-2024
  • Show More Cited By

Index Terms

  1. Algorithm 842: A set of GMRES routines for real and complex arithmetics on high performance computers

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Mathematical Software
      ACM Transactions on Mathematical Software  Volume 31, Issue 2
      June 2005
      93 pages
      ISSN:0098-3500
      EISSN:1557-7295
      DOI:10.1145/1067967
      Issue’s Table of Contents

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 01 June 2005
      Published in TOMS Volume 31, Issue 2

      Permissions

      Request permissions for this article.

      Check for updates

      Badges

      Author Tags

      1. GMRES
      2. Krylov methods
      3. Linear systems
      4. distributed memory
      5. reverse communication

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)12
      • Downloads (Last 6 weeks)1
      Reflects downloads up to 17 Feb 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Frequency Domain Thermal Analysis of Future Spacecraft and SystemsJournal of Thermophysics and Heat Transfer10.2514/1.T6999(1-11)Online publication date: 25-Sep-2024
      • (2024)Valleytronics and negative differential resistance in cubic boron nitride: A first-principles studyPhysical Review Materials10.1103/PhysRevMaterials.8.1046028:10Online publication date: 8-Oct-2024
      • (2024)Hot electron diffusion, microwave noise, and piezoresistivity in Si from first principlesPhysical Review B10.1103/PhysRevB.109.235201109:23Online publication date: 3-Jun-2024
      • (2023)Two-phonon scattering in nonpolar semiconductors: A first-principles study of warm electron transport in SiPhysical Review B10.1103/PhysRevB.107.L041110107:4Online publication date: 25-Jan-2023
      • (2023)Simulation of realistic speckle fields by using surface integral equation and multi-level fast multipole methodOptics and Lasers in Engineering10.1016/j.optlaseng.2022.107438162(107438)Online publication date: Mar-2023
      • (2023)An SQP-based multiple shooting algorithm for large-scale PDE-constrained optimal control problemsJournal of Computational Physics10.1016/j.jcp.2023.111927477:COnline publication date: 15-Mar-2023
      • (2023)ITVOLT: An iterative solver for the time-dependent Schrödinger equationComputer Physics Communications10.1016/j.cpc.2023.108780291(108780)Online publication date: Oct-2023
      • (2023)Low-rank compression techniques in integral methods for eddy currents problemsComputer Physics Communications10.1016/j.cpc.2023.108756289(108756)Online publication date: Aug-2023
      • (2023)GMRES algorithms over 35 yearsApplied Mathematics and Computation10.1016/j.amc.2023.127869445:COnline publication date: 15-May-2023
      • (2022)Solving Electromagnetic Scattering Problems With Tens of Billions of Unknowns Using GPU Accelerated Massively Parallel MLFMAIEEE Transactions on Antennas and Propagation10.1109/TAP.2022.316152070:7(5672-5682)Online publication date: Jul-2022
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media