skip to main content
research-article

Algorithm 881: A Set of Flexible GMRES Routines for Real and Complex Arithmetics on High-Performance Computers

Published: 01 July 2008 Publication History

Abstract

In this article we describe our implementations of the FGMRES algorithm for both real and complex, single and double precision arithmetics suitable for serial, shared-memory, and distributed-memory computers. For the sake of portability, simplicity, flexibility, and efficiency, the FGMRES solvers have been implemented in Fortran 77 using the reverse communication mechanism for the matrix-vector product, the preconditioning, and the dot-product computations. For distributed-memory computation, several orthogonalization procedures have been implemented to reduce the cost of the dot-product calculation, which is a well-known bottleneck of efficiency for Krylov methods. Furthermore, either implicit or explicit calculation of the residual at restart is possible depending on the actual cost of the matrix-vector product. Finally, the implemented stopping criterion is based on a normwise backward error.

Supplementary Material

ZIP File (881.zip)
Software for A Set of Flexible GMRES Routines for Real and Complex Arithmetics on High-Performance Computers

References

[1]
Balay, S., Buschelman, K., Eijkhout, V., Gropp, W. D., Kaushik, D., Knepley, M. G., McInnes, L. C., Smith, B. F., and Zhang, H. 2004. PETSc users manual. Tech. Rep. ANL-95/11 - Revision 2.1.5, Argonne National Laboratory.
[2]
Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., and der Vorst, H. V. 1994. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd ed. SIAM, Philadelphia, PA.
[3]
Batiste, O., Alonso, A., and Mercader, I. 2004. Hydrodynamic stability of binary mixtures in Bénard and thermogravitational cells. J. Non-Equilib. Thermodyn. 29, 359--375.
[4]
Bindel, D., Demmel, J., Kahan, W., and Marques, O. 2002. On computing Givens rotations reliably and efficiently. ACM Trans. Math. Softw. 28, 2 (Jun.), 206--238.
[5]
Björck, Å. 1994. Numerics of Gram-Schmidt orthogonalization. Linear Algebra Appl. 197-198, 297--316.
[6]
Björck, Å. 1996. Numerical Methods for Least Squares Problems. SIAM, Philadelphia, PA.
[7]
Blackford, L. S., Demmel, J., Dongarra, J., Duff, I., Hammarling, S., Henry, G., Heroux, M., Kaufman, L., Lumsdaine, A., Petitet, A., Pozo, R., Remington, K., and Whaley, R. C. 2002. An updated set of basic linear algebra subprograms (BLAS). ACM Trans. Math. Softw. 28, 2 (Jun.), 135--151.
[8]
Bouras, A. and Frayssé, V. 2005. Inexact matrix-vector products in Krylov methods for solving linear systems: A relaxation strategy. SIAM J. Matrix Anal. Appl. 26, 23, 660--678.
[9]
Carpentieri, B., Duff, I. S., Giraud, L., and Sylvand, G. 2005. Combining fast multipole techniques and an approximate inverse preconditioner for large electromagnetism calculations. SIAM J. Sci. Comput. 27, 3, 774--792.
[10]
Chaitin-Chatelin, F. and Frayssé, V. 1996. Lectures on Finite Precision Computations. SIAM, Philadelphia, PA.
[11]
Frank, J. and Vuik, C. 1999. Parallel implementation of a multiblock method with approximate subdomain solution. Appl. Numer. Math. 30, 403--423.
[12]
Frayssé, V., Giraud, L., Gratton, S., and Langou, J. 2003. A set of GMRES routines for real and complex arithmetics on high performance computers. Tech. Rep. TR/PA/03/03, CERFACS, Toulouse, France. http://www.cerfacs.fr/algor.
[13]
Frayssé, V., Giraud, L., Gratton, S., and Langou, J. 2005. Algorithm 842: A set of GMRES routines for real and complex arithmetics on high performance computers. ACM Trans. Math. Softw. 31, 2, 228--238.
[14]
Frayssé, V., Giraud, L., and Kharraz-Aroussi, H. 1998. On the influence of the orthogonalization scheme on the parallel performance of GMRES. Tech. Rep. TR/PA/98/07, CERFACS, Toulouse, France. Preliminary version of the paper published in the Proceedings of the EuroPar. Lecture Notes in Computer Science, Springer, vol. 1470, 751--762.
[15]
Giraud, L., Gratton, S., and Langou, J. 2007. Convergence in backward error of relaxed GMRES. SIAM J. Sci. Comput. 29, 2, 710--728.
[16]
Higham, N. J. 2002. Accuracy and Stability of Numerical Algorithms, 2nd ed. Society for Industrial and Applied Mathematics, Philadelphia, PA.
[17]
Hoffmann, W. 1989. Iterative algorithms for Gram-Schmidt orthogonalization. Comput. 41, 335--348.
[18]
Lehoucq, R. B. and Salinger, A. G. 2001. Large-Scale eigenvalue calculations for stability analysis of steady flows on massively parallel computers. Int. J. Numer. Methods Fluids 36, 309--327.
[19]
Li, Z., Saad, Y., and Sosonkina, M. 2003. pARMS: A parallel version of the recursive multilevel solver. Numer. Linear Algebra Appl. 10, 485--509.
[20]
Maschho, K. J. and Sorensen, D. C. 1996. A portable implementation of ARPACK for distributed memory parallel computers. In Proceedings of the Copper Mountain Conference on Iterative Methods.
[21]
Meca, E., Mercader, I., Batiste, O., and rez de la Piscina, L. R. 2004a. A blue sky catastrophe in double-diffussive convection. Phys. Rev. Lett. 92, 23, 1--4.
[22]
Meca, E., Mercader, I., Batiste, O., and rez de la Piscina, L. R. 2004b. Complex dynamics in double-diffusive convection. Theor. Comput. Fluid Dynam. 18, 231--238.
[23]
Mercader, I., Alonso, A., and Batiste, O. 2004. Numerical analysis of the Eckhaus instability in travelling-wave convection in binary mixtures. Eur. Phys. J. E 15, 319--333.
[24]
Mercader, I., Batiste, O., rez de la Piscina, L. R., Ruiz, X., Rüdiger, S., and Casademunt, J. 2005. Bifurcations and chaos in single-roll natural convection with low Prandtl number. Phys. Fluids. submitted.
[25]
Rutishauser, H. 1967. Description of algol 60. handbook for automatic computation. Springer, Berlin, 1.a.
[26]
Saad, Y. 1993. A flexible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Comput. 14, 461--469.
[27]
Saad, Y. and Schultz, M. 1986. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statis. Comput. 7, 856--869.
[28]
Shadid, J. N. and Tuminaro, R. S. 1994. A comparison of preconditioned nonsymmetric Krylov methods on a large-scale MIMD machine. SIAM J. Sci. Comput. 14, 2, 440--459.
[29]
Simoncini, V. and Szyld, D. B. 2003. Theory of inexact Krylov subspace methods and applications to scientific computing. SIAM J. Sci. Comput. 25, 454--477.
[30]
Strakoš, Z. and Tichy, P. 2006. Error estimation in preconditioned conjugate gradients. BIT. accepted.
[31]
Tuminaro, R. S., Heroux, M., Hutchinson, S. A., and Shadid, J. 1999. Official Aztec user's guide - version 2.1. Tech. Rep. 99-8801J, Sandia National Laboratories.
[32]
van den Eshof, J. and Sleijpen, G. L. G. 2004. Inexact Krylov subspace methods for linear systems. SIAM J. Matrix Anal. Appl. 26, 1, 125--153.
[33]
Warsa, J. S., Benzi, M., Wareing, T. A., and Morel, J. E. 2004. Preconditioning a mixed discontinuous finite element method for radiation diffusion. Numer. Linear Algebra Appl. 11, 795--811.
[34]
Wilkinson, J. H. 1963. Rounding Errors in Algebraic Processes, vol. 32. Her Majesty's Stationery Office, London, UK.

Cited By

View all
  • (2024)Recycling Krylov subspaces for efficient partitioned solution of aerostructural adjoint systemsJournal of Computational Physics10.1016/j.jcp.2024.113142512(113142)Online publication date: Sep-2024
  • (2023)GMRES algorithms over 35 yearsApplied Mathematics and Computation10.1016/j.amc.2023.127869445(127869)Online publication date: May-2023
  • (2022)Kernel-Independent Fast Multipole Boundary Element Solver for Coupled Conduction–Radiation Heat Transfer ProblemJournal of Thermophysics and Heat Transfer10.2514/1.T654636:4(818-823)Online publication date: Oct-2022
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Mathematical Software
ACM Transactions on Mathematical Software  Volume 35, Issue 2
July 2008
144 pages
ISSN:0098-3500
EISSN:1557-7295
DOI:10.1145/1377612
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 July 2008
Accepted: 01 October 2007
Revised: 01 August 2007
Received: 01 February 2006
Published in TOMS Volume 35, Issue 2

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. FGMRES
  2. Linear systems
  3. distributed memory
  4. flexible Krylov methods
  5. high-performance computing
  6. reverse communication

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)10
  • Downloads (Last 6 weeks)2
Reflects downloads up to 17 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Recycling Krylov subspaces for efficient partitioned solution of aerostructural adjoint systemsJournal of Computational Physics10.1016/j.jcp.2024.113142512(113142)Online publication date: Sep-2024
  • (2023)GMRES algorithms over 35 yearsApplied Mathematics and Computation10.1016/j.amc.2023.127869445(127869)Online publication date: May-2023
  • (2022)Kernel-Independent Fast Multipole Boundary Element Solver for Coupled Conduction–Radiation Heat Transfer ProblemJournal of Thermophysics and Heat Transfer10.2514/1.T654636:4(818-823)Online publication date: Oct-2022
  • (2022)Comparative study of inner–outer Krylov solvers for linear systems in structured and high-order unstructured CFD problemsComputers & Fluids10.1016/j.compfluid.2022.105575244(105575)Online publication date: Aug-2022
  • (2015)On the Resilience of Parallel Sparse Hybrid SolversProceedings of the 2015 IEEE 22nd International Conference on High Performance Computing (HiPC)10.1109/HiPC.2015.9(75-84)Online publication date: 16-Dec-2015
  • (2015)Computation of electron energy loss spectra by an iterative methodNuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms10.1016/j.nimb.2014.11.080354(216-219)Online publication date: Jul-2015
  • (2013)Inner-Iteration Krylov Subspace Methods for Least Squares ProblemsSIAM Journal on Matrix Analysis and Applications10.1137/11082847234:1(1-22)Online publication date: Jan-2013
  • (2012)FAST MULTIPOLE BOUNDARY ELEMENT METHOD FOR THREE DIMENSIONAL ELECTROMAGNETIC SCATTERING PROBLEMInternational Journal of Computational Materials Science and Engineering10.1142/S204768411250038801:04(1250038)Online publication date: Dec-2012
  • (2012)Extension to Nonconforming Meshes of the Combined Current and Charge Integral EquationIEEE Transactions on Antennas and Propagation10.1109/TAP.2012.220731660:10(4732-4744)Online publication date: Oct-2012
  • (2008)Parallel algebraic hybrid solvers for large 3D convection-diffusion problemsNumerical Algorithms10.1007/s11075-008-9248-x51:2(151-177)Online publication date: 14-Nov-2008

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media