Article

Sparse matrix solvers on the GPU: conjugate gradients and multigrid

Authors:

Eitan Grinspun,

Peter SchröderAuthors Info & Claims

SIGGRAPH '05: ACM SIGGRAPH 2005 Courses

Pages 171 - es

https://doi.org/10.1145/1198555.1198781

Published: 31 July 2005 Publication History

Abstract

Many computer graphics applications require high-intensity numerical simulation. We show that such computations can be performed efficiently on the GPU, which we regard as a full function streaming processor with high floating-point performance. We implemented two basic, broadly useful, computational kernels: a sparse matrix conjugate gradient solver and a regular-grid multigrid solver. Real-time applications ranging from mesh smoothing and parameterization to fluid solvers and solid mechanics can greatly benefit from these, evidence our example applications of geometric flow and fluid simulation running on NVIDIA's GeForce FX.

References

[1]

Arvind, and Iannucci, R. A. 1987. Two Fundamental Issues in Multiprocessing. In Proceedings of DFVLR - Conference 1987, Parallel Processing in Science and Engineering.

Digital Library

[2]

Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., and der Vorst, H. V. 1994. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd ed. SIAM.

[3]

Blelloch, G. E. 1990. Vector Models for Data-Parallel Computing. MIT Press.

Digital Library

[4]

Briggs, W. L., Henson, V. E., and McCormack, S. F. 2000. A Multigrid Tutorial, 2nd ed. SIAM.

Digital Library

[5]

Carr, N. A., Hall, J. D., and Hart, J. C. 2002. The Ray Engine. In SIGGRAPH/Eurographics Workshop on Graphics Hardware.

Digital Library

[6]

Cohen, M. F., Chen, S. E., Wallace, J. R., and Greenberg, D. P. 1988. A Progressive Refinement Approach to Fast Radiosity Image Generation. Computer Graphics (Proceedings of SIGGRAPH) 22, 75--84.

Digital Library

[7]

Demmel, J. W. 1997. Applied Numerical Linear Algebra. SIAM, Philadelphia, PA.

Digital Library

[8]

Desbrun, M., Meyer, M., Schröder, P., and Barr, A. H. 1999. Implicit Fairing of Irregular Meshes Using Diffusion and Curvature Flow. In Proceedings of SIGGRAPH, 317--324.

Digital Library

[9]

Desbrun, M., Meyer, M., and Alliez, P. 2002. Intrinsic Parameterizations of Surface Meshes. Computer Graphics Forum (Proceedings of Eurographics) 21, 3, 209--218.

[10]

Diewald, U., Morigi, S., and Rumpf, M. 2002. A Cascadic Geometric Filtering Approach to Subdivision. Comput. Aided Geom. Des. 19, 9, 675--694.

Digital Library

[11]

Fedkiw, R., Stam, J., and Jensen, H. W. 2001. Visual Simulation of Smoke. In Proceedings of SIGGRAPH, 15--22.

Digital Library

[12]

Goodnight, N., Lewin, G., Luebke, D., and Skadron, K. 2003. A Multigrid Solver for Boundary Value Problems Using Programmable Graphics Hardware. Tech. Rep. CS-2003-03, University of Virginia.

[13]

Hackbusch, W. 1985. Multi-Grid Methods and Applications. Springer Verlag, Berlin.

[14]

Hall, J. D., Carr, N. A., and Hart, J. C. 2003. Cache and Band-width Aware Matrix Multiplication on the GPU. Tech. rep., University of Illinois.

[15]

Harris, M. J., Coombe, G., Scheuermann, T., and Lastra, A. 2002. Physically-Based Visual Simulation on Graphics Hardware. In SIGGRAPH/Eurographics Workshop on Graphics Hardware.

Digital Library

[16]

Harris, M. J. 2002. Analysis of Error in a CML Diffusion Operation. Tech. Rep. TR02-015, UNC Chapel Hill.

[17]

Hillesland, K. E., Molinov, S., and Grzeszczuk, R. 2003. Nonlinear Optimization Framework for Image-Based Modeling on Programmable Graphics Hardware. ACM Transactions on Graphics.

Digital Library

[18]

Hoff, K. E., Keyser, J., Lin, M., Manocha, D., and Culver, T. 1999. Fast Computation of Generalized Voronoi Diagrams Using Graphics Hardware. In Proceedings of SIGGRAPH, 277--286.

Digital Library

[19]

Kass, M., and Miller, G. 1990. Rapid, Stable Fluid Dynamics for Computer Graphics. Computer Graphics (Proceedings of SIGGRAPH) 24, 4, 49--57.

Digital Library

[20]

Keller, A. 1997. Instant Radiosity. In Proceedings of SIGGRAPH, 49--56.

Digital Library

[21]

Khailany, B., Dally, W. J., Rixner, S., Kapasi, U. J., Mattson, P., Namkoong, J., Owens, J. D., Towles, B., and Chang, A. 2001. Imagine: Media Processing with Streams. IEEE Micro 21, 2, 35--46.

Digital Library

[22]

Khailany, B., Dally, W. J., Rixner, S., Kapasi, U. J., Owens, J. D., and Towles, B. 2003. Exploring the VLSI Scalability of Stream Processors. In Proceedings of the Ninth Symposium on High Performance Computer Architecture.

Digital Library

[23]

Kobbelt, L., Campagna, S., Vorsatz, J., and Seidel, H.-P. 1998. Interactive Multi-Resolution Modeling on Arbitrary Meshes. In Proceedings of SIGGRAPH, 105--114.

Digital Library

[24]

Krüger, J., and Westermann, R. 2003. Linear algebra operators for gpu implementation of numerical algorithms. ACM Transactions on Graphics.

Digital Library

[25]

Larsen, E. S., and McAllister, D. K. 2001. Fast Matrix Multiplies using Graphics Hardware. In Supercomputing.

Digital Library

[26]

Leiserson, C., Rose, F., and Saxe, J. 1993. Optimizing synchronous circuitry by retiming. In Third Caltech Conference On VLSI.

[27]

Lengyel, J., Reichert, M., Donald, B. R., and Greenberg, D. P. 1990. Real-Time Robot Motion Planning Using Rasterizing Computer Graphics Hardware. Computer Graphics (Proceedings of SIGGRAPH) 24, 327--335.

Digital Library

[28]

Li, W., Wei, X., and Kaufman, A. 2003. Implementing Lattice Boltzmann Comptuation on Graphics Hardware. The Visual Computer. To appear.

[29]

Lindholm, E., Kilgard, M. J., and Moreton, H. 2001. A User-Programmable Vertex Engine. In Proceedings of SIGGRAPH, 149--158.

Digital Library

[30]

Müller, M., Dorsey, J., McMillan, L., Jagnow, R., and Cutler, B. 2002. Stable Real-Time Deformations. In ACM SIGGRAPH Symposium on Computer Animation, 49--54.

Digital Library

[31]

Olano, M., Ed. 2002. Real-Time Shading Languages. Course Notes. ACM SIGGRAPH.

Digital Library

[32]

Owens, J. D., Khailany, B., Towles, B., and Dally, W. J. 2002. Comparing Reyes and OpenGL on a Stream Architecture. In SIGGRAPH/Eurographics Workshop on Graphics Hardware, 47--56.

Digital Library

[33]

Purcell, T. J., Buck, I., Mark, W. R., and Hanrahan, P. 2002. Ray Tracing on Programmable Graphics Hardware. ACM Transactions on Graphics 21, 3, 703--712.

Digital Library

[34]

Semiconductor Industry Association, 2002. International Technology Roadmap for Semiconductors. http://public.itrs.net/, December.

[35]

Shewchuck, J. R. 1994. An Introduction to the Conjugate Gradient Method without the Agonizing Pain. http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.ps., August.

[36]

Stam, J. 1999. Stable Fluids. In Proceedings of SIGGRAPH, 121--128.

Digital Library

[37]

Strzodka, R., and Rumpf, M. 2001. Nonlinear Diffusion in Graphics Hardware. In Visualization, 75--84.

Digital Library

[38]

Strzodka, R. 2002. Virtual 16 Bit Precise Operations on RGBA8 Textures. In Vision, Modeling and Visualization.

[39]

The C* Team. 1993. C* Manual, 2nd ed. Thinking Machines Corporation.

[40]

Thompson, C. J., Hahn, S., and Oskin, M. 2002. Using Modern Graphics Architectures for General-Purpose Computing: A Framework and Analysis. In International Symposium on Microarchitecture.

Digital Library

Cited By

Guojun ZRunjin LGuozhe SXiangkui Z(2024)A parallel acceleration GPU algorithm for large deformation of thin shell structures based on peridynamicsEngineering with Computers10.1007/s00366-024-01951-x40:5(3009-3030)Online publication date: 27-Mar-2024
https://dl.acm.org/doi/10.1007/s00366-024-01951-x
Kasmi NZbakh MHaouari A(2018)Performance Analysis of Preconditioned Conjugate Gradient Solver on Heterogeneous (Multi-CPUs/Multi-GPUs) ArchitectureCloud Computing and Big Data: Technologies, Applications and Security10.1007/978-3-319-97719-5_20(318-336)Online publication date: 28-Jul-2018
https://doi.org/10.1007/978-3-319-97719-5_20
Henz BLimberger FOliveira M(2016)Independent color-channel adjustment for seamless cloning based on Laplacian-membrane modulationComputers and Graphics10.1016/j.cag.2016.03.00457:C(46-54)Online publication date: 1-Jun-2016
https://dl.acm.org/doi/10.1016/j.cag.2016.03.004
Show More Cited By

Index Terms

Sparse matrix solvers on the GPU: conjugate gradients and multigrid
1. Computing methodologies
  1. Computer graphics
    1. Graphics systems and interfaces
      1. Graphics processors
  2. Symbolic and algebraic manipulation
    1. Symbolic and algebraic algorithms
      1. Linear algebra algorithms
2. Mathematics of computing
  1. Mathematical analysis
    1. Differential equations
      1. Partial differential equations
    2. Numerical analysis
      1. Computations on matrices

Recommendations

Sparse matrix solvers on the GPU: conjugate gradients and multigrid

Many computer graphics applications require high-intensity numerical simulation. We show that such computations can be performed efficiently on the GPU, which we regard as a full function streaming processor with high floating-point performance. We ...
Sparse matrix solvers on the GPU: conjugate gradients and multigrid
SIGGRAPH '03: ACM SIGGRAPH 2003 Papers

Many computer graphics applications require high-intensity numerical simulation. We show that such computations can be performed efficiently on the GPU, which we regard as a full function streaming processor with high floating-point performance. We ...
Interactive smoke simulation and rendering on the GPU
VRCAI '13: Proceedings of the 12th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and Its Applications in Industry

We present a fluid dynamic system, which allows to moving objects and real-time simulations of large scale three dimensional smokes. Our system includes a GPU accelerated Eulerian fluid solver which is well suited for real-time application since it is ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

SIGGRAPH '05: ACM SIGGRAPH 2005 Courses

July 2005

7157 pages

ISBN:9781450378338

DOI:10.1145/1198555

Editor:
John Fujii
Hewlett-Packard Company

Copyright © 2005 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 July 2005

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Article

Acceptance Rates

Overall Acceptance Rate 1,822 of 8,601 submissions, 21%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

19
Total Citations
View Citations
789
Total Downloads

Downloads (Last 12 months)3
Downloads (Last 6 weeks)1

Reflects downloads up to 03 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Guojun ZRunjin LGuozhe SXiangkui Z(2024)A parallel acceleration GPU algorithm for large deformation of thin shell structures based on peridynamicsEngineering with Computers10.1007/s00366-024-01951-x40:5(3009-3030)Online publication date: 27-Mar-2024
https://dl.acm.org/doi/10.1007/s00366-024-01951-x
Kasmi NZbakh MHaouari A(2018)Performance Analysis of Preconditioned Conjugate Gradient Solver on Heterogeneous (Multi-CPUs/Multi-GPUs) ArchitectureCloud Computing and Big Data: Technologies, Applications and Security10.1007/978-3-319-97719-5_20(318-336)Online publication date: 28-Jul-2018
https://doi.org/10.1007/978-3-319-97719-5_20
Henz BLimberger FOliveira M(2016)Independent color-channel adjustment for seamless cloning based on Laplacian-membrane modulationComputers and Graphics10.1016/j.cag.2016.03.00457:C(46-54)Online publication date: 1-Jun-2016
https://dl.acm.org/doi/10.1016/j.cag.2016.03.004
Henz BLimberger FOliveira M(2015)Color Adjustment for Seamless Cloning Based on Laplacian-Membrane ModulationProceedings of the 2015 28th SIBGRAPI Conference on Graphics, Patterns and Images10.1109/SIBGRAPI.2015.9(196-202)Online publication date: 26-Aug-2015
https://dl.acm.org/doi/10.1109/SIBGRAPI.2015.9
Magoulès FAhamed ADesmaison ALéchenet JMayer FSalem HZhu T(2014)Power Consumption Analysis of Parallel Algorithms on GPUsProceedings of the 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS)10.1109/HPCC.2014.54(304-311)Online publication date: 20-Aug-2014
https://dl.acm.org/doi/10.1109/HPCC.2014.54
Kamiabad ATate J(2013)Polynomial Preconditioning of Power System Matrices with Graphics Processing UnitsHigh Performance Computing in Power and Energy Systems10.1007/978-3-642-32683-7_8(229-246)Online publication date: 2013
https://doi.org/10.1007/978-3-642-32683-7_8
Xu WZhang HJiao SWang DSong FLiu Z(2012)Optimizing Sparse Matrix Vector Multiplication Using Cache Blocking Method on Fermi GPUProceedings of the 2012 13th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing10.1109/SNPD.2012.20(231-235)Online publication date: 8-Aug-2012
https://dl.acm.org/doi/10.1109/SNPD.2012.20
Pichel JRivera FFernández MRodríguez A(2012)Optimization of sparse matrix-vector multiplication using reordering techniques on GPUsMicroprocessors & Microsystems10.1016/j.micpro.2011.05.00536:2(65-77)Online publication date: 1-Mar-2012
https://dl.acm.org/doi/10.1016/j.micpro.2011.05.005
Linford JMichalakes JVachharajani MSandu A(2011)Automatic Generation of Multicore Chemical KernelsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2010.10622:1(119-131)Online publication date: 1-Jan-2011
https://dl.acm.org/doi/10.1109/TPDS.2010.106
Binotto ADaniel CWeber DKuijper AStork APereira CFellner D(2010)Iterative SLE Solvers over a CPU-GPU PlatformProceedings of the 2010 IEEE 12th International Conference on High Performance Computing and Communications10.1109/HPCC.2010.40(305-313)Online publication date: 1-Sep-2010
https://dl.acm.org/doi/10.1109/HPCC.2010.40
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten