ABSTRACT
The advent of systems biology requires the simulation of ever-larger biomolecular systems, demanding a commensurate growth in computational power. This paper examines the use of the NVIDIA Tesla C870 graphics card programmed through the CUDA toolkit to accelerate the calculation of cutoff pair potentials, one of the most prevalent computations required by many different molecular modeling applications. We present algorithms to calculate electrostatic potential maps for cutoff pair potentials. Whereas a straightforward approach for decomposing atom data leads to low compute efficiency, a newer strategy enables fine-grained spatial decomposition of atom data that maps efficiently to the C870's memory system while increasing work-efficiency of atom data traversal by a factor of 5. The memory addressing flexibility exposed through CUDA's SPMD programming model is crucial in enabling this new strategy. An implementation of the new algorithm provides a greater than threefold performance improvement over our previously published implementation and runs 12 to 20 times faster than optimized CPU-only code. The lessons learned are generally applicable to algorithms accelerated by uniform grid spatial decomposition.
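The core idea of uniform grid spatial decomposition can be illustrated with a small serial sketch. This is not the authors' CUDA implementation: it assumes a bare Coulomb term q/r truncated at the cutoff (no smoothing function and no physical constants), and the names `bin_atoms`, `potential_binned`, and `potential_brute_force` are hypothetical. Atoms are sorted into cubic bins whose edge equals the cutoff distance, so each map point only has to visit the 27 bins surrounding it rather than every atom in the system.

```python
import math
from collections import defaultdict


def bin_atoms(atoms, bin_size):
    """Group atoms (x, y, z, q) into cubic bins keyed by integer bin coordinates."""
    bins = defaultdict(list)
    for x, y, z, q in atoms:
        key = (math.floor(x / bin_size),
               math.floor(y / bin_size),
               math.floor(z / bin_size))
        bins[key].append((x, y, z, q))
    return bins


def potential_binned(atoms, points, cutoff):
    """Cutoff potential at each point, visiting only the 27 neighbouring bins."""
    # Assumption: bin edge == cutoff, so any atom within range of a point
    # lies in the point's own bin or one of its 26 neighbours.
    bins = bin_atoms(atoms, cutoff)
    result = []
    for px, py, pz in points:
        bx, by, bz = (math.floor(c / cutoff) for c in (px, py, pz))
        total = 0.0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    # .get() avoids creating empty bins in the defaultdict
                    for x, y, z, q in bins.get((bx + dx, by + dy, bz + dz), ()):
                        r = math.dist((px, py, pz), (x, y, z))
                        if 0.0 < r < cutoff:
                            total += q / r
        result.append(total)
    return result


def potential_brute_force(atoms, points, cutoff):
    """Reference version: every point tests every atom against the cutoff."""
    result = []
    for px, py, pz in points:
        total = 0.0
        for x, y, z, q in atoms:
            r = math.dist((px, py, pz), (x, y, z))
            if 0.0 < r < cutoff:
                total += q / r
        result.append(total)
    return result
```

Both functions return the same values up to floating-point summation order. The binned version does work proportional to the number of atoms near each point, whereas the brute-force version scales with the total atom count. How the bins are laid out in GPU memory and assigned to threads is the subject of the paper and is not modeled here.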
Index Terms
GPU acceleration of cutoff pair potentials for molecular modeling applications