DOI: 10.1109/SC.2004.69
Article

Analysis and Performance Results of a Molecular Modeling Application on Merrimac

Published: 06 November 2004

Abstract

The Merrimac supercomputer uses stream processors and a high-radix network to achieve high performance at low cost and low power. The stream architecture matches the capabilities of modern semiconductor technology with compute-intensive parallel applications. We present a detailed case study of porting the GROMACS molecular-dynamics force calculation to Merrimac. The characteristics of the architecture, which stress locality, parallelism, and decoupling of memory operations and computation, allow for high performance of compiler-optimized code. The rich set of hardware memory operations and the ample computation bandwidth of the Merrimac processor present a wide range of algorithmic trade-offs and optimizations, which may be generalized to several scientific computing domains. We use a cycle-accurate hardware simulator to analyze the performance bottlenecks of the various implementations and to measure application run-time. A comparison with the highly optimized GROMACS code, tuned for an Intel Pentium 4, confirms Merrimac's potential to deliver high performance.

Published In

SC '04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing
November 2004
724 pages
ISBN: 0769521533

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Conference

SC '04
Acceptance Rates

SC '04 Paper Acceptance Rate 60 of 200 submissions, 30%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 05 Mar 2025

Citations

Cited By

  • (2013) Scalability study of molecular dynamics simulation on Godson-T many-core architecture. Journal of Parallel and Distributed Computing 73:11 (1469-1482). DOI: 10.1016/j.jpdc.2012.07.007. Online publication date: 1-Nov-2013
  • (2011) Performance analysis and optimization of molecular dynamics simulation on Godson-T many-core processor. Proceedings of the 8th ACM International Conference on Computing Frontiers (1-10). DOI: 10.1145/2016604.2016643. Online publication date: 3-May-2011
  • (2010) Deadlock avoidance for streaming computations with filtering. Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures (243-252). DOI: 10.1145/1810479.1810526. Online publication date: 13-Jun-2010
  • (2007) Architecture-based optimization for mapping scientific applications to imagine. Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications (32-43). DOI: 10.5555/2395970.2395978. Online publication date: 29-Aug-2007
  • (2007) Laplace transformation on the FT64 stream processor. Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture (52-62). DOI: 10.5555/2392163.2392170. Online publication date: 23-Aug-2007
  • (2007) Implementing and optimizing a data-intensive hydrodynamics application on the stream processor. Proceedings of the 2007 international conference on Computational science and its applications - Volume Part III (353-366). DOI: 10.5555/1793154.1793190. Online publication date: 26-Aug-2007
  • (2007) Implementation and evaluation of Jacobi iteration on the imagine stream processor. Proceedings of the 14th international conference on High performance computing (221-232). DOI: 10.5555/1782174.1782202. Online publication date: 18-Dec-2007
  • (2007) FT64. Proceedings of the 14th international conference on High performance computing (209-220). DOI: 10.5555/1782174.1782201. Online publication date: 18-Dec-2007
  • (2007) Tradeoff between data-, instruction-, and thread-level parallelism in stream processors. Proceedings of the 21st annual international conference on Supercomputing (126-137). DOI: 10.1145/1274971.1274991. Online publication date: 17-Jun-2007
  • (2006) The design space of data-parallel memory systems. Proceedings of the 2006 ACM/IEEE conference on Supercomputing (80-es). DOI: 10.1145/1188455.1188540. Online publication date: 11-Nov-2006
