ABSTRACT
MPI implementations are becoming increasingly complex and highly tunable, so scalability limitations can arise from numerous sources. The MPI Tool Information Interface (MPI_T), introduced as part of the MPI 3.0 standard, gives performance tools and external software an opportunity to introspect and understand MPI runtime behavior at a deeper level and to detect scalability issues. The interface also provides a mechanism to reconfigure the MPI library dynamically at runtime to fine-tune performance. In this paper, we propose an infrastructure that extends existing components (TAU, MVAPICH2, and BEACON) to take advantage of the MPI_T interface and offer runtime introspection, online monitoring, recommendation generation, and autotuning capabilities. We validate our design by developing optimizations for a combination of production and synthetic applications. Using this infrastructure, we implement an autotuning policy for AmberMD [1] that monitors and reduces the internal memory footprint of the MVAPICH2 library by 20% without affecting performance. For applications whose collective communication is latency-sensitive, such as MiniAMR [2], our infrastructure generates recommendations to enable the hardware offloading of collectives supported by MVAPICH2. Implementing this recommendation yields a 5% improvement in application runtime.
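To illustrate the kind of introspection MPI_T enables, the sketch below enumerates the control variables (cvars) an MPI library exposes, much as a tool such as TAU would before tuning any of them. It uses only routines defined in the MPI 3.1 standard; which cvars actually appear is implementation-specific (MVAPICH2 exports its own set), and the sketch assumes an installed MPI library to compile and run against.

```c
/* Sketch: list the MPI_T control variables (cvars) exposed by the linked
 * MPI library. The MPI_T calls are standard; the resulting variable names
 * and their meanings are implementation-specific. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, num_cvars;

    /* MPI_T has its own lifetime and may be initialized before MPI itself. */
    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    MPI_Init(&argc, &argv);

    MPI_T_cvar_get_num(&num_cvars);
    for (int i = 0; i < num_cvars; i++) {
        char name[256], desc[1024];
        int name_len = sizeof(name), desc_len = sizeof(desc);
        int verbosity, bind, scope;
        MPI_Datatype datatype;
        MPI_T_enum enumtype;

        /* name_len/desc_len are in-out: pass buffer sizes, get string lengths. */
        if (MPI_T_cvar_get_info(i, name, &name_len, &verbosity, &datatype,
                                &enumtype, desc, &desc_len, &bind,
                                &scope) == MPI_SUCCESS)
            printf("cvar %3d: %s (scope %d)\n", i, name, scope);
    }

    MPI_Finalize();
    MPI_T_finalize();
    return 0;
}
```

A tuning tool would follow this enumeration with MPI_T_cvar_handle_alloc and MPI_T_cvar_write on a writable cvar of interest; the analogous MPI_T_pvar_* routines provide the read side used for the runtime monitoring described above. No test is attached because the program requires an MPI installation to build and run.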
- David A Case, Thomas E Cheatham, Tom Darden, Holger Gohlke, Ray Luo, Kenneth M Merz, Alexey Onufriev, Carlos Simmerling, Bing Wang, and Robert J Woods. The Amber biomolecular simulation programs. Journal of Computational Chemistry, 26(16):1668--1688, 2005. http://ambermd.org/.
- Michael A Heroux, Douglas W Doerfler, Paul S Crozier, James M Willenbring, H Carter Edwards, Alan Williams, Mahesh Rajan, Eric R Keiter, Heidi K Thornquist, and Robert W Numrich. Improving performance via mini-applications. Sandia National Laboratories, Tech. Rep. SAND2009-5574, 3, 2009. https://mantevo.org/.
- MPI Forum. MPI: A Message-Passing Interface Standard, Version 3.1. June 2015. http://mpi-forum.org/docs/mpi-3.1/mpi31-report.pdf.
- Sameer S. Shende and Allen D. Malony. The TAU Parallel Performance System. Int. J. High Perform. Comput. Appl., 20(2):287--311, May 2006. http://tau.uoregon.edu.
- Jiuxing Liu, Jiesheng Wu, Sushmitha P Kini, Pete Wyckoff, and Dhabaleswar K Panda. High performance RDMA-based MPI implementation over InfiniBand. In Proceedings of the 17th Annual International Conference on Supercomputing, pages 295--304. ACM, 2003.
- Edgar Gabriel, Graham E Fagg, George Bosilca, Thara Angskun, Jack J Dongarra, Jeffrey M Squyres, Vishal Sahay, Prabhanjan Kambadur, Brian Barrett, Andrew Lumsdaine, et al. Open MPI: Goals, concept, and design of a next generation MPI implementation. In European Parallel Virtual Machine/Message Passing Interface Users' Group Meeting, pages 97--104. Springer, 2004.
- William Gropp, Ewing Lusk, Nathan Doss, and Anthony Skjellum. A high-performance, portable implementation of the MPI message passing interface standard. Parallel Computing, 22(6):789--828, 1996.
- Marc Pérache, Hervé Jourdren, and Raymond Namyst. MPC: A Unified Parallel Runtime for Clusters of NUMA Machines. In Proceedings of the 14th International Euro-Par Conference on Parallel Processing, Euro-Par '08, pages 78--88, Berlin, Heidelberg, 2008. Springer-Verlag.
- Rainer Keller, George Bosilca, Graham Fagg, Michael Resch, and Jack J. Dongarra. Implementation and Usage of the PERUSE-Interface in Open MPI. In Proceedings, 13th European PVM/MPI Users' Group Meeting, Lecture Notes in Computer Science, Bonn, Germany, September 2006. Springer-Verlag.
- Tanzima Islam, Kathryn Mohror, and Martin Schulz. Exploring the Capabilities of the New MPI_T Interface. In Proceedings of the 21st European MPI Users' Group Meeting, EuroMPI/ASIA '14, pages 91:91--91:96, New York, NY, USA, 2014. ACM. https://computation.llnl.gov/projects/mpi_t/gyan.
- Esthela Gallardo, Jerome Vienne, Leonardo Fialho, Patricia Teller, and James Browne. MPI Advisor: A Minimal Overhead Tool for MPI Library Performance Tuning. In Proceedings of the 22nd European MPI Users' Group Meeting, EuroMPI '15, pages 6:1--6:10, New York, NY, USA, 2015. ACM.
- Esthela Gallardo, Jérôme Vienne, Leonardo Fialho, Patricia Teller, and James Browne. Employing MPI_T in MPI Advisor to optimize application performance. The International Journal of High Performance Computing Applications, 2016. Online first.
- Jeffrey Vetter and Chris Chambreau. mpiP: Lightweight, Scalable MPI Profiling. 2005. http://mpip.sourceforge.net.
- Mohamad Chaarawi, Jeffrey M. Squyres, Edgar Gabriel, and Saber Feki. A Tool for Optimizing Runtime Parameters of Open MPI, pages 210--217. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008. https://www.open-mpi.org/projects/otpo/.
- M. Gerndt and M. Ott. Automatic Performance Analysis with Periscope. Concurr. Comput.: Pract. Exper., 22(6):736--748, April 2010. http://periscope.in.tum.de/.
- Anna Sikora, Eduardo César, Isaías Comprés, and Michael Gerndt. Autotuning of MPI Applications Using PTF. In Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications, SEM4HPC '16, pages 31--38, New York, NY, USA, 2016. ACM.
- Simone Pellegrini, Thomas Fahringer, Herbert Jordan, and Hans Moritsch. Automatic Tuning of MPI Runtime Parameter Settings by Using Machine Learning. In Proceedings of the 7th ACM International Conference on Computing Frontiers, CF '10, pages 115--116, New York, NY, USA, 2010. ACM.
- Kevin Huck, Sameer Shende, Allen Malony, Hartmut Kaiser, Allan Porterfield, Rob Fowler, and Ron Brightwell. An Early Prototype of an Autonomic Performance Environment for Exascale. In Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers, ROSS '13, pages 8:1--8:8, New York, NY, USA, 2013. ACM. http://khuck.github.io/xpress-apex/.
- Swann Perarnau, Rinku Gupta, Pete Beckman, et al. Argo: An Exascale Operating System and Runtime, 2015. http://sc15.supercomputing.org/sites/all/themes/SC15images/tech_poster/poster_files/post298s2-file2.pdf.
- Swann Perarnau, Rajeev Thakur, Kamil Iskra, Ken Raffenetti, Franck Cappello, Rinku Gupta, Pete Beckman, Marc Snir, Henry Hoffmann, Martin Schulz, and Barry Rountree. Distributed Monitoring and Management of Exascale Systems in the Argo Project. In Proceedings of the 15th IFIP WG 6.1 International Conference on Distributed Applications and Interoperable Systems - Volume 9038, pages 173--178, New York, NY, USA, 2015. Springer-Verlag New York, Inc.
- TACC Stampede cluster. The University of Texas at Austin: http://www.tacc.utexas.edu.
- Richard L Graham, Devendar Bureddy, Pak Lui, Hal Rosenstock, Gilad Shainer, Gil Bloch, Dror Goldenberg, Mike Dubman, Sasha Kotchubievsky, Vladimir Koushnir, et al. Scalable hierarchical aggregation protocol (SHArP): a hardware architecture for efficient data reduction. In Proceedings of the First Workshop on Optimization of Communication in HPC, pages 1--10. IEEE Press, 2016.
- Andreas Knüpfer, Holger Brunst, Jens Doleschal, Matthias Jurenz, Matthias Lieber, Holger Mickler, Matthias S Müller, and Wolfgang E Nagel. The Vampir performance analysis tool-set. In Tools for High Performance Computing, pages 139--155. Springer, 2008. www.vampir.eu.
Index Terms: MPI performance engineering with the MPI tool interface: the integration of MVAPICH and TAU