Research article · DOI: 10.1145/2555243.2555270

Portable, MPI-Interoperable Coarray Fortran

Published: 06 February 2014

ABSTRACT

The past decade has seen the advent of a number of parallel programming models such as Coarray Fortran (CAF), Unified Parallel C, X10, and Chapel. Despite the productivity gains promised by these models, most parallel scientific applications still rely on MPI as their data movement model. One reason for this trend is that it is hard for users to incrementally adopt these new programming models in existing MPI applications. Because each model uses its own runtime system, combining them duplicates resources and is potentially error-prone. Such independent runtime systems were deemed necessary because, in the past, MPI was considered insufficient to play this role for these languages.

The recently released MPI-3, however, adds several new capabilities that now provide all of the functionality needed to act as a runtime, including a much more comprehensive one-sided communication framework. In this paper, we investigate how MPI-3 can form a runtime system for one example programming model, CAF, with a broader goal of enabling a single application to use both MPI and CAF with the highest level of interoperability.
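
To make the mapping concrete, the following is a minimal sketch, not taken from the paper, of how a CAF assignment such as a(:)[img] = b(:) and a subsequent sync all might be lowered onto MPI-3 passive-target RMA in C. The window setup, buffer names, and synchronization choices are illustrative assumptions rather than the authors' implementation.

    /* Hypothetical lowering of a CAF put onto MPI-3 one-sided communication.
     * Coarray memory is exposed as an MPI window; a coarray assignment
     * becomes MPI_Put plus a flush, and "sync all" becomes a flush of all
     * outstanding RMA followed by a barrier. */
    #include <mpi.h>

    #define N 1024

    int main(int argc, char **argv)
    {
        double *a;        /* this image's coarray storage, window-allocated */
        double  b[N];     /* local source buffer */
        MPI_Win win;
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each image's coarray becomes part of an MPI-3 window. */
        MPI_Win_allocate(N * sizeof(double), sizeof(double),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &a, &win);

        /* One long-lived passive-target epoch, so puts and gets need no
           target-side action. */
        MPI_Win_lock_all(MPI_MODE_NOCHECK, win);

        for (int i = 0; i < N; i++)
            b[i] = rank + i;

        /* a(:)[img] = b(:)  ->  one-sided put into the target's window. */
        int img = (rank + 1) % size;
        MPI_Put(b, N, MPI_DOUBLE, img, 0, N, MPI_DOUBLE, win);
        MPI_Win_flush(img, win);       /* complete this put at the target */

        /* sync all  ->  complete all outstanding RMA, then synchronize. */
        MPI_Win_flush_all(win);
        MPI_Barrier(MPI_COMM_WORLD);

        MPI_Win_unlock_all(win);
        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

A real CAF runtime would also have to handle non-contiguous array sections (for example via MPI derived datatypes), image-index translation, and memory-consistency details; the sketch only illustrates that MPI-3's passive-target interface covers the basic put/get and synchronization needs.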


Published in
PPoPP '14: Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2014
412 pages
ISBN: 978-1-4503-2656-8
DOI: 10.1145/2555243
Copyright © 2014 ACM


Publisher

Association for Computing Machinery
New York, NY, United States



Acceptance Rates

PPoPP '14 Paper Acceptance Rate: 28 of 184 submissions, 15%
Overall Acceptance Rate: 230 of 1,014 submissions, 23%
