On using connection-oriented vs. connection-less transport for performance and scalability of collective and one-sided operations: trade-offs and impact
Source
Principles and Practice of Parallel Programming
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
San Jose, California, USA
SESSION: Communication
Pages: 46 - 54
Year of Publication: 2007
ISBN: 978-1-59593-602-8
Authors
Amith R. Mamidala  The Ohio State University, Columbus, OH
Sundeep Narravula  The Ohio State University, Columbus, OH
Abhinav Vishnu  The Ohio State University, Columbus, OH
Gopal Santhanaraman  The Ohio State University, Columbus, OH
Dhabaleswar K. Panda  The Ohio State University, Columbus, OH
Sponsors
ACM: Association for Computing Machinery
SIGPLAN: ACM Special Interest Group on Programming Languages
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1229428.1229437

ABSTRACT

The communication subsystem plays a pivotal role in achieving scalable performance in clusters. The communication semantics employed are dictated by the programming model used by the application, such as MPI or UPC. Among the gamut of communication primitives, collective and one-sided operations are especially significant and have to be designed to harness the capabilities and features exposed by the underlying networks. In some cases, there is a direct match between the semantics of the operations and the underlying network primitives. InfiniBand provides two transport modes: (i) connection-oriented Reliable Connection (RC), supporting Memory and Channel semantics, and (ii) connection-less Unreliable Datagram (UD), supporting Channel semantics. Achieving good performance and scalability requires careful analysis and design of communication primitives based on these options.
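The scalability side of the RC/UD choice can be illustrated with a small counting sketch. It assumes the usual InfiniBand model in which an RC queue pair is bound to exactly one remote peer, while a single UD queue pair can address any peer. The function names are hypothetical and are not taken from the paper; per-queue-pair memory cost is deliberately left out because it is hardware and configuration dependent.

```python
def rc_queue_pairs_per_process(num_procs: int) -> int:
    """Queue pairs one process needs to reach every other process over RC.

    Assumption: one RC queue pair per remote peer (fully connected job).
    """
    return max(num_procs - 1, 0)


def ud_queue_pairs_per_process(num_procs: int) -> int:
    """Queue pairs one process needs over UD.

    Assumption: a single UD queue pair can send to and receive from any peer.
    """
    return 1


def rc_queue_pairs_total(num_procs: int) -> int:
    """Total RC queue pairs across the whole job (grows quadratically)."""
    return num_procs * rc_queue_pairs_per_process(num_procs)


if __name__ == "__main__":
    for n in (16, 256, 4096):
        print(n, rc_queue_pairs_per_process(n), ud_queue_pairs_per_process(n))
```

Under these assumptions the per-process connection state grows linearly with job size for RC and stays constant for UD, which is the kind of memory trade-off the abstract refers to.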

In this paper, we evaluate the scalability and performance trade-offs between the RC and UD transport modes. We study the semantic advantages of mapping collective and one-sided operations onto the memory and channel semantics of InfiniBand (IBA). We take AlltoAll as a case study to demonstrate the benefits of RDMA over Send/Recv and to show the performance/memory trade-offs across IB transports. Our experimental results show that UD-based AlltoAll performs 38% better than Bruck's algorithm for short messages and up to two times better than the direct AlltoAll over RC. Since InfiniBand does not provide RDMA over UD in hardware, we emulate it in our study. Our results show that emulated RDMA Read latency is worse than RDMA Read latency over RC by up to a factor of three, highlighting the need for hardware-based RDMA operations over UD.
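For readers unfamiliar with the two AlltoAll baselines named above, the following is a minimal single-process simulation, not the authors' implementation. It models each rank's buffer as a Python list and shows that Bruck's algorithm reaches the same result as a direct exchange while using about log2(P) communication rounds instead of P - 1 messages per rank. The function names are hypothetical.

```python
def direct_alltoall(send):
    """Direct exchange: rank i receives send[s][i] from every rank s.

    Each rank sends P - 1 separate messages (one per remote peer).
    """
    p = len(send)
    return [[send[s][i] for s in range(p)] for i in range(p)]


def bruck_alltoall(send):
    """Simulate Bruck's AlltoAll; returns (recv, number_of_rounds)."""
    p = len(send)
    # Phase 1: local rotation, so index j holds the block that must
    # travel a distance of j ranks to reach its destination.
    buf = [[send[i][(i + j) % p] for j in range(p)] for i in range(p)]

    # Phase 2: in the round with distance k (1, 2, 4, ...), rank i sends
    # every block whose index has bit k set to rank (i + k) % p.
    rounds = 0
    k = 1
    while k < p:
        nxt = [row[:] for row in buf]
        for i in range(p):
            dst = (i + k) % p
            for j in range(p):
                if j & k:
                    nxt[dst][j] = buf[i][j]
        buf = nxt
        k <<= 1
        rounds += 1

    # Phase 3: local reordering; the block at index j came from
    # rank (i - j) % p.
    recv = [[None] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            recv[i][(i - j) % p] = buf[i][j]
    return recv, rounds


if __name__ == "__main__":
    p = 5
    send = [[(src, dst) for dst in range(p)] for src in range(p)]
    recv, rounds = bruck_alltoall(send)
    print(recv == direct_alltoall(send), rounds)
```

Bruck's algorithm trades fewer, larger messages for extra local copying, which is why it is typically favored for short messages, the regime in which the abstract reports its comparison.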



