Integrating non-blocking synchronisation in parallel applications: performance advantages and methodologies
Source: Workshop on Software and Performance
Proceedings of the 3rd International Workshop on Software and Performance
Rome, Italy
Session: Performance modeling and analysis
Pages: 55-67
Year of Publication: 2002
ISBN: 1-58113-563-7
Authors
Philippas Tsigas  Chalmers University of Technology, Sweden
Yi Zhang  Chalmers University of Technology, Sweden
Sponsors
SIGSOFT: ACM Special Interest Group on Software Engineering
SIGMETRICS: ACM Special Interest Group on Measurement and Evaluation
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/584369.584378

ABSTRACT

In this paper we investigate how the performance and speedup of applications would be affected by using non-blocking rather than blocking synchronisation in parallel systems. The results show that, for many applications, non-blocking synchronisation leads to significant speedups for a fairly large number of processors, while it never slows the applications down. As part of this investigation, the paper also provides a set of efficient and simple translations that show how typical blocking operations found in parallel applications, such as simple locks, queues and lock trees, can be translated into non-blocking equivalents that use hardware primitives common in modern multiprocessor systems. With these translations, the paper demonstrates that it is easy for the application designer or programmer to replace the commonly found blocking operations with non-blocking equivalents. For the empirical results, a set of representative applications running on a large-scale ccNUMA machine was used.
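
To make the idea of replacing a simple lock with a non-blocking equivalent concrete, here is a minimal sketch in C++. It is not taken from the paper. It assumes compare-and-swap is the available hardware primitive (the abstract does not say which primitives are used), and the names `LockedCounter`, `CasCounter` and `run_workers` are hypothetical, chosen only for illustration. The first type protects a shared counter with a mutex; the second updates the same counter with a compare-and-swap retry loop, so no thread ever waits on a lock held by another.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Blocking version: every update holds a mutex for its critical section.
struct LockedCounter {
    std::mutex m;
    long value = 0;

    void add(long delta) {
        std::lock_guard<std::mutex> guard(m);
        value += delta;
    }
    long get() {
        std::lock_guard<std::mutex> guard(m);
        return value;
    }
};

// Non-blocking version (illustrative assumption: compare-and-swap is
// available). A thread reads the current value, computes the new one,
// and retries if another thread changed the value in between.
struct CasCounter {
    std::atomic<long> value{0};

    void add(long delta) {
        long seen = value.load();
        // On failure, compare_exchange_weak stores the current value
        // into `seen`, so the loop recomputes from fresh data.
        while (!value.compare_exchange_weak(seen, seen + delta)) {
        }
    }
    long get() { return value.load(); }
};

// Hypothetical driver: `threads` workers each add 1 `iterations` times.
template <typename Counter>
long run_workers(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    Counter counter;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                counter.add(1);
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();
    }
    return counter.get();
}
```

Calling `run_workers<LockedCounter>(4, 10000)` and `run_workers<CasCounter>(4, 10000)` should both return 40000. The difference is that a thread delayed in the middle of `CasCounter::add` cannot prevent the other threads from making progress. This sketch says nothing about relative speed, which is what the paper measures on real applications.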


