ABSTRACT
Shared-memory hardware supports many different styles of parallel programming. Each style has strengths, but the styles can conflict when mixed. How can a single program combine several of them while minimizing their conflicts and maximizing performance, readability, and flexibility? This paper surveys the relative advantages and disadvantages of three styles (SIMD, fork-join, and message passing), shows how to compose them hierarchically, and advises how to choose what goes at each level of the hierarchy.
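As a rough illustration of hierarchical composition, the sketch below nests the three styles in one small Python program. The ordering shown (message passing outermost, fork-join in the middle, element-wise work innermost) is one plausible layering assumed here for illustration; the abstract itself does not fix it. All names (`scale_chunk`, `fork_join_scale`, `stage`, `run_pipeline`) are hypothetical, and the innermost list comprehension only stands in for a SIMD kernel, since plain Python does not expose vector instructions. The example shows structure, not performance.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def scale_chunk(chunk, factor):
    # Innermost layer (stand-in for SIMD): one operation applied uniformly
    # to every element of a small, contiguous chunk.
    return [x * factor for x in chunk]


def fork_join_scale(data, factor, pool, chunk_size=4):
    # Middle layer (fork-join): fork one task per chunk, then join them
    # in submission order so the output order matches the input order.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    futures = [pool.submit(scale_chunk, c, factor) for c in chunks]
    result = []
    for f in futures:
        result.extend(f.result())
    return result


def stage(inbox, outbox, factor):
    # Outermost layer (message passing): the stage communicates only
    # through its two queues. None is used as an end-of-stream marker.
    with ThreadPoolExecutor(max_workers=2) as pool:
        while True:
            msg = inbox.get()
            if msg is None:
                outbox.put(None)
                break
            outbox.put(fork_join_scale(msg, factor, pool))


def run_pipeline(batches, factor=2):
    # Send each batch as a message, then collect replies until the marker.
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=stage, args=(inbox, outbox, factor))
    worker.start()
    for batch in batches:
        inbox.put(batch)
    inbox.put(None)
    results = []
    while True:
        out = outbox.get()
        if out is None:
            break
        results.append(out)
    worker.join()
    return results
```

Calling `run_pipeline([[1, 2, 3, 4, 5], [6]])` sends two messages through the stage and returns one scaled list per message. Each layer knows only the interface of the layer beneath it, which is the property that lets the styles coexist without interfering.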
Index Terms
- Three layer cake for shared-memory programming