
The Cilk++ concurrency platform

Published: 26 July 2009

Abstract

The availability of multicore processors across a wide range of computing platforms has created a strong demand for software frameworks that can harness these resources. This paper overviews the Cilk++ programming environment, which incorporates a compiler, a runtime system, and a race-detection tool. The Cilk++ runtime system guarantees to load-balance computations effectively. To cope with legacy codes containing global variables, Cilk++ provides a "hyperobject" library which allows races on nonlocal variables to be mitigated without lock contention or substantial code restructuring.
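As a rough illustration of the hyperobject idea mentioned in the abstract, the following standard C++ sketch gives each worker a private view of a sum and combines the views only after all workers have finished, so no update to the shared total needs a lock. This is not the Cilk++ API: `SumReducer`, `view`, `get_value`, and `parallel_sum` are hypothetical names, and plain `std::thread` workers with one fixed view each stand in for the Cilk++ runtime, which manages views through its own scheduler.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a sum reducer (not the Cilk++ library type).
// Each worker updates only its own private view of the sum.
class SumReducer {
public:
    explicit SumReducer(std::size_t workers) : views_(workers, 0) {}

    // Worker w touches only views_[w], so no lock is required.
    long& view(std::size_t worker) { return views_[worker]; }

    // Combine all views. Call only after every worker has joined.
    long get_value() const {
        long total = 0;
        for (long v : views_) total += v;
        return total;
    }

private:
    std::vector<long> views_;
};

// Sum `data` using `workers` threads (assumed to be at least 1).
long parallel_sum(const std::vector<int>& data, std::size_t workers) {
    SumReducer sum(workers);
    std::vector<std::thread> pool;
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&data, &sum, chunk, w] {
            const std::size_t begin = std::min(w * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            long& local = sum.view(w);  // this worker's private view
            for (std::size_t i = begin; i < end; ++i) local += data[i];
        });
    }
    for (std::thread& t : pool) t.join();

    return sum.get_value();  // views are merged only after all joins
}
```

Because addition is associative, the combined result equals the serial sum regardless of how the work is divided among the workers, which is the property that lets a reducer replace a racy global accumulator without restructuring the surrounding code.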

Published In

DAC '09: Proceedings of the 46th Annual Design Automation Conference
July 2009
994 pages
ISBN:9781605584973
DOI:10.1145/1629911

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Amdahl's Law
  2. dag model
  3. hyperobject
  4. multicore programming
  5. multithreading
  6. parallel programming
  7. parallelism
  8. race detection
  9. reducer
  10. span
  11. speedup
  12. work

Qualifiers

  • Research-article

Conference

DAC '09: The 46th Annual Design Automation Conference 2009
July 26 - 31, 2009
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
