research-article

Helper locks for fork-join parallel programming

Authors:

Kunal Agrawal,

Charles E. Leiserson,

Jim SukhaAuthors Info & Claims

PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Pages 245 - 256

https://doi.org/10.1145/1693453.1693487

Published: 09 January 2010 Publication History

Get Access

Abstract

Helper locks allow programs with large parallel critical sections, called parallel regions, to execute more efficiently by enlisting processors that might otherwise be waiting on the helper lock to aid in the execution of the parallel region. Suppose that a processor p is executing a parallel region A after having acquired the lock L protecting A. If another processor p′ tries to acquire L, then instead of blocking and waiting for p to complete A, processor p′ joins p to help it complete A. Additional processors not blocked on L may also help to execute A.

The HELPER runtime system can execute fork-join computations augmented with helper locks and parallel regions. HELPER supports the unbounded nesting of parallel regions. We provide theoretical completion-time and space-usage bounds for a design of HELPER based on work stealing. Specifically, let V be the number of parallel regions in a computation, let T₁ be its work, and let T∞ be its "aggregate span" --- the sum of the spans (critical-path lengths) of all its parallel regions. We prove that HELPER completes the computation in expected time O(T₁/PP + T∞+ PV on P processors. This bound indicates that programs with a small number of highly parallel critical sections can attain linear speedup. For the space bound, we prove that HELPER completes a program using only O(PS₁ stack space, where S₁ is the sum, over all regions, of the stack space used by each region in a serial execution. Finally, we describe a prototype of HELPER implemented by modifying the Cilk multithreaded runtime system. We used this prototype to implement a concurrent hash table with a resize operation protected by a helper lock.

References

[1]

E. Allen, D. Chase, J. Hallett, V. Luchango, J.-W. Maessen, S. Ryu, G. L. S. Jr., and S. Tobin-Hochstadt. The Fortress language specification, version 1.0. Technical report, Sun Microsystems, Inc., March 2008. URL http://research.sun.com/projects/plrg/Publications/fortress.1.0.pdf.

Google Scholar

[2]

N. S. Arora, R. D. Blumofe, and C. G. Plaxton. Thread scheduling for multiprogrammed multiprocessors. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 119--129, Puerto Vallarta, Mexico, 1998.

Digital Library

Google Scholar

[3]

G. Barnes. A method for implementing lock-free shared-data structures. In SPAA '93: Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 261--270, New York, NY, USA, 1993. ACM. ISBN 0-89791-599-2.

Digital Library

Google Scholar

[4]

R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. Journal of the ACM, 46(5):720--748, 1999.

Digital Library

Google Scholar

[5]

K. Ebcioglu, V. Saraswat, and V. Sarkar. X10: an experimental language for high productivity programming of scalable systems. In Proceedings of the Second Workshop on Productivity and Performance in High-End Computing (PPHEC-05), Feb. 2005. Held in conjunction with the Eleventh Symposium on High Performance Computer Architecture.

Google Scholar

[6]

M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 212--223, 1998.

Digital Library

Google Scholar

[7]

M. Herlihy, N. Shavit, and M. Tzafrir. Hopscotch hashing. In DISC '08: Proceedings of the 22nd International Symposium on Distributed Computing, pages 350--364, Berlin, Heidelberg, 2008. Springer-Verlag.

Digital Library

Google Scholar

[8]

A. Israeli and L. Rappoport. Disjoint-access-parallel implementations of strong shared memory primitives. In PODC '94: Proceedings of the Thirteenth Annual ACM Symposium on Principles of Distributed Computing, pages 151--160, New York, NY, USA, 1994. ACM.

Digital Library

Google Scholar

[9]

C. E. Leiserson. The Cilk++ concurrency platform. In DAC '09: Proceedings of the 46th Annual Design Automation Conference, pages 522--527, New York, NY, 2009. ACM.

Digital Library

Google Scholar

[10]

S.-C. Lim, J. Ahn, and M. H. Kim. A concurrent Blink-tree algorithm using a cooperative locking protocol. In Lecture Notes in Computer Science, volume 2712, pages 253--260. Springer Berlin / Heidelberg, 2003.

Digital Library

Google Scholar

[11]

OpenMP Architecture Review Board. OpenMP application program interface, version 3.0. http://www.openmp.org/mp-documents/spec30.pdf, May 2008.

Google Scholar

[12]

J. Reinders. Intel Threading Building Blocks: Outfitting C++ for Multi-Core Processor Parallelism. O'Reilly, 2007.

Digital Library

Google Scholar

[13]

J. Turek, D. Shasha, and S. Prakash. Locking without blocking: making lock based concurrent data structure algorithms nonblocking. In PODS '92: Proceedings of the Eleventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 212--222, New York, NY, USA, 1992. ACM.

Digital Library

Google Scholar

Cited By

View all

Tessler CModekurthy PFisher NSaifullah AMurphy A(2023)Co-Located Parallel Scheduling of Threads to Optimize Cache Sharing2023 IEEE Real-Time Systems Symposium (RTSS)10.1109/RTSS59052.2023.00030(251-264)Online publication date: 5-Dec-2023
https://doi.org/10.1109/RTSS59052.2023.00030
Yang JHe Q(2018)Scheduling Parallel Computations by Work StealingInternational Journal of Parallel Programming10.1007/s10766-016-0484-846:2(173-197)Online publication date: 1-Apr-2018
https://dl.acm.org/doi/10.1007/s10766-016-0484-8
Chen CChoudhury VNewton R(2017)Adaptive lock-free data structures in Haskell: a general method for concurrent implementation swappingACM SIGPLAN Notices10.1145/3156695.312297352:10(197-211)Online publication date: 7-Sep-2017
https://dl.acm.org/doi/10.1145/3156695.3122973
Show More Cited By

Index Terms

Helper locks for fork-join parallel programming
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        Parallel programming languages

Recommendations

Helper locks for fork-join parallel programming
PPoPP '10

Helper locks allow programs with large parallel critical sections, called parallel regions, to execute more efficiently by enlisting processors that might otherwise be waiting on the helper lock to aid in the execution of the parallel region. Suppose ...
Brief announcement: a lower bound for depth-restricted work stealing
SPAA '09: Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures

Work stealing is a common technique used in the runtime schedulers of parallel languages such as Cilk and parallel libraries such as Intel Threading Building Blocks (TBB). Depth-restricted work stealing is a restriction of Cilk-like work stealing in ...
Brief announcement: serial-parallel reciprocity in dynamic multithreaded languages
SPAA '10: Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures

In dynamically multithreaded platforms that employ work stealing, there appears to be a fundamental tradeoff between providing provably good time and space bounds and supporting SP-reciprocity, the property of allowing arbitrary calling between parallel ...

Comments

Information & Contributors

Information

Published In

PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

January 2010

372 pages

ISBN:9781605588773

DOI:10.1145/1693453

General Chairs:
R. Govindarajan
Indian Institute of Science
,
David Padua
UIUC
,
Program Chair:
Mary Hall
University of Utah

ACM SIGPLAN Notices Volume 45, Issue 5
PPoPP '10
May 2010
346 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1837853
Issue’s Table of Contents

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 January 2010

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

PPoPP '10

Sponsor:

SIGPLAN

PPoPP '10: ACM SIGPLAN Principles and Practice of Parallel Computing

January 9 - 14, 2010

Bangalore, India

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Upcoming Conference

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

10
Total Citations
View Citations
495
Total Downloads

Downloads (Last 12 months)10
Downloads (Last 6 weeks)2

Reflects downloads up to 22 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

View all

Tessler CModekurthy PFisher NSaifullah AMurphy A(2023)Co-Located Parallel Scheduling of Threads to Optimize Cache Sharing2023 IEEE Real-Time Systems Symposium (RTSS)10.1109/RTSS59052.2023.00030(251-264)Online publication date: 5-Dec-2023
https://doi.org/10.1109/RTSS59052.2023.00030
Yang JHe Q(2018)Scheduling Parallel Computations by Work StealingInternational Journal of Parallel Programming10.1007/s10766-016-0484-846:2(173-197)Online publication date: 1-Apr-2018
https://dl.acm.org/doi/10.1007/s10766-016-0484-8
Chen CChoudhury VNewton R(2017)Adaptive lock-free data structures in Haskell: a general method for concurrent implementation swappingACM SIGPLAN Notices10.1145/3156695.312297352:10(197-211)Online publication date: 7-Sep-2017
https://dl.acm.org/doi/10.1145/3156695.3122973
Chen CChoudhury VNewton RDiatchki I(2017)Adaptive lock-free data structures in Haskell: a general method for concurrent implementation swappingProceedings of the 10th ACM SIGPLAN International Symposium on Haskell10.1145/3122955.3122973(197-211)Online publication date: 7-Sep-2017
https://dl.acm.org/doi/10.1145/3122955.3122973
Belli RHoefler T(2015)Notified AccessProceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium10.1109/IPDPS.2015.30(871-881)Online publication date: 25-May-2015
https://dl.acm.org/doi/10.1109/IPDPS.2015.30
Agrawal KFineman JLu KSheridan BSukha JUtterback RBlelloch GSanders P(2014)Provably good scheduling for parallel programs that use data structures through implicit batchingProceedings of the 26th ACM symposium on Parallelism in algorithms and architectures10.1145/2612669.2612688(84-95)Online publication date: 23-Jun-2014
https://dl.acm.org/doi/10.1145/2612669.2612688
Chen QHuang ZGuo MZhou J(2011)CABProceedings of the 2011 International Conference on Parallel Processing10.1109/ICPP.2011.32(722-732)Online publication date: 13-Sep-2011
https://dl.acm.org/doi/10.1109/ICPP.2011.32
Agrawal KLee ISukha Jauf der Heide FPhillips C(2010)Brief announcementProceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures10.1145/1810479.1810517(186-188)Online publication date: 13-Jun-2010
https://dl.acm.org/doi/10.1145/1810479.1810517
Batista EAlchieri EDotti FPedone F(2022)Early scheduling on steroids: Boosting parallel state machine replicationJournal of Parallel and Distributed Computing10.1016/j.jpdc.2022.02.001163(269-282)Online publication date: May-2022
https://doi.org/10.1016/j.jpdc.2022.02.001
Goodrich MJacob RSitchinava NMarx D(2021)Atomic power in forksProceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3458064.3458192(2141-2153)Online publication date: 10-Jan-2021
https://dl.acm.org/doi/10.5555/3458064.3458192

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Abstract

References

Cited By

Index Terms

Recommendations