Article

Reordering constraints for pthread-style locks

Author:

Hans-J. Boehm

PPoPP '07: Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming

Pages 173 - 182

https://doi.org/10.1145/1229428.1229470

Published: 14 March 2007

Abstract

C or C++ programs relying on the pthreads interface for concurrency are required to use a specified set of functions to avoid data races, and to ensure memory visibility across threads. Although the detailed rules are not completely clear, it is not hard to refine them to a simple set of clear and uncontroversial rules for at least a subset of the C language that excludes structures (and hence bit-fields).

We precisely address the question of how locks in this subset must be implemented, and particularly when other memory operations can be reordered with respect to locks. This impacts the memory fences required in lock implementations, and hence has significant performance impact. Along the way, we show that a significant class of common compiler transformations is actually safe in the presence of pthreads, something which appears to have received minimal attention in the past.

We show that, surprisingly to us, the reordering constraints are not symmetric for the lock and unlock operations. In particular, it is not always safe to move memory operations into a locked region by delaying them past a pthread_mutex_lock() call, but it is safe to move them into such a region by advancing them to before a pthread_mutex_unlock() call. We believe that this was not previously recognized, and there is evidence that it is under-appreciated among implementors of thread libraries.

Although our precise arguments are expressed in terms of statement reordering within a small subset language, we believe that our results capture the situation for a full C/C++ implementation. We also argue that our results are insensitive to the details of our literal (and reasonable, though possibly unintended) interpretation of the pthread standard. We believe that they accurately reflect hardware memory ordering constraints in addition to compiler constraints. And they appear to have implications beyond pthread environments.

Index Terms

Reordering constraints for pthread-style locks
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
    2. General programming languages
      1. Language features
        Concurrent programming structures

Published In

PPoPP '07: Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming

March 2007

284 pages

ISBN:9781595936028

DOI:10.1145/1229428

General Chair: Katherine Yelick, UC Berkeley and Lawrence Berkeley National Lab., USA

Program Chair: John Mellor-Crummey, Rice University, USA

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

PPoPP07: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

March 14 - 17, 2007

San Jose, California, USA

Acceptance Rates

PPoPP '07 Paper Acceptance Rate 22 of 65 submissions, 34%;

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Cited By

Huang H, Rao J, Wu S, Jin H, Jiang H, Che H, Wu X, Laure E, Markidis S, Verbanescu A, Lofstead G (2021). Towards Exploiting CPU Elasticity via Efficient Thread Oversubscription. Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing, pages 215-226. Online publication date: 21-Jun-2021.
https://dl.acm.org/doi/10.1145/3431379.3460641
Cataldo R, Fernandes R, Martin K, Silveira J, Sanchez G, Sepulveda J, Marcon C, Diguet J (2021). Subutai: Speeding Up Legacy Parallel Applications Through Data Synchronization. IEEE Transactions on Parallel and Distributed Systems, 32:5, pages 1102-1116. Online publication date: 1-May-2021.
https://doi.org/10.1109/TPDS.2020.3040066
Hartung M, Schintke F, Schutt T (2019). Pinpoint Data Races via Testing and Classification. 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pages 386-393. Online publication date: Oct-2019.
https://doi.org/10.1109/ISSREW.2019.00100
Poetzl D, Kroening D (2016). Formalizing and Checking Thread Refinement for Data-Race-Free Execution Models. Proceedings of the 22nd International Conference on Tools and Algorithms for the Construction and Analysis of Systems - Volume 9636, pages 515-530. Online publication date: 2-Apr-2016.
https://dl.acm.org/doi/10.1007/978-3-662-49674-9_30
Kasikci B, Zamfir C, Candea G (2015). Automated Classification of Data Races Under Both Strong and Weak Memory Models. ACM Transactions on Programming Languages and Systems, 37:3, pages 1-44. Online publication date: 22-May-2015.
https://dl.acm.org/doi/10.1145/2734118
Boehm H, Zhang L, Mutlu O (2012). Can seqlocks get along with programming language memory models? Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, pages 12-20. Online publication date: 16-Jun-2012.
https://dl.acm.org/doi/10.1145/2247684.2247688
Alglave J, Maranget L (2011). Stability in weak memory models. Proceedings of the 23rd international conference on Computer aided verification, pages 50-66. Online publication date: 14-Jul-2011.
https://dl.acm.org/doi/10.5555/2032305.2032311
Yu L, Yeung S, Tang C, Terzopoulos D, Chan T, Osher S (2011). Make it home. ACM Transactions on Graphics, 30:4, pages 1-12. Online publication date: 25-Jul-2011.
https://dl.acm.org/doi/10.1145/2010324.1964981
Lau M, Ohgawara A, Mitani J, Igarashi T (2011). Converting 3D furniture models to fabricatable parts and connectors. ACM Transactions on Graphics, 30:4, pages 1-6. Online publication date: 25-Jul-2011.
https://dl.acm.org/doi/10.1145/2010324.1964980
Boehm H, Vetter J, Musuvathi M, Shen X (2011). Performance implications of fence-based memory models. Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, pages 13-19. Online publication date: 5-Jun-2011.
https://dl.acm.org/doi/10.1145/1988915.1988919
