The scalable commutativity rule: designing scalable software for multicore processors

Authors:
Austin T. Clements (MIT CSAIL), M. Frans Kaashoek (MIT CSAIL), Nickolai Zeldovich (MIT CSAIL), Robert T. Morris (MIT CSAIL), Eddie Kohler (Harvard University)

SOSP '13: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, November 2013, Pages 1–17. https://doi.org/10.1145/2517349.2522712

Published: 03 November 2013

ABSTRACT

What fundamental opportunities for scalability are latent in interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions this paper introduces the following rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building more scalable software starting from interface design and carrying on through implementation, testing, and evaluation.

To help developers apply the rule, a new tool named Commuter accepts high-level interface models and generates tests of operations that commute and hence could scale. Using these tests, Commuter can evaluate the scalability of an implementation. We apply Commuter to 18 POSIX calls and use the results to guide the implementation of a new research operating system kernel called sv6. Linux scales for 68% of the 13,664 tests generated by Commuter for these calls, and Commuter finds many problems that have been observed to limit application scalability. sv6 scales for 99% of the tests.
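
As a rough illustration of what it means for interface operations to commute, the following sketch checks a toy directory model by running two operations in both orders and comparing the per-operation results and the final state. The `Directory` class, the `commutes` helper, and the operation names are hypothetical and exist only for this example. This is a simplified state-based check, not the paper's formal definition and not how Commuter itself works.

```python
import copy


class Directory:
    """Toy model of a directory as a set of names (hypothetical, for illustration)."""

    def __init__(self, names=()):
        self.names = set(names)

    def create(self, name):
        """Exclusive create: succeeds only if the name is not already present."""
        if name in self.names:
            return False
        self.names.add(name)
        return True

    def exists(self, name):
        """Report whether the name is present; does not modify the directory."""
        return name in self.names


def commutes(state, op_a, op_b):
    """Return True if op_a and op_b yield the same per-operation results and
    the same final state in both orders, starting from a copy of `state`."""
    s1 = copy.deepcopy(state)
    a1 = op_a(s1)
    b1 = op_b(s1)

    s2 = copy.deepcopy(state)
    b2 = op_b(s2)
    a2 = op_a(s2)

    return (a1, b1) == (a2, b2) and s1.names == s2.names


# Operations under test (assumed names, chosen for this sketch).
create_x = lambda d: d.create("x")
create_y = lambda d: d.create("y")
exists_x = lambda d: d.exists("x")

if __name__ == "__main__":
    # Creating two different names commutes.
    print(commutes(Directory(), create_x, create_y))       # True
    # Two exclusive creates of the same name do not: each result depends on order.
    print(commutes(Directory(), create_x, create_x))       # False
    # Whether create and exists commute depends on the starting state.
    print(commutes(Directory(), create_x, exists_x))       # False
    print(commutes(Directory({"x"}), create_x, exists_x))  # True
```

The last two calls show that commutativity can depend on the starting state and arguments, not only on which operations are involved. Under the rule stated in the abstract, the commutative cases are the ones where a scalable implementation is possible.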

Index Terms

The scalable commutativity rule: designing scalable software for multicore processors
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Concurrency control

Reviews

Reviewer: Veronica Lagrange

Connections between commutativity and concurrency are well known and well established in the computing literature. Often, commutativity is used to demonstrate the feasibility of executing instructions concurrently. Clements et al. introduce the idea of using commutativity as an interface design requirement to improve scalability. In their words, “whenever interface operations commute, they can be implemented in a way that scales.” Concurrent operations commute if the aftermath is independent of their execution order. This means that it is not possible to tell a posteriori in which order they were executed. This is achieved by conflict-free implementation, the “holy grail” of scalability. This paper demonstrates that there are some opportunities out there for more scalable interfaces.

After presenting some definitions and proofs, the authors describe a tool for automatically developing test cases and testing if a particular implementation commutes. As an example, this tool is applied to a model of various POSIX file system and memory system calls to determine their commutativity. Those are compared and benchmarked against equivalent routines from an experimental operating system implemented according to the COMMUTE rules defined in the paper. Though performance tradeoffs are only briefly discussed, these preliminary results look promising.

Online Computing Reviews Service


Published in
SOSP '13: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
November 2013
498 pages
ISBN:9781450323888
DOI:10.1145/2517349
General Chair: Michael Kaminsky (Intel Labs)
Program Chair: Mike Dahlin (Google and UT Austin)
Copyright © 2013 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 131 of 716 submissions, 18%