skip to main content
10.1145/2517349.2522712acmconferencesArticle/Chapter ViewAbstractPublication PagessospConference Proceedingsconference-collections
research-article
Open Access

The scalable commutativity rule: designing scalable software for multicore processors

Published:03 November 2013Publication History

ABSTRACT

What fundamental opportunities for scalability are latent in interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions this paper introduces the following rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building more scalable software starting from interface design and carrying on through implementation, testing, and evaluation.

To help developers apply the rule, a new tool named Commuter accepts high-level interface models and generates tests of operations that commute and hence could scale. Using these tests, Commuter can evaluate the scalability of an implementation. We apply Commuter to 18 POSIX calls and use the results to guide the implementation of a new research operating system kernel called sv6. Linux scales for 68% of the 13,664 tests generated by Commuter for these calls, and Commuter finds many problems that have been observed to limit application scalability. sv6 scales for 99% of the tests.

Skip Supplemental Material Section

Supplemental Material

d1-01-austin-clements.mp4

mp4

1.2 GB

References

  1. H. Attiya, E. Hillel, and A. Milani. Inherent limitations on disjoint-access parallel implementations of transactional memory. In Proceedings of the 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures, Calgary, Canada, August 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. H. Attiya, R. Guerraoui, D. Hendler, P. Kuznetsov, M. M. Michael, and M. Vechev. Laws of order: Expensive synchronization in concurrent algorithms cannot be eliminated. In Proceedings of the 38th ACM Symposium on Principles of Programming Languages, Austin, TX, January 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, S. Peter, T. Roscoe, A. Schüpbach, and A. Singhania. The Multikernel: A new OS architecture for scalable multicore systems. In Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP), Big Sky, MT, October 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. F. Bellard et al. QEMU. http://www.qemu.org/.Google ScholarGoogle Scholar
  5. D. J. Bernstein. Some thoughts on security after ten years of qmail 1.0. In Proceedings of the ACM Workshop on Computer Security Architecture, Fairfax, VA, November 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. P. A. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Computing Surveys, 13(2):185--221, June 1981. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Boyd-Wickizer. Optimizing Communication Bottlenecks in Multiprocessor Operating System Kernels. PhD thesis, Massachusetts Institute of Technology, February 2014.Google ScholarGoogle Scholar
  8. S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, M. F. Kaashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. Corey: An operating system for many cores. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI), San Diego, CA, December 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. S. Boyd-Wickizer, A. Clements, Y. Mao, A. Pesterev, M. F. Kaashoek, R. Morris, and N. Zeldovich. An analysis of Linux scalability to many cores. In Proceedings of the 9th Symposium on Operating Systems Design and Implementation (OSDI), Vancouver, Canada, October 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. C. Cadar, V. Ganesh, P. M. Pawlowski, D. L. Dill, and D. R. Engler. EXE: Automatically generating inputs of death. In Proceedings of the 13th ACM Conference on Computer and Communications Security, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. C. Cadar, D. Dunbar, and D. Engler. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI), San Diego, CA, December 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. B. Cantrill and J. Bonwick. Real-world concurrency. Communications of the ACM, 51(11):34--39, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In Proceedings of the 5th ACM SIGPLAN International Conference on Functional Programming, Montreal, Canada, September 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. T. Clements, M. F. Kaashoek, and N. Zeldovich. Concurrent address spaces using RCU balanced trees. In Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), London, UK, March 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. T. Clements, M. F. Kaashoek, and N. Zeldovich. RadixVM: Scalable address spaces for multithreaded applications. In Proceedings of the ACM EuroSys Conference, Prague, Czech Republic, April 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Corbet. The search for fast, scalable counters, May 2010. http://lwn.net/Articles/170003/.Google ScholarGoogle Scholar
  17. J. Corbet. Dcache scalability and RCU-walk, April 23, 2012. http://lwn.net/Articles/419811/.Google ScholarGoogle Scholar
  18. R. Cox, M. F. Kaashoek, and R. T. Morris. Xv6, a simple Unix-like teaching operating system. http://pdos.csail.mit.edu/6.828/2012/xv6.html.Google ScholarGoogle Scholar
  19. L. de Moura and N. Bjørner. Z3: An efficient SMT solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, Budapest, Hungary, March--April 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. DWARF Debugging Information Format Committee. DWARF debugging information format, version 4, June 2010.Google ScholarGoogle Scholar
  21. F. Ellen, Y. Lev, V. Luchango, and M. Moir. SNZI: Scalable nonzero indicators. In Proceedings of the 26th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Portland, OR, August 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, Chicago, IL, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. Herlihy and E. Koskinen. Transactional boosting: A methodology for highly-concurrent transactional objects. In Proceedings of the 13th ACM Symposium on Principles and Practice of Parallel Programming, Salt Lake City, UT, February 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. M. P. Herlihy and J. M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages Systems, 12(3):463--492, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. D. Howells. Extended file stat functions, Linux patch. https://lkml.org/lkml/2010/7/14/539.Google ScholarGoogle Scholar
  26. A. Israeli and L. Rappoport. Disjoint-access-parallel implementations of strong shared memory primitives. In Proceedings of the 13th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Los Angeles, CA, August 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. P. Koopman, A. Alimarine, J. Tretmans, and R. Plasmeijer. Gast: Generic automated software testing. In Proceedings of the 14th International Workshop on the Implementation of Functional Languages, Madrid, Spain, September 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. C. Lameter. Effective synchronization on Linux/NUMA systems. In Gelato Conference, May 2005. http://www.lameter.com/gelato2005.pdf.Google ScholarGoogle Scholar
  29. P. E. McKenney. Differential profilng. Software: Practice and Experience, 29(3):219--234, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. P. E. McKenney. Concurrent code and expensive instructions. https://lwn.net/Articles/423994/, January 2011.Google ScholarGoogle Scholar
  31. P. E. McKenney, D. Sarma, A. Arcangeli, A. Kleen, O. Krieger, and R. Russell. Read-copy update. In Proceedings of the Linux Symposium, Ottawa, Canada, June 2002.Google ScholarGoogle Scholar
  32. J. M. Mellor-Crummey and M. L. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Transactions on Computer Systems, 9(1):21--65, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. M. S. Papamarcos and J. H. Patel. A low-overhead coherence solution for multiprocessors with private cache memories. In Proceedings of the 11th Annual International Symposium on Computer Architecture, Ann Arbor, MI, June 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. P. Prabhu, S. Ghosh, Y. Zhang, N. P. Johnson, and D. I. August. Commutative set: A language extension for implicit parallel programming. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation, San Jose, CA, June 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. M. C. Rinard and P. C. Diniz. Commutativity analysis: A new analysis technique for parallelizing compilers. ACM Transactions on Programming Languages and Systems, 19(6):942--991, November 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. A. Roy, S. Hand, and T. Harris. Exploring the limits of disjoint access parallelism. In Proceedings of the 1st USENIX Workshop on Hot Topics in Parallelism, Berkeley, CA, March 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. K. Sen, D. Marinov, and G. Agha. CUTE: A concolic unit testing engine for C. In Proceedings of the 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Lisbon, Portugal, September 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. M. Shapiro, N. Preguica, C. Baquero, and M. Zawirski. Conflict-free replicated data types. In Proceedings of the 13th International Conference on Stabilization, Safety, and Security of Distributed Systems, Grenoble, France, October 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. M. Shapiro, N. Preguica, C. Baquero, and M. Zawirski. Convergent and commutative replicated data types. Bulletin of the EATCS, 104:67--88, June 2011.Google ScholarGoogle Scholar
  40. G. L. Steele, Jr. Making asynchronous parallelism safe for the world. In Proceedings of the 17th ACM Symposium on Principles of Programming Languages, San Francisco, CA, January 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. G. Tene, B. Iyengar, and M. Wolf. C4: The continuously concurrent compacting collector. SIGPLAN Notices, 46(11):79--88, June 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. W. E. Weihl. Commutativity-based concurrency control for abstract data types. IEEE Transactions on Computers, 37(12):1488--1505, December 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. D. Wentzlaff and A. Agarwal. Factored operating systems (fos): The case for a scalable operating system for multicores. ACM SIGOPS Operating System Review, 43(2):76--85, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. The scalable commutativity rule: designing scalable software for multicore processors

    Recommendations

    Reviews

    Veronica Lagrange

    Connections between commutativity and concurrency are well known and well established in the computing literature. Often, commutativity is used to demonstrate the feasibility of executing instructions concurrently. Clements et al. introduce the idea of using commutativity as an interface design requirement to improve scalability. In their words, “whenever interface operations commute, they can be implemented in a way that scales.” Concurrent operations commute if the aftermath is independent of their execution order. This means that it is not possible to tell a posteriori in which order they were executed. This is achieved by conflict-free implementation, the “holy grail” of scalability. This paper demonstrates that there are some opportunities out there for more scalable interfaces. After presenting some definitions and proofs, the authors describe a tool for automatically developing test cases and testing if a particular implementation commutes. As an example, this tool is applied to a model of various POSIX file system and memory system calls to determine their commutativity. Those are compared and benchmarked against equivalent routines from an experimental operating system implemented according to the COMMUTE rules defined in the paper. Though performance tradeoffs are only briefly discussed, these preliminary results look promising. Online Computing Reviews Service

    Access critical reviews of Computing literature here

    Become a reviewer for Computing Reviews.

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SOSP '13: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
      November 2013
      498 pages
      ISBN:9781450323888
      DOI:10.1145/2517349

      Copyright © 2013 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 3 November 2013

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate131of716submissions,18%

      Upcoming Conference

      SOSP '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader