Abstract
Threads are the vehicle for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expression and control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead.

Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has been fully satisfactory. This paper addresses this dilemma. First, we argue that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; we thus argue that managing parallelism at the user level is essential to high-performance parallel computing. Next, we argue that the lack of system integration exhibited by user-level threads is a consequence of the lack of kernel support for user-level threads provided by contemporary multiprocessor operating systems; we thus argue that kernel threads or processes, as currently conceived, are the wrong abstraction on which to support user-level management of parallelism. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel threads without compromising the performance and flexibility advantages of user-level management of parallelism.
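The kernel interface described above is built around upcalls: on every event that affects the application's processor allocation (a processor is added, an activation is preempted, blocks in the kernel, or unblocks), the kernel notifies the user-level thread scheduler, which then decides which user-level thread to run. The sketch below illustrates that control flow only; the event names, the ready-queue representation, and the handler signature are illustrative assumptions, not the paper's actual identifiers.

```c
#include <assert.h>

/* Kernel-to-user upcall events (illustrative names, not the paper's). */
typedef enum {
    UPCALL_ADD_PROCESSOR,  /* a new processor is available to the application */
    UPCALL_PREEMPTED,      /* an activation's processor was taken away */
    UPCALL_BLOCKED,        /* an activation blocked in the kernel (e.g., I/O) */
    UPCALL_UNBLOCKED       /* a previously blocked activation can continue */
} upcall_event;

/* A trivial user-level ready queue of thread ids (LIFO for brevity). */
#define MAX_THREADS 16
static int ready[MAX_THREADS];
static int nready = 0;

static void make_ready(int tid) { ready[nready++] = tid; }

/* Pop the next runnable user-level thread, or -1 if none. */
static int next_thread(void) {
    return nready > 0 ? ready[--nready] : -1;
}

/* The user-level scheduler's upcall handler: on each kernel event it
 * chooses which user-level thread to run on the processor that
 * delivered the upcall. `tid` is the thread affected by the event. */
static int handle_upcall(upcall_event ev, int tid) {
    switch (ev) {
    case UPCALL_PREEMPTED:
    case UPCALL_UNBLOCKED:
        make_ready(tid);       /* that thread is runnable again */
        return next_thread();
    case UPCALL_ADD_PROCESSOR:
    case UPCALL_BLOCKED:
        return next_thread();  /* pick other work for this processor */
    }
    return -1;
}
```

The point of the structure is that scheduling policy lives entirely in the application: the kernel only reports events, and the user level keeps its ready queue consistent and picks what runs next.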
Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism. In SOSP '91: Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, October 1991.