Abstract
Shared-memory multiprocessors are frequently used as compute servers with multiple parallel applications executing at the same time. In such environments, the efficiency of a parallel application can be significantly affected by the operating system scheduling policy. In this paper, we use detailed simulation studies to evaluate the performance of several different scheduling strategies, These include regular priority scheduling, coscheduling or gang scheduling, process control with processor partitioning, handoff scheduling, and affinity-based scheduling. We also explore tradeoffs between the use of busy-waiting and blocking synchronization primitives and their interactions with the scheduling strategies. Since effective use of caches is essential to achieving high performance, a key focus is on the impact of the scheduling strategies on the caching behavior of the applications.Our results show that in situations where the number of processes exceeds the number of processors, regular priority-based scheduling in conjunction with busy-waiting synchronization primitives results in extremely poor processor utilization. In such situations, use of blocking synchronization primitives can significantly improve performance. Process control and gang scheduling strategies are shown to offer the highest performance, and their performance is relatively independent of the synchronization method used. However, for applications that have sizable working sets that fit into the cache, process control performs better than gang scheduling. For the applications considered, the performance gains due to handoff scheduling and processor affinity are shown to be small.
- 1 james Archibald and Jean-Loup Baer. Cache coherence protocols: Evaluation using a multiprocessor simulation model. ACM Transactions on Computer Systems, 4(4):273- 298, November 1986. Google ScholarDigital Library
- 2 Forest Baskett, Tom Jermoluk, and Doug Solomon. The 4D-MP graphics superworkstation: Computing + graphics = 40 MIPS + 40 MFLOPS and i00,000 lighted polygons per second, in Proceedings of COMPCON '88, pages 468- 471, 1988.Google Scholar
- 3 David L. Black. Scheduling support for concurrency and parallelism in the Mach operating system. IEEE Computer, 23(5):35-43, May 1990. Google ScholarDigital Library
- 4 F. J. Carrasco. A parallel maxfiow implementation. CS411 Project Report, Stanford University, March 1988.Google Scholar
- 5 Rohit Chandra, Anoop Gupta, and John Hennessy. COOL: A Language for Parallel Programming. Research Monographs in Parallel and Distributed Computing. The MIT Press, 1990.Google Scholar
- 6 Helen Davis, Stephen Goldschrnidt, and John Hennessy. Tango: A multiprocessor simulation and tracing system. Technical Report CSL-TR-90-439, Stanford University, 1990. Google ScholarDigital Library
- 7 Edsger W. Dijkstra. The structure of the "THE"- multiprogramming system. Communications of the ACM, 11 (5):341-346, May 1968. Google ScholarDigital Library
- 8 Thomas W. Doeppner, Jr. Threads: A system for the support of concurrent programming. Technical Report CS-87- 11, Brown University, 1987.Google Scholar
- 9 Jan Edler, Jim Lipkis, and Edith Schonberg. Process management for highly parallel UNIX systems. Ultracomputer Note 136, New York University, 1988.Google Scholar
- 10 Dror G. Feitelson and Larry Rudolph. Distributed hierarchical control for parallel processing. 1EEE Computer, 23(5):65-77, May 1990. Google ScholarDigital Library
- 11 Andrew V. Goldberg and Robert E. Tarjan. A new approach to the maximum flow problem. In Proceedings of the 18th ACM Symposium on Theory of Computing, pages 136-146, 1986. Google ScholarDigital Library
- 12 Dan Lenoski et al. Design of scalable shared-memory multiprocessors: The DASH approach. In Proceedings of COMPCON '90, pages 62-67, 1990.Google Scholar
- 13 Scott T. Leutenegger and Mary K. Vemon. The performance of multiprogrammed multiprocessor scheduling policies. In Proceedings of SIGMETRICS '90, pages 226- 236, 1990. Google ScholarDigital Library
- 14 Ewing Lusk et al. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc., 1987. Google ScholarDigital Library
- 15 Shikharesh Majumdar, Derek L. Eager, and Richard B. Bunt. Scheduling in multiprogrammed parallel systems. In Proceedings of SIGMETRICS ' 88, pages 104-113, 1988. Google ScholarDigital Library
- 16 Jeffrey D. McDonald and Donald Baganoff. Vectorization of a particle simulation method for hypersonic rarifled flow. In A/AA Thermodynamics, Plasmadynamics and Lasers Conference, June 1988.Google Scholar
- 17 John K. Ousterhout. Scheduling techniques for concurrent systems. In 3rd International Conference on Distributed Computing Systems, pages 22-30, 1982.Google Scholar
- 18 Edward Rothberg and Anoop Gupta. Techniques for improving the performance of sparse matrix factorization on multiprocessor workstations. In Proceedings of Supercomputing "90, November 1990. Google ScholarDigital Library
- 19 Kenneth C. Sevcik. Characterizations of parallelism in applications and their use in scheduling. In Proceedings of SIGMETRICS '89, pages 171-180, 1989. Google ScholarDigital Library
- 20 Mark S. Squillante and Edward D. Lazowska. Using processor-cache affinity in shared-memory multiprocessor scheduling. Technical Report 89-06-01, Department of Computer Science, University of Washington, June 1989.Google Scholar
- 21 Charles P. Thacker, Lawrence C. Stewart, and Edwin H. Satterthwaite, Jr. Firefly: A multiprocessor workstation. 1EEE Transactions on Computers, 37(8):909-920, August 1988. Google ScholarDigital Library
- 22 Dominique Thiebaut and Harold S. Stone. Footprints in the cache. ACM Transactions on Computer Systems, 5(4):305- 329, November I987. Google ScholarDigital Library
- 23 Andrew Tucker and Anoop Gupta. Process control and scheduling issues for multiprogrammed shared-memory multiprocessors. In Proceedings of the 12th ACM Symposium on Operating Systems Principles, pages 159-166, 1989. Google ScholarDigital Library
- 24 john Zahorjan, Edward D. Lazowska, and Derek L. Eager. Spinning versus blocking in parallel systems with uncertainty. In Proceedings of the International Seminar on Performance of Distributed and Parallel Systems, pages 455-472, December 1988.Google Scholar
- 25 John Zahorjan, Edward D. Lazowska, and Derek L. Eager. The effect of scheduling discipline on spin overhead in shared memory parallel systems. Technical Report 89-07- 03, Department of Computer Science, University of Washington, July 1989.Google Scholar
- 26 John Zahorjan and Cathy McCann. Processor scheduling m shared memory multiprocessors. In Proceedings of S1G- METRICS ' 90, pages 214-225, 1990. Google ScholarDigital Library
Index Terms
- The impact of operating system scheduling policies and synchronization methods of performance of parallel applications
Recommendations
The impact of operating system scheduling policies and synchronization methods of performance of parallel applications
SIGMETRICS '91: Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systemsShared-memory multiprocessors are frequently used as compute servers with multiple parallel applications executing at the same time. In such environments, the efficiency of a parallel application can be significantly affected by the operating system ...
Combining performance and priority for scheduling resizable parallel applications
We illustrate and evaluate the potential impact of dynamic resizability on parallel job scheduling. Our ReSHAPE framework includes a job scheduler that supports dynamic resizing of malleable parallel applications. We propose and evaluate new scheduling ...
Comments