ABSTRACT
Two important emerging trends are influencing the design, implementation and deployment of high performance parallel systems. The first is on the architectural end, where both economic and technological factors are compelling the use of off-the-shelf computing elements (workstations/PCs and networks) to put together high performance systems called clusters. The second is from the user community, which is finding that an increasing number of applications benefit from such high performance systems. Apart from the scientific applications that have traditionally needed supercomputing power, a large number of graphics, visualization, database, web service and e-commerce applications have started using clusters because of their high processing and storage requirements. These applications have diverse characteristics and can place different Quality-of-Service (QoS) requirements on the underlying system (low response time, high throughput, high I/O demands, guaranteed response/throughput, etc.). Further, clusters running such applications need to cater to a potentially large number of users (or other applications) in a time-shared manner. The underlying system needs to accommodate the requirements of each application, while ensuring that they do not interfere with each other.
This paper focuses on the CPU resources of a cluster and investigates scheduling mechanisms to meet the responsiveness, throughput and guaranteed service requirements of different applications. Specifically, we propose and evaluate three different scheduling mechanisms. These mechanisms have been drawn from traditional solutions on parallel systems (gang scheduling and dynamic coscheduling), and have been extended to accommodate the new criteria under consideration. These mechanisms have been investigated using detailed simulation and workload models to show their pros and cons for different performance metrics.