| Kernel-level single system image for petascale computing |
| Source
|
ACM SIGOPS Operating Systems Review
Volume 40 , Issue 2 (April 2006)
COLUMN: Operating and runtime systems for high-end computing systems
Pages: 50 - 54
Year of Publication: 2006
ISSN:0163-5980
|
|
Authors
|
|
Hong Ong
|
Oak Ridge National Laboratory, Oak Ridge, TN
|
|
Jeffrey Vetter
|
Oak Ridge National Laboratory, Oak Ridge, TN
|
|
R. Scott Studham
|
Oak Ridge National Laboratory, Oak Ridge, TN
|
|
Collin McCurdy
|
Oak Ridge National Laboratory, Oak Ridge, TN
|
|
Bruce Walker
|
Hewlett-Packard
|
|
Alan Cox
|
Rice University, Houston, Texas
|
|
ABSTRACT
Scientific computing users typically prefer UNIX or UNIX-like operating systems as their runtime for managing software and hardware resources. These UNIX-like systems were originally designed for a single processor and for a broad range of programming and usage models. Although they have been successfully modified to work in SMP or NUMA configurations, their internal structures have remained largely unchanged over the years. As we move toward the era of petascale computing, these UNIX-like systems are no longer suitable. For instance, the relative cost of supporting generic usage and system services will increase by an order of magnitude and thus affect overall system performance; there are insufficient system services to globally manage parallelism, processes, and resources; and users may see the petascale system not as a single powerful machine but as a set of independent servers. A single system image (SSI) operating system is essential for efficiently managing parallelism, resources, and processes, as well as for providing parallel processing transparency on a system possibly equipped with hundreds of thousands of processors. However, the success of a petascale SSI operating system goes beyond technical challenges. In particular, it must look very much like normal UNIX, run unmodified software, scale incrementally, and be equipped with built-in high-availability support. This position paper focuses on these issues and discusses the development of a petascale SSI based on an existing kernel-level SSI system, OpenSSI.