
Project Kittyhawk: building a global-scale computer: Blue Gene/P as a generic computing platform

Published: 01 January 2008

Abstract

This paper describes Project Kittyhawk, an undertaking at IBM Research to explore the construction of a next-generation platform capable of hosting many simultaneous web-scale workloads. We hypothesize that, for a large class of web-scale workloads, the Blue Gene/P platform is an order of magnitude more efficient to purchase and operate than the commodity clusters in use today. Driven by scientific computing demands, the Blue Gene designers pursued an aggressive system-on-a-chip methodology that led to a scalable platform composed of air-cooled racks. Each rack contains more than a thousand independent computers with high-speed interconnects inside and between racks.
We postulate that the same demands of efficiency and density apply to web-scale platforms. This project aims to develop the system software to enable Blue Gene/P as a generic platform capable of being used by heterogeneous workloads. We describe our firmware and operating system work to provide Blue Gene/P with generic system software, one of the results of which is the ability to run thousands of heterogeneous Linux instances connected by TCP/IP networks over the high-speed internal interconnects.

Published In

ACM SIGOPS Operating Systems Review, Volume 42, Issue 1
January 2008
133 pages
ISSN:0163-5980
DOI:10.1145/1341312

Publisher

Association for Computing Machinery

New York, NY, United States

Cited By

  • (2016) HIL. Proceedings of the Seventh ACM Symposium on Cloud Computing, 155-168. DOI: 10.1145/2987550.2987588. Online publication date: 5-Oct-2016
  • (2015) Achieving Performance Isolation with Lightweight Co-Kernels. Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 149-160. DOI: 10.1145/2749246.2749273. Online publication date: 15-Jun-2015
  • (2013) CamCubeOS. Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, 73-84. DOI: 10.1145/2493123.2462917. Online publication date: 17-Jun-2013
  • (2013) CamCubeOS. Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, 73-84. DOI: 10.1145/2462902.2462917. Online publication date: 17-Jun-2013
  • (2011) Minimal-overhead virtualization of a large scale supercomputer. ACM SIGPLAN Notices 46:7, 169-180. DOI: 10.1145/2007477.1952705. Online publication date: 9-Mar-2011
  • (2011) Minimal-overhead virtualization of a large scale supercomputer. Proceedings of the 7th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, 169-180. DOI: 10.1145/1952682.1952705. Online publication date: 9-Mar-2011
  • (2011) An approach for virtual appliance distribution for service deployment. Future Generation Computer Systems 27:3, 280-289. DOI: 10.1016/j.future.2010.09.009. Online publication date: 1-Mar-2011
  • (2011) Research on the Architecture of Cloud Computing. Applied Informatics and Communication, 519-526. DOI: 10.1007/978-3-642-23235-0_66. Online publication date: 2011
  • (2010) Understanding the Cloud Computing Landscape. Cloud Computing and Software Services, 1-16. DOI: 10.1201/EBK1439803158-c1. Online publication date: 13-Jul-2010
  • (2010) Providing a cloud network infrastructure on a supercomputer. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 385-394. DOI: 10.1145/1851476.1851534. Online publication date: 21-Jun-2010
