Jupiter rising: a decade of Clos topologies and centralized control in Google's datacenter network

Published: 24 August 2016

Abstract

We present our approach for overcoming the cost, operational complexity, and limited scale endemic to datacenter networks a decade ago. Three themes unify the five generations of datacenter networks detailed in this paper. First, multi-stage Clos topologies built from commodity switch silicon can support cost-effective deployment of building-scale networks. Second, many of the general but complex decentralized network routing and management protocols that support arbitrary deployment scenarios were overkill for single-operator, pre-planned datacenter networks. We instead built a centralized control mechanism based on a global configuration pushed to all datacenter switches. Third, modular hardware design coupled with simple, robust software allowed our design to also support inter-cluster and wide-area networks. Our datacenter networks run at dozens of sites across the planet, scaling in capacity by 100x over 10 years to more than 1 Pbps of bisection bandwidth. A more detailed version of this paper is available at Ref. 21.

References

[1] Al-Fares, M., Loukissas, A., Vahdat, A. A scalable, commodity data center network architecture. In ACM SIGCOMM Computer Communication Review. Volume 38 (2008), ACM, 63--74.
[2] Alizadeh, M., Greenberg, A., Maltz, D.A., Padhye, J., Patel, P., Prabhakar, B., Sengupta, S., Sridharan, M. Data center TCP (DCTCP). ACM SIGCOMM Comput. Commun. Rev. 41, 4 (2011), 63--74.
[3] Barroso, L.A., Dean, J., Hölzle, U. Web search for a planet: The Google cluster architecture. Micro. IEEE 23, 2 (2003), 22--28.
[4] Barroso, L.A., Hölzle, U. The datacenter as a computer: An introduction to the design of warehouse-scale machines. Syn. Lect. Comput. Architect. 4, 1 (2009), 1--108.
[5] Calder, B., Wang, J., Ogus, A., Nilakantan, N., Skjolsvold, A., McKelvie, S., Xu, Y., Srivastav, S., Wu, J., Simitci, H., et al. Windows Azure storage: A highly available cloud storage service with strong consistency. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (2011), ACM, 143--157.
[6] Chambers, C., Raniwala, A., Perry, F., Adams, S., Henry, R.R., Bradshaw, R., Weizenbaum, N. FlumeJava: Easy, efficient data-parallel pipelines. In ACM Sigplan Notices. Volume 45 (2010), ACM, 363--375.
[7] Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E. Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst. 26, 2 (2008), 4.
[8] Chen, Y., Griffith, R., Liu, J., Katz, R.H., Joseph, A.D. Understanding TCP incast throughput collapse in datacenter networks. In Proceedings of the 1st ACM Workshop on Research on Enterprise Networking (2009), ACM, 73--82.
[9] Clos, C. A study of non-blocking switching networks. Bell Syst. Tech. J. 32, 2 (1953), 406--424.
[10] Dean, J., Ghemawat, S. MapReduce: Simplified data processing on large clusters. Commun. ACM 51, 1 (2008), 107--113.
[11] Farrington, N., Rubow, E., Vahdat, A. Data center switch architecture in the age of merchant silicon. In Proceedings of the 17th IEEE Symposium on HOT Interconnects, 2009 (2009), 93--102.
[12] Feamster, N., Rexford, J., Zegura, E. The road to SDN: An intellectual history of programmable networks. ACM Queue 11, 12 (December 2013), 87--98.
[13] Ghemawat, S., Gobioff, H., Leung, S.-T. The Google file system. In ACM SIGOPS Operating Systems Review. Volume 37 (2003), ACM, 29--43.
[14] Google Cloud Platform. https://cloud.google.com.
[15] Greenberg, A., Hamilton, J.R., Jain, N., Kandula, S., Kim, C., Lahiri, P., Maltz, D.A., Patel, P., Sengupta, S. VL2: A scalable and flexible data center network. In Proceedings of the ACM SIGCOMM Computer Communication Review (2009), 51--62.
[16] Isard, M., Budiu, M., Yu, Y., Birrell, A., Fetterly, D. Dryad: Distributed data-parallel programs from sequential building blocks. In Proceedings of the ACM SIGOPS Operating Systems Review (2007), 59--72.
[17] Jain, S., Kumar, A., Mandal, S., Ong, J., Poutievski, L., Singh, A., Venkata, S., Wanderer, J., Zhou, J., Zhu, M., Zolla, J., Hölzle, U., Stuart, S., Vahdat, A. B4: Experience with a globally-deployed software defined WAN. In Proceedings of the ACM SIGCOMM (2013), 3--14.
[18] McKeown, N., Anderson, T., Balakrishnan, H., Parulkar, G., Peterson, L., Rexford, J., Shenker, S., Turner, J. OpenFlow: Enabling innovation in campus networks. ACM SIGCOMM Comput. Commun. Rev. 38, 2 (2008), 69--74.
[19] Moy, J. OSPF version 2. STD 54, RFC Editor, April 1998. http://www.rfc-editor.org/rfc/rfc2328.txt.
[20] Prakash, P., Dixit, A.A., Hu, Y.C., Kompella, R.R. The TCP outcast problem: Exposing unfairness in data center networks. In Proceedings of the NSDI (2012), 413--426.
[21] Singh, A., Ong, J., Agarwal, A., Anderson, G., Armistead, A., Bannon, R., Boving, S., Desai, G., Felderman, B., Germano, P., Kanagala, A., Provost, J., Simmons, J., Tanda, E., Wanderer, J., Hölzle, U., Stuart, S., Vahdat, A. Jupiter rising: A decade of Clos topologies and centralized control in Google's datacenter network. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication (2015), ACM, 183--197.
[22] Thorup, M. OSPF areas considered harmful. IETF Internet Draft 00, individual, April 2003. http://tools.ietf.org/html/draft-thorup-ospf-harmful-00.
[23] Vahdat, A., Al-Fares, M., Farrington, N., Mysore, R.N., Porter, G., Radhakrishnan, S. Scale-out networking in the data center. IEEE MICRO 30, 4 (August 2010), 29--41.
[24] Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E., Wilkes, J. Large-scale cluster management at Google with Borg. In Proceedings of the Tenth European Conference on Computer Systems (2015), ACM, 18.

Published In

Communications of the ACM, Volume 59, Issue 9
September 2016
91 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/2991470
Editor: Moshe Y. Vardi
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

