skip to main content
10.1145/1250662.1250679acmconferencesArticle/Chapter ViewAbstractPublication PagesiscaConference Proceedingsconference-collections
Article

Flattened butterfly: a cost-efficient topology for high-radix networks

Published:09 June 2007Publication History

ABSTRACT

Increasing integrated-circuit pin bandwidth has motivateda corresponding increase in the degree or radix of interconnection networksand their routers. This paper introduces the flattened butterfly, a cost-efficient topology for high-radix networks. On benign (load-balanced) traffic, the flattened butterfly approaches the cost/performance of a butterfly network and has roughly half the cost of a comparable performance Clos network.The advantage over the Clos is achieved by eliminating redundant hopswhen they are not needed for load balance. On adversarial traffic, the flattened butterfly matches the cost/performance of a folded-Clos network and provides an order of magnitude better performance than a conventional butterfly.In this case, global adaptive routing is used to switchthe flattened butterfly from minimal to non-minimal routing - usingredundant hops only when they are needed. Minimal and non-minimal, oblivious and adaptive routing algorithms are evaluated on the flattened butterfly.We show that load-balancing adversarial traffic requires non-minimalglobally-adaptive routing and show that sequential allocators are required to avoid transient load imbalance when using adaptive routing algorithms.We also compare the cost of the flattened butterfly to folded-Clos, hypercube,and butterfly networks with identical capacityand show that the flattened butterfly is more cost-efficient thanfolded-Clos and hypercube topologies.

References

  1. Amphenol. http://www.amphenol.com/.Google ScholarGoogle Scholar
  2. L. N. Bhuyan and D. P. Agrawal. Generalized hypercube and hyperbus structures for a computer network. IEEE Trans. Computers, 33(4):323--333, 1984.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. K.-Y. K. Chang et al. A 0.4-4-Gb/s CMOS Quad Transceiver Cell Using On-Chip Regulated Dual-Loop PLLs. IEEE Journal of Solid--State Circuits, 38(5):747--754, 2003.Google ScholarGoogle Scholar
  4. C. Clos. A Study of Non-Blocking Switching Networks. The Bell System technical Journal, 32(2):406--424, March 1953.Google ScholarGoogle Scholar
  5. Cray XT3. http://www.cray.com/products/systems/xt3/.Google ScholarGoogle Scholar
  6. W. J. Dally. Performance Analysis of k-ary n-cube Interconnection Networks. IEEE Transactions on Computers, 39(6):775--785, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. W. J. Dally. Virtual-channel Flow Control. IEEE Transactions on Parallel and Distributed Systems, 3(2):194--205, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. W. J. Dally, P. P. Carvey, and L. R. Dennison. The Avici Terabit Switch/Router. In Proc. of Hot Interconnects, pages 41--50, August 1998.Google ScholarGoogle Scholar
  9. W. J. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann, San Francisco, CA, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. H. G. Dietz and T.I.Mattox. Compiler techniques for flat neighborhood networks. In 13th International Workshop on Languages and Compilers for Parallel Computing, pages 420--431, Yorktown Heights, New York, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Gore. http://www.gore.com/electronics.Google ScholarGoogle Scholar
  12. E. J. Kim et al. Energy optimization techniques in cluster interconnects. In Proc. of the 2003 international symposium on Low power electronics and design, pages 459--464, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. J. Kim, W. J. Dally, and D. Abts. Adaptive Routing in High-radix Clos Network. In International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'06), Tampa, FL, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Kim, W. J. Dally, B. Towles, and A. K. Gupta. Microarchitecture of a high-radix router. In Proc. of the International Symposium on Computer Architecture (ISCA), pages 420--431, Madison, WI, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. C. P. Kruskal and M. Snir. The performance of multistage interconnection networks for multiprocessors. IEEE Trans. Computers, 32(12):1091--1098, 1983.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. E. Ladner and M. J. Fischer. Parallel prefix computation. J. ACM, 27(4):831--838, 1980. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Laudon and D. Lenoski. The SGI Origin: A ccNUMA Highly Scalable Server. In Proc. of the 24th Annual Int'l Symp. on Computer Architecture, pages 241--251, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. C. Leiserson. Fat-trees: Universal networks for hardware efficient supercomputing. IEEE Transactions on Computer, C34(10):892--901, October 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Microprocessor Report. http://www.mdronline.com/.Google ScholarGoogle Scholar
  20. C. S. Patel, S. M. Chai, S. Yalamanchili, and D. E. Schimmel. Power constrained design of multiprocessor interconnection networks. In International Conference on Computer Design (ICCD), pages 408--416, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. G. Pautsch. Thermal Challenges in the Next Generation of Supercomputers. CoolCon, 2005.Google ScholarGoogle Scholar
  22. G. Pfister. An Introduction to the InfiniBand Arechitecture (http://www.infinibandta.org). IEEE Press, 2001.Google ScholarGoogle Scholar
  23. S. Scott, D. Abts, J. Kim, and W. J. Dally. The BlackWidow High-radix Clos Network. In Proc. of the International Symposium on Computer Architecture (ISCA), Boston, MA, June 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. S. Scott and G. Thorson. The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus. In Hot Chips 4, Stanford, CA, Aug. 1996.Google ScholarGoogle Scholar
  25. L. Shang, L.-S. Peh, and N. K. Jha. Dynamic voltage scaling with links for power optimization of interconnection networks. In International Symposium on High-Performance Computer Architecture (HPCA), page 91, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. H. J. Siegel. A model of simd machines and a comparison of various interconnection networks. IEEE Trans. Computers, 28(12):907--917, 1979.Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. A. Singh. Load-Balanced Routing in Interconnection Networks. PhD thesis, Stanford University, 2005.Google ScholarGoogle Scholar
  28. A. Singh, W. J. Dally, A. K. Gupta, and B. Towles. GOAL: A load-balanced adaptive routing algorithm for torus networks. In Proc. of the International Symposium on Computer Architecture (ISCA), pages 194--205, San Diego, CA, June 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. V. Soteriou and L.-S. Peh. Design-space exploration of power-aware on/off interconnection networks. In International Conference on Computer Design (ICCD), pages 510--517, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. L. G. Valiant. A scheme for fast parallel communication. SIAM Journal on Computing, 11(2):350--361, 1982.Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. H. Wang, L. S. Peh, and S. Malik. Power-driven Design of Router Microarchitectures in On-chip Networks. In Proc. of the 36th Annual IEEE/ACM Int'l Symposium on Microarchitecture, pages 105--116, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. K.-L. J. Wong, H. Hatamkhani, M. Mansuri, and C.-K. K. Yang. A 27-mW 3.6-Gb/s I/O Transceiver. IEEE Journal ofGoogle ScholarGoogle Scholar
  33. S. Young and S. Yalamanchili. Adaptive routing in generalized hypercube architectures. In Proc. of the IEEE Symposium on Parallel and Distributed Processing, pages 564--571, Dallas, TX, Dec. 1991.Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Flattened butterfly: a cost-efficient topology for high-radix networks

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        ISCA '07: Proceedings of the 34th annual international symposium on Computer architecture
        June 2007
        542 pages
        ISBN:9781595937063
        DOI:10.1145/1250662
        • General Chair:
        • Dean Tullsen,
        • Program Chair:
        • Brad Calder
        • cover image ACM SIGARCH Computer Architecture News
          ACM SIGARCH Computer Architecture News  Volume 35, Issue 2
          May 2007
          527 pages
          ISSN:0163-5964
          DOI:10.1145/1273440
          Issue’s Table of Contents

        Copyright © 2007 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 9 June 2007

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate543of3,203submissions,17%

        Upcoming Conference

        ISCA '24

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader