DOI: 10.1145/2038916.2038922
research-article

ALIAS: scalable, decentralized label assignment for data centers

Published: 26 October 2011

ABSTRACT

Modern data centers can consist of hundreds of thousands of servers and millions of virtualized end hosts. Managing address assignment while simultaneously enabling scalable communication is a challenge in such an environment. We present ALIAS, an addressing and communication protocol that automates topology discovery and address assignment for the hierarchical topologies that underlie many data center network fabrics. Addresses assigned by ALIAS interoperate with a variety of scalable communication techniques. ALIAS is fully decentralized, scales to large network sizes, and dynamically recovers from arbitrary failures, without requiring modifications to hosts or to commodity switch hardware. We demonstrate through simulation that ALIAS quickly and correctly configures networks that support up to hundreds of thousands of hosts, even in the face of failures and erroneous cabling, and we show that ALIAS is a practical solution for auto-configuration with our NetFPGA testbed implementation.


        • Published in

          SOCC '11: Proceedings of the 2nd ACM Symposium on Cloud Computing
          October 2011
          377 pages
          ISBN: 9781450309769
          DOI: 10.1145/2038916

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 169 of 722 submissions, 23%
