ABSTRACT
Modern data centers can consist of hundreds of thousands of servers and millions of virtualized end hosts. Managing address assignment while simultaneously enabling scalable communication is a challenge in such an environment. We present ALIAS, an addressing and communication protocol that automates topology discovery and address assignment for the hierarchical topologies that underlie many data center network fabrics. Addresses assigned by ALIAS interoperate with a variety of scalable communication techniques. ALIAS is fully decentralized, scales to large network sizes, and dynamically recovers from arbitrary failures, without requiring modifications to hosts or to commodity switch hardware. We demonstrate through simulation that ALIAS quickly and correctly configures networks that support up to hundreds of thousands of hosts, even in the face of failures and erroneous cabling, and we show through our NetFPGA testbed implementation that ALIAS is a practical solution for auto-configuration.
Index Terms
- ALIAS: scalable, decentralized label assignment for data centers