research-article

Got loss? Get zOVN!

Published: 27 August 2013

Abstract

Datacenter networking is currently dominated by two major trends. One aims toward lossless, flat layer-2 fabrics based on Converged Enhanced Ethernet or InfiniBand, with benefits in efficiency and performance. The other targets flexibility based on Software Defined Networking, which enables Overlay Virtual Networking. Although clearly complementary, these trends also exhibit some conflicts: in contrast to physical fabrics, which avoid packet drops by means of flow control, practically all current virtual networks are lossy. We quantify these losses for several common combinations of hypervisors and virtual switches, and show their detrimental effect on application performance. Moreover, we propose a zero-loss Overlay Virtual Network (zOVN) designed to reduce the query and flow completion time of latency-sensitive datacenter applications. We describe its architecture and detail the design of its key component, the zVALE lossless virtual switch. As a proof of concept, we implemented a zOVN prototype and benchmarked it with Partition-Aggregate in two testbeds, achieving up to a 15-fold reduction in mean completion time with three widespread TCP versions. For larger-scale validation and deeper introspection into zOVN, we developed an OMNeT++ model for accurate cross-layer simulations of a virtualized datacenter, which confirm the validity of our results.
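
The contrast the abstract draws between lossy queues and flow-controlled (lossless) queues can be illustrated with a toy model. The sketch below is not the zVALE design or any code from the paper; the function `simulate` and its parameters (`arrivals`, `capacity`, `service_per_tick`, `lossless`) are hypothetical names chosen for illustration. It models a single bounded queue that either drops packets when full or makes the sender hold them until space frees up.

```python
from collections import deque


def simulate(arrivals, capacity, service_per_tick, lossless):
    """Toy single-queue model of one switch port (illustrative only).

    arrivals: packets offered by the sender in each tick.
    capacity: queue size in packets (must be > 0).
    service_per_tick: packets drained per tick (must be > 0).
    lossless: if True, the sender holds packets while the queue is full
              (flow control); if False, overflowing packets are dropped.
    Returns (delivered, dropped).
    """
    if capacity <= 0 or service_per_tick <= 0:
        raise ValueError("capacity and service_per_tick must be positive")
    queue = deque()
    held = 0  # packets paused at the sender (lossless mode only)
    delivered = dropped = 0

    def step(new_packets):
        nonlocal held, delivered, dropped
        offered = held + new_packets
        held = 0
        # Enqueue what fits; the rest is either held back or dropped.
        for _ in range(offered):
            if len(queue) < capacity:
                queue.append(1)
            elif lossless:
                held += 1
            else:
                dropped += 1
        # Drain up to service_per_tick packets from the queue.
        for _ in range(min(service_per_tick, len(queue))):
            queue.popleft()
            delivered += 1

    for burst in arrivals:
        step(burst)
    # Keep ticking until the queue and any held packets are drained.
    while queue or held:
        step(0)
    return delivered, dropped


burst = [8, 0, 0, 0]
print("lossy   :", simulate(burst, capacity=4, service_per_tick=2, lossless=False))
print("lossless:", simulate(burst, capacity=4, service_per_tick=2, lossless=True))
```

In this toy setting, a burst larger than the queue loses packets in the lossy mode, whereas the lossless mode delivers every packet at the cost of pausing the sender. A real transport would have to retransmit the dropped packets, which is one way losses can lengthen flow completion time.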

    Published In

    ACM SIGCOMM Computer Communication Review, Volume 43, Issue 4
    October 2013
    595 pages
    ISSN: 0146-4833
    DOI: 10.1145/2534169

    SIGCOMM '13: Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
    August 2013
    580 pages
    ISBN: 9781450320566
    DOI: 10.1145/2486001
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. datacenter networking
    2. lossless
    3. overlay networks
    4. partition-aggregate
    5. virtualization

    Qualifiers

    • Research-article

    Cited By

    • (2017) Low Latency Software Rate Limiters for Cloud Networks. Proceedings of the First Asia-Pacific Workshop on Networking, pp. 78-84. DOI: 10.1145/3106989.3107005. Online publication date: 3-Aug-2017
    • (2017) Congestion control in Converged Ethernet with heterogeneous and time-varying delays. 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), pp. 1-10. DOI: 10.1109/IWQoS.2017.7969115. Online publication date: Jun-2017
    • (2015) A survey on data center networking for cloud computing. Computer Networks: The International Journal of Computer and Telecommunications Networking, 91:C, pp. 528-547. DOI: 10.1016/j.comnet.2015.08.040. Online publication date: 14-Nov-2015
    • (2016) Empirical Evidences in Software-Defined Network Security: A Systematic Literature Review. Information Fusion for Cyber-Security Analytics, pp. 253-295. DOI: 10.1007/978-3-319-44257-0_11. Online publication date: 22-Oct-2016
    • (2015) Schemes for Fast Transmission of Flows in Data Center Networks. IEEE Communications Surveys & Tutorials, 17:3, pp. 1391-1422. DOI: 10.1109/COMST.2015.2427199. Online publication date: Nov-2016
    • (2014) Guaranteeing end-to-end quality-of-service with a generic routing approach. ACM SIGAPP Applied Computing Review, 14:2, pp. 8-22. DOI: 10.1145/2656864.2656865. Online publication date: 1-Jun-2014
    • (2014) Practical DCB for improved data center networks. IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, pp. 1824-1832. DOI: 10.1109/INFOCOM.2014.6848121. Online publication date: Apr-2014
    • (2014) zFabric: How to virtualize lossless ethernet? 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 75-83. DOI: 10.1109/CLUSTER.2014.6968740. Online publication date: Sep-2014
    • (2014) Rethinking the Data Center Networking: Architecture, Network Protocols, and Resource Sharing. IEEE Access, 2, pp. 1481-1496. DOI: 10.1109/ACCESS.2014.2383439. Online publication date: 2014
    • (2013) A Measurement Study of Data-Intensive Network Traffic Patterns in a Private Cloud. Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, pp. 476-481. DOI: 10.1109/UCC.2013.93. Online publication date: 9-Dec-2013
