DOI: 10.1145/3098822.3098840
research-article

Credit-Scheduled Delay-Bounded Congestion Control for Datacenters

Published: 07 August 2017

ABSTRACT

Small RTTs (~tens of microseconds), bursty flow arrivals, and a large number of concurrent flows (thousands) in datacenters pose fundamental challenges to congestion control, because they either force a flow to send at most one packet per RTT or induce a large queue build-up. The widespread use of shallow-buffered switches makes the problem harder still when hosts generate many flows in bursts. In addition, as link speeds increase, algorithms that gradually probe for bandwidth take a long time to reach the fair share. An ideal datacenter congestion control must provide 1) zero data loss, 2) fast convergence, 3) low buffer occupancy, and 4) high utilization. However, these requirements conflict with one another.
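The one-packet-per-RTT limit can be made concrete with a rough bandwidth-delay calculation. The sketch below is illustrative only: the 10 Gbps link speed, the "tens of microseconds" RTT, and the "thousands" of flows come from the abstract, while the specific values of 20 µs and 1,000 flows, the 1,500-byte packet size, and all function names are assumptions chosen for the example.

```python
def bdp_packets(link_bps, rtt_s, packet_bytes):
    """Packets that fit in flight on the link during one RTT."""
    return link_bps * rtt_s / (8 * packet_bytes)


def fair_share_packets_per_rtt(link_bps, rtt_s, packet_bytes, flows):
    """Per-flow share of the pipe, in packets per RTT."""
    return bdp_packets(link_bps, rtt_s, packet_bytes) / flows


def standing_queue_packets(link_bps, rtt_s, packet_bytes, flows, min_window=1):
    """Excess packets when every flow keeps at least min_window in flight."""
    in_flight = flows * min_window
    return max(0.0, in_flight - bdp_packets(link_bps, rtt_s, packet_bytes))


if __name__ == "__main__":
    # Assumed example values: 10 Gbps, 20 us RTT, 1500-byte packets, 1000 flows.
    args = (10e9, 20e-6, 1500)
    print("pipe size (packets):", bdp_packets(*args))
    print("fair share per flow:", fair_share_packets_per_rtt(*args, flows=1000))
    print("queue if each flow sends 1 pkt/RTT:",
          standing_queue_packets(*args, flows=1000))
```

Under these assumed numbers the pipe holds fewer than 17 packets, so each flow's fair share is a small fraction of a packet per RTT. A window-based sender that cannot go below one packet per RTT therefore leaves most of the in-flight packets sitting in switch buffers.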

This paper presents a radical new approach, called ExpressPass: an end-to-end, credit-scheduled, delay-bounded congestion control for datacenters. ExpressPass uses credit packets to control congestion even before data packets are sent, which enables it to achieve bounded delay and fast convergence. It gracefully handles bursty flow arrivals. We implement ExpressPass using commodity switches and evaluate it using testbed experiments and simulations. ExpressPass converges up to 80 times faster than DCTCP on 10 Gbps links, and the gap widens as link speeds increase. It greatly improves performance under heavy incast workloads and significantly reduces flow completion times, especially for small and medium-sized flows, compared to RCP, DCTCP, HULL, and DX under realistic workloads.
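A toy model may help show why scheduling credits ahead of data can bound queueing. It is a sketch under stated assumptions, not the ExpressPass implementation: time is divided into slots of one data-packet transmission each, the bottleneck is assumed to forward at most one credit per slot and drop the rest, and a sender transmits exactly one data packet per credit it receives. The function names and the random choice of which credit survives are hypothetical.

```python
import random


def simulate_credit_incast(num_senders, slots, seed=0):
    """Credit-gated incast: returns (max data queue, packets per sender)."""
    rng = random.Random(seed)
    delivered = [0] * num_senders
    queue = 0
    max_queue = 0
    for _ in range(slots):
        # Assumption: every flow is offered a credit each slot, but the
        # bottleneck lets only one credit through and drops the others.
        winner = rng.randrange(num_senders)
        # The credited sender transmits one data packet.
        queue += 1
        max_queue = max(max_queue, queue)
        # The bottleneck drains one data packet per slot.
        queue -= 1
        delivered[winner] += 1
    return max_queue, delivered


def simulate_uncoordinated_incast(num_senders, slots):
    """Baseline: every sender transmits one packet per slot, unthrottled."""
    queue = 0
    max_queue = 0
    for _ in range(slots):
        queue += num_senders
        max_queue = max(max_queue, queue)
        queue -= 1  # same drain rate as the credit-gated case
    return max_queue


if __name__ == "__main__":
    print(simulate_credit_incast(num_senders=8, slots=100))
    print(simulate_uncoordinated_incast(num_senders=8, slots=100))
```

In this model the data queue never holds more than one packet when credits gate transmission, because data arrives no faster than the link drains it, whereas the uncoordinated baseline's queue grows with every slot. What is lost at the bottleneck is credits rather than data, which is the intuition behind pairing zero data loss with low buffer occupancy.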


    • Published in

      SIGCOMM '17: Proceedings of the Conference of the ACM Special Interest Group on Data Communication
      August 2017
      515 pages
      ISBN: 9781450346535
      DOI: 10.1145/3098822

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 554 of 3,547 submissions, 16%
