ABSTRACT
Homa is a new transport protocol for datacenter networks. It provides exceptionally low latency, especially for workloads with a high volume of very short messages, and it also supports large messages and high network utilization. Homa uses in-network priority queues to ensure low latency for short messages; priority allocation is managed dynamically by each receiver and integrated with a receiver-driven flow control mechanism. Homa also uses controlled overcommitment of receiver downlinks to ensure efficient bandwidth utilization at high load. Our implementation of Homa delivers 99th percentile round-trip times of less than 15 μs for short messages on a 10 Gbps network running at 80% load. These latencies are almost 100x lower than the best published measurements of an implementation. In simulations, Homa's latency is roughly equal to that of pFabric and significantly better than that of pHost, PIAS, and NDP for almost all message sizes and workloads. Homa can also sustain higher network loads than pFabric, pHost, or PIAS.
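As a rough illustration of how receiver-managed priorities and controlled overcommitment could fit together, the sketch below has a receiver pick a small number of inbound messages to admit at once and give each a distinct priority level. It is a minimal sketch under stated assumptions, not Homa's implementation: the names `InboundMessage` and `schedule_grants`, the shortest-remaining-bytes ordering, and the convention that a larger number means higher priority are all hypothetical choices made for this example and are not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class InboundMessage:
    sender: str            # hypothetical identifier for the sending host
    remaining_bytes: int   # bytes of this message not yet received


def schedule_grants(messages, overcommit_degree, num_priorities):
    """Choose which inbound messages the receiver admits, and at what priority.

    Assumptions for illustration only: messages with fewer remaining bytes
    are favored, and larger priority numbers mean higher priority.
    """
    # Order the incomplete messages by remaining bytes, smallest first.
    active = sorted(
        (m for m in messages if m.remaining_bytes > 0),
        key=lambda m: m.remaining_bytes,
    )
    # Controlled overcommitment: admit several senders at once so the
    # downlink stays busy even if one of them does not respond promptly.
    admitted = active[:overcommit_degree]
    top = num_priorities - 1
    # The message closest to completion gets the highest priority level.
    return [(m.sender, max(top - rank, 0)) for rank, m in enumerate(admitted)]


if __name__ == "__main__":
    inbound = [
        InboundMessage("A", 500_000),
        InboundMessage("B", 2_000),
        InboundMessage("C", 40_000),
    ]
    for sender, priority in schedule_grants(inbound, overcommit_degree=2, num_priorities=4):
        print(f"grant to {sender} at priority {priority}")
```

With an overcommitment degree of 2, the receiver admits only the two messages nearest completion and leaves the third waiting, which bounds how much traffic can queue at its downlink while still letting short messages go first.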
Index Terms
- Homa: a receiver-driven low-latency transport protocol using network priorities