ABSTRACT
Limited by small on-chip memory, hardware-based transports typically implement a go-back-N loss recovery mechanism, which costs very little memory but is well known to perform poorly even under small packet loss ratios. We present MELO, an efficient selective retransmission mechanism for hardware-based transport that consumes only a small, constant amount of memory regardless of the number of concurrent connections. Specifically, MELO employs an architectural separation between data and metadata storage and uses a shared bits pool allocation mechanism to reduce the on-chip memory footprint of metadata. By adding on average only 23B of extra on-chip state per connection, MELO achieves up to 14.02x throughput and reduces the 99th-percentile FCT by 3.11x compared with go-back-N under certain loss ratios.
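The shared-bits-pool idea can be illustrated with a short sketch (a hypothetical simplification in Python, not the paper's hardware design): instead of giving every connection its own full-window receive bitmap, connections borrow small fixed-size bitmap chunks from one shared pool only while they are actually recovering from loss, and return them as soon as the hole is filled. The class names, chunk size, and API below are assumptions made for illustration.

```python
class SharedBitsPool:
    """A fixed pool of small bitmap chunks shared by all connections.

    Illustrative only: the chunk size, pool size, and API are
    assumptions for this sketch, not the design from the paper.
    """

    def __init__(self, num_chunks, chunk_bits=64):
        self.chunk_bits = chunk_bits
        self.free = list(range(num_chunks))  # indices of free chunks
        self.bits = [0] * num_chunks         # chunk contents (bitmaps)

    def alloc(self):
        """Borrow one chunk; return its index, or None if the pool is empty."""
        if not self.free:
            return None
        idx = self.free.pop()
        self.bits[idx] = 0
        return idx

    def release(self, idx):
        self.free.append(idx)


class Connection:
    """Per-connection receiver state with a tiny constant footprint.

    A bitmap chunk is borrowed from the shared pool only on the first
    out-of-order arrival and returned as soon as the hole is filled,
    so idle and loss-free connections hold no bitmap at all.
    Bit i of the chunk means: packet `expected + 1 + i` was received.
    """

    def __init__(self, pool):
        self.pool = pool
        self.expected = 0  # next in-order sequence number
        self.chunk = None  # index of the borrowed chunk, if any

    def on_packet(self, seq):
        if seq == self.expected:
            self.expected += 1
            # Slide past packets already recorded out of order.
            while self.chunk is not None and self.pool.bits[self.chunk] & 1:
                self.pool.bits[self.chunk] >>= 1
                self.expected += 1
            # Return the chunk once no out-of-order packets remain.
            if self.chunk is not None and self.pool.bits[self.chunk] == 0:
                self.pool.release(self.chunk)
                self.chunk = None
        elif seq > self.expected:
            if self.chunk is None:
                self.chunk = self.pool.alloc()
            off = seq - self.expected - 1
            if self.chunk is not None and off < self.pool.chunk_bits:
                self.pool.bits[self.chunk] |= 1 << off
            # else: pool exhausted or gap too wide -- a real design
            # would fall back to go-back-N for this connection.
```

For example, delivering packets in the order 0, 2, 3, 1 borrows one chunk when packet 2 arrives out of order and releases it when packet 1 fills the hole, leaving `expected == 4`; meanwhile the sender only needs to retransmit the packets the bitmap reports as missing rather than everything after the loss, which is the advantage of selective retransmission over go-back-N.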
Index Terms
- Memory Efficient Loss Recovery for Hardware-based Transport in Datacenter