Data center TCP (DCTCP)
- 30 August 2010
- proceedings article
- Published by Association for Computing Machinery (ACM)
- Vol. 40 (4), 63-74
- https://doi.org/10.1145/1851182.1851192
Abstract
Cloud data centers host diverse applications, mixing workloads that require small predictable latency with others requiring large sustained throughput. In this environment, today's state-of-the-art TCP protocol falls short. We present measurements of a 6000 server production cluster and reveal impairments that lead to high application latencies, rooted in TCP's demands on the limited buffer space available in data center switches. For example, bandwidth hungry "background" flows build up queues at the switches, and thus impact the performance of latency sensitive "foreground" traffic. To address these problems, we propose DCTCP, a TCP-like protocol for data center networks. DCTCP leverages Explicit Congestion Notification (ECN) in the network to provide multi-bit feedback to the end hosts. We evaluate DCTCP at 1 and 10 Gbps speeds using commodity, shallow buffered switches. We find DCTCP delivers the same or better throughput than TCP, while using 90% less buffer space. Unlike TCP, DCTCP also provides high burst tolerance and low latency for short flows. In handling workloads derived from operational measurements, we found DCTCP enables the applications to handle 10X the current background traffic, without impacting foreground traffic. Further, a 10X increase in foreground traffic does not cause any timeouts, thus largely eliminating incast problems.
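
The abstract says only that DCTCP turns single-bit ECN marks into multi-bit feedback; it does not state the update rule. As a rough illustration, the sketch below follows the sender-side rule as it is commonly described (a moving average of the fraction of marked packets, and a window cut proportional to that average). The function names, the gain `g = 1/16`, and the `min_cwnd` floor are assumptions made for this example, not details taken from the abstract.

```python
def update_alpha(alpha: float, marked: int, total: int, g: float = 1 / 16) -> float:
    """Return the new congestion estimate after one window of ACKs.

    alpha  : running estimate of the fraction of ECN-marked packets (0..1)
    marked : packets in the last window that carried an ECN mark
    total  : packets acknowledged in the last window
    g      : averaging gain (assumed value; treated here as a tunable)
    """
    if total <= 0:
        # No packets acknowledged, so there is nothing to learn this round.
        return alpha
    fraction = marked / total
    # Exponentially weighted moving average of the marked fraction.
    return (1 - g) * alpha + g * fraction


def reduce_cwnd(cwnd: float, alpha: float, min_cwnd: float = 1.0) -> float:
    """Shrink the congestion window in proportion to the congestion estimate.

    With alpha near 0 the window is barely reduced; with alpha equal to 1
    the window is halved, which matches a conventional TCP reaction.
    min_cwnd is a hypothetical floor added for this sketch.
    """
    return max(min_cwnd, cwnd * (1 - alpha / 2))


if __name__ == "__main__":
    alpha = 0.0
    cwnd = 100.0
    # Hypothetical (marked, total) counts for three successive windows.
    for marked, total in [(0, 100), (20, 100), (100, 100)]:
        alpha = update_alpha(alpha, marked, total)
        if marked > 0:
            cwnd = reduce_cwnd(cwnd, alpha)
        print(f"marked={marked:3d} alpha={alpha:.4f} cwnd={cwnd:.2f}")
```

Because the reduction scales with how much of the window was marked, light congestion produces a small cut rather than a halving, which is one way to read the abstract's claim of full throughput with far less buffer occupancy.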