Effect of TCP SACK on throughput

On this Saturday evening, I have finally completed my TCP SACK analysis. This post had been on my mind for some time, and I have finally written it after building my big local Internet at home. You will also find some bits about generic receive offload, Wireshark tips, etc. Here is the topology used for the TCP SACK tests.

[Image: tcp_sack_bgp_setup]

First of all, this big setup isn't really necessary, but I am using it for my BGP tests and have found it suitable for a real-world scenario. What are we testing?

We are downloading a wireshark.exe file of around 30 MB from the debian1 (144.2.3.2) web server to the client PC hostE (60.60.60.60), simply by wget:

root@hostE:~# wget http://144.2.3.2/wireshark.exe

under three different conditions:

  • When there is no packet loss and SACK is enabled
  • When there is packet loss and the TCP SACK option is enabled on hostE
  • When there is packet loss and the TCP SACK option is disabled on hostE

For the completeness of the post, I would like to write a few bits about acknowledgement (ACK) and SACK in TCP as well.
An ACK segment in TCP means that the party sending it has received all bytes up to, but not including, the number set in the ACK field. Here is an example Wireshark output:

[Image: tcp_ack]

In this segment, we have received all bytes up to 102809 (note: relative numbers are used for simplicity). This is a cumulative number, which means that once the remote peer receives this ACK, it will remove the segments below this byte that are waiting in its retransmission queue. Hang on a second, which retransmission queue? When a host sends a segment, it puts a copy of it in its retransmission queue in case it is lost. Once the segment is acknowledged, the copy is removed from the queue.
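By the way, if you want to peek at this retransmission queue on the sender while a transfer is running, ss can display the TCP internals of a connection. This is just a quick sketch (the exact fields shown vary with the kernel version); the unacked counter in the output is the number of segments that are still sitting in the queue waiting to be acknowledged:

root@debian1:~# ss -tin dst 60.60.60.60
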
So far everything seems to be good. Let's assume that the Debian1 host sends 3 TCP segments and the 2nd one is lost.

[Image: TCP_ACK_cumulative]

What will hostE do? It can definitely ACK the 1st segment, but what about the 3rd? If we ACK the 3rd, Debian1 will assume that all segments, i.e. 1, 2 and 3, have been received and will remove them from its retransmission queue. That is why hostE can't ACK the 3rd segment. Here the TCP SACK option comes into play. When hostE acknowledges the 1st segment, it piggybacks the SACK information onto the ACK segment, i.e. it informs Debian1: "Hey peer, I have received the 1st segment, and I acknowledge the 3rd segment in the SACK option, not cumulatively, so I still have missing bytes." Debian1 tags the segments in this SACK byte range in its retransmission queue so that it doesn't have to send them again, as they have already arrived.
Now what do we do? HostE keeps asking for the 2nd segment, and after the 3rd duplicate ACK (the standard fast-retransmit threshold) Debian1 realizes that the segment has been lost; without waiting for the retransmission timeout, fast retransmission triggers and the 2nd segment is sent again. After this, hostE can finally ACK the 3rd segment cumulatively, after which Debian1 removes the remaining segments from its retransmission queue as it doesn't have to send them. The drawings above illustrate this. Let's look at a Wireshark capture to see what happens when there is packet loss.
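Before moving to the capture, a quick tip: Wireshark has ready-made analysis filters for exactly these events, so you can spot them in your own traces as well (field names as of recent Wireshark releases; double-check them in your version):

tcp.analysis.duplicate_ack
tcp.analysis.fast_retransmission
tcp.analysis.retransmission

The first one matches the duplicate ACKs sent by hostE, the other two match Debian1 resending the missing bytes.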

[Image: packet_loss_tcp_sack]

Now, focus on packet number 168. Wireshark says the previous segment was not captured. Indeed it wasn't captured, because it got lost in transmission: that previous segment never arrived at the 60.60.60.60 host. Because the packet has been lost, we keep asking for the missing segment in packets 169, 171 and 173 again and again, informing the 144.2.3.2 server that we have received everything up to byte 188241. At this point, however, 144.2.3.2 doesn't care much, as it still has window available to send more bytes, so it keeps sending. If we zoom in on packet 170,

[Image: tcp_sack_2]

we see that we have received bytes beyond the missing one at 188241, as Debian1 is still transmitting. Now let's zoom in on packet 171, the duplicate ACK which contains the SACK byte range.

[Image: tcp_sack_3]

Hey, here it is. We inform the Debian1 web server that we have received this range of bytes.
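If you prefer the command line, the same SACK edges can also be pulled out of the capture with tshark. This is only a sketch: capture.pcap is a placeholder for your own capture file, and the field names are taken from recent Wireshark releases:

root@hostE:~# tshark -r capture.pcap -Y "tcp.options.sack_le" -T fields -e frame.number -e tcp.options.sack_le -e tcp.options.sack_re
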

Now, after this SACK walkthrough, I would like to go back to the test cases I listed above. But so that Wireshark doesn't confuse us with oversized, gigantic IP packet sizes, we need to turn off "Generic Receive Offload" on hostE. For more details, you can check my other post to see why we really need this.

root@hostE:~# ethtool -K eth1 gro off
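If you want to confirm that the change took effect, ethtool can list the offload settings as well; look for the generic-receive-offload line in its output (eth1 is of course specific to my lab):

root@hostE:~# ethtool -k eth1 | grep generic-receive-offload
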

1) No packet loss, SACK enabled

On the topology I have drawn above, the AS8500 transit carrier consists of 8 routers and there is asymmetric routing involved, as can be seen below. I did this on purpose, but it is not directly related to this test.

root@debian1:~# traceroute -n 60.60.60.60
traceroute to 60.60.60.60 (60.60.60.60), 30 hops max, 60 byte packets
 1  144.2.3.1  9.950 ms  9.943 ms  9.936 ms
 2  212.6.1.2  14.878 ms  14.885 ms  14.889 ms
 3  212.6.1.10  24.865 ms  24.872 ms  24.868 ms
 4  172.40.1.2  24.808 ms  24.817 ms  24.810 ms
 5  31.1.1.3  44.659 ms  44.667 ms  44.662 ms
 6  60.60.60.60  29.771 ms  19.669 ms  24.457 ms
root@hostE:~# traceroute -n 144.2.3.2
traceroute to 144.2.3.2 (144.2.3.2), 30 hops max, 60 byte packets
 1  60.60.60.1  2.588 ms  2.583 ms  2.574 ms
 2  98.1.1.1  17.284 ms  17.278 ms 31.1.1.1  17.245 ms
 3  24.1.1.1  17.167 ms  17.214 ms 172.40.1.1  17.199 ms
 4  87.1.1.1  32.287 ms  32.282 ms  32.346 ms
 5  144.2.3.2  32.078 ms 192.168.196.1  31.871 ms 144.2.3.2  32.098 ms

When there is no packet loss, the wireshark.exe file is fetched in around 4-5 seconds.

root@hostE:~# wget http://144.2.3.2/wireshark.exe
--2015-01-31 20:07:01--  http://144.2.3.2/wireshark.exe
Connecting to 144.2.3.2:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 29826488 (28M) [application/x-msdos-program]
Saving to: `wireshark.exe.4'

100%[================================================================================>] 29,826,488  7.75M/s   in 4.6s    

2015-01-31 20:07:06 (6.21 MB/s) - `wireshark.exe.4' saved [29826488/29826488]

In this test, the presence of SACK doesn't make any difference anyway:

root@hostE:~# sysctl net.ipv4.tcp_sack
net.ipv4.tcp_sack = 1

2) Packet loss, SACK enabled
Now we need to introduce packet loss. This is very easy with Linux's network emulation (netem), so we add 1% packet loss on Debian1:

root@debian1:~# tc qdisc add dev eth1.956 root netem loss 1%
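For reference, the emulated loss can be inspected and later removed again with the commands below (same interface as above):

root@debian1:~# tc qdisc show dev eth1.956
root@debian1:~# tc qdisc del dev eth1.956 root netem
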

Packet loss is activated, now fetch the file!

root@hostE:~# wget http://144.2.3.2/wireshark.exe
--2015-01-31 20:48:50--  http://144.2.3.2/wireshark.exe
Connecting to 144.2.3.2:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 29826488 (28M) [application/x-msdos-program]
Saving to: `wireshark.exe.8'

100%[========================================================================================================================>] 29,826,488   232K/s   in 2m 13s

The download duration for the same file jumped from 5 seconds to 2 mins and 13 secs with only 1% packet loss. Fair enough, but this isn't what we are trying to find out. I have run the same download again and again, and on average every run takes more than 2 minutes.
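As a quick sanity check of the average rate: 2 mins and 13 secs is 133 seconds, so

root@hostE:~# echo $((29826488 / 133))
224259

which is roughly 220 KB/s, in the same ballpark as the rate wget reports.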

3) Packet loss, SACK disabled
Now we disable the SACK feature:

root@hostE:~# sysctl net.ipv4.tcp_sack=0
net.ipv4.tcp_sack = 0
root@hostE:~# wget http://144.2.3.2/wireshark.exe
--2015-01-31 21:08:15--  http://144.2.3.2/wireshark.exe
Connecting to 144.2.3.2:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 29826488 (28M) [application/x-msdos-program]
Saving to: `wireshark.exe.15'

100%[========================================================================================================================>] 29,826,488   370K/s   in 1m 43s  

2015-01-31 21:09:58 (282 KB/s) - `wireshark.exe.15' saved [29826488/29826488]

Now the download duration is 1 min 43 secs. I have downloaded the same file again and again, as I did for the previous test. On average, it stays below 2 minutes.

Still, these results don't prove much, but apparently SACK doesn't bring much benefit in a topology like mine. If you have a satellite connection, with small MTU and high latency constraints, it may well be beneficial though, as you don't resend segments that have already arrived over a link whose latency is above 500 ms.
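One housekeeping note: the sysctl change I made above only affects the running kernel, so SACK can simply be switched back on once the test is over:

root@hostE:~# sysctl net.ipv4.tcp_sack=1
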

By the way, during the investigation the wonderful Wireshark filter "tcp.analysis.lost_segment" helped me a lot! Look what it does:

[Image: wireshark_lost_segments]

By using this filter, I could analyse the amount of packet loss and compare how many packets I lost in each test.
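The same filter works in tshark too, which makes it easy to count the flagged packets for each capture. A small sketch (capture.pcap is again just a placeholder for your own capture file):

root@hostE:~# tshark -r capture.pcap -Y "tcp.analysis.lost_segment" | wc -l
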

Finally, I have reached the end of this post. It was a simple test to show how SACK works in practice. Please do let me know if you have any corrections or feedback.

