Slow file transfers must bother everyone. I have a ZyXEL NSA325 NAS device with a gigabit interface, but I am getting extremely low throughput. Unfortunately, this has been a problem, I think, since I bought the device, and I have finally found the time to troubleshoot it. Here is the topology I used in testing this scenario.
As per the topology above, my laptop and the NAS device are connected to two ports of a Juniper EX2200 switch. I have enabled jumbo frames on the switch ports, the laptop and the NAS device.
Path MTU discovery as deployed today relies on ICMP: you send an oversized packet that an intermediate host in the path can’t forward because the next-hop link has a lower MTU, and that hop notifies the source host. The notification is an ICMP Destination Unreachable “Fragmentation needed and DF set” message. But what happens if these ICMP notifications are blocked? Then we have a big problem, and sometimes it can be difficult to identify.
So in this post I would like to show a mitigation technique for the case where ICMP is blocked in the network. Let’s first look at the ICMP-blocked situation and then see how we can mitigate the problem by using the packetization layer path MTU discovery method described in RFC 4821, “Packetization Layer Path MTU Discovery”.
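On Linux, packetization layer probing for TCP is controlled by the `net.ipv4.tcp_mtu_probing` sysctl. A minimal sketch of checking and enabling it (the values are the documented ones: 0 = disabled, 1 = enabled only after an ICMP black hole is suspected, 2 = always on):

```shell
# Check the current setting (0 = off, 1 = on after a black hole
# is suspected, 2 = always on)
sysctl net.ipv4.tcp_mtu_probing

# Enable probing only when an ICMP black hole is suspected
sysctl -w net.ipv4.tcp_mtu_probing=1

# Starting MSS used when probing kicks in
sysctl net.ipv4.tcp_base_mss
```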
Following is the topology on which we carry out the tests.
Let’s first lower the MTU on segment 2. We do this on Host B (LAB1021-R1).
LAB1021-R1>ip link set vlan2201 mtu 1000
LAB1021-R1>ip link show vlan2201
15: vlan2201@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1000 qdisc noqueue state UP mode DEFAULT group default
link/ether 00:0c:29:42:66:60 brd ff:ff:ff:ff:ff:ff
Yes we have a lower MTU now.
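To verify the new limit, a DF-flagged ping towards that segment works well (Linux ping assumed; 10.11.2.1 is a hypothetical address on the far side of the 1000-byte link, so adjust to your lab; 1000 bytes of MTU minus 20 bytes of IP header and 8 bytes of ICMP header leaves 972 bytes of payload):

```shell
# 972 bytes of payload + 28 bytes of headers = exactly 1000 bytes on the wire
ping -M do -s 972 -c 1 10.11.2.1   # should succeed
ping -M do -s 973 -c 1 10.11.2.1   # should fail with "message too long"
```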
The TCP sliding window is a crucial concept for understanding how TCP behaves. To see the mechanism in action, I rate-limited an HTTP download and observed what happens; in this scenario we will see Wireshark report [TCP Window Full] and [TCP ZeroWindow]. The aim of this post is to show how Wireshark determines that the window is full.
We have a web server and a client machine in this setup. We intentionally rate limit the traffic by using wget, which allows us to investigate this scenario.
root@LAB1021-PC10:~# wget http://10.11.5.2/test.iso --limit-rate=50K
Connecting to 10.11.5.2:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1048576 (1.0M) [application/x-iso9660-image]
Saving to: test.iso
100%[================================================================>] 1,048,576 50.3KB/s in 20s
(50.1 KB/s) - test.iso saved [1048576/1048576]
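The same window-exhaustion behavior can be reproduced without a web server at all. Here is a minimal sketch in Python (loopback only; the buffer sizes are arbitrary choices, not anything from the lab above): the receiver accepts the connection but never reads, so its receive buffer fills, the advertised window shrinks towards zero, and the sender’s kernel eventually refuses to queue more data.

```python
import socket

# Receiver: accepts the connection but never calls recv(), so its
# receive buffer fills up and TCP advertises a shrinking window.
srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4096)
srv.bind(("127.0.0.1", 0))
srv.listen(1)

# Sender: small send buffer so the stall happens quickly.
cli = socket.socket()
cli.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 4096)
cli.connect(srv.getsockname())
conn, _ = srv.accept()

cli.setblocking(False)
sent = 0
try:
    while True:
        # Keeps succeeding while the local send buffer and the
        # peer's receive buffer still have room...
        sent += cli.send(b"x" * 1024)
except BlockingIOError:
    # ...then fails once both are full -- the point at which
    # Wireshark would flag [TCP Window Full] / [TCP ZeroWindow].
    pass

print("bytes queued before the stall:", sent)
conn.close(); cli.close(); srv.close()
```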
Van Jacobson is a prominent figure in networking, especially in TCP/IP. What I didn’t know was that (according to Wikipedia) the original traceroute was also written by him. As this tool is the Swiss Army knife of a tech support engineer, I would like to share the meaning of some of its outputs. If you have seen any other errors, please share them here to improve the list.
This sample output indicates that the network to which the IP 10.1.2.4 belongs is unknown to the host 10.11.1.1. If you take a packet capture when you see this error, you will see that you receive an ICMP Destination Unreachable “Network Unreachable” message from this host.
root@LAB1021-PC10:~# traceroute -n 10.1.2.4
traceroute to 10.1.2.4 (10.1.2.4), 30 hops max, 60 byte packets
1 10.11.6.1 0.620 ms 0.522 ms 0.450 ms
2 10.11.1.1 1.260 ms !N 1.205 ms !N * <----
This error, however, indicates that the IP network is available but the individual host 10.11.2.4 can’t be reached. The last host (10.11.1.1), which is supposed to provide connectivity to the destination device, returns an ICMP Destination Unreachable “Host Unreachable” message to the source host.
root@LAB1021-PC10:~# traceroute -n 10.11.2.4
traceroute to 10.11.2.4 (10.11.2.4), 30 hops max, 60 byte packets
1 10.11.6.1 0.763 ms 0.744 ms 0.993 ms
2 10.11.1.1 1.979 ms 2.017 ms 2.003 ms
3 10.11.1.1 2999.349 ms !H 2999.359 ms !H 2999.397 ms !H <----
This error is received when you are doing some PMTU discovery. The intermediate host that can’t deliver the oversized packet returns an ICMP Destination Unreachable “Fragmentation needed and DF bit set” message.
root@LAB1021-PC10:~# traceroute -n 10.11.5.2 -F 1400
traceroute to 10.11.5.2 (10.11.5.2), 30 hops max, 1400 byte packets
1 10.11.6.1 1.574 ms 1.534 ms 1.492 ms
2 10.11.1.1 2.497 ms 2.649 ms 2.570 ms
3 10.11.1.1 2.546 ms !F-1000 2.532 ms !F-1000 2.441 ms !F-1000 <---- It informs us that the next-hop MTU on host 10.11.1.1 is 1000
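The `!F-1000` annotation comes straight from the ICMP payload: per RFC 1191, a type 3 code 4 message carries the next-hop MTU in bytes 6-7 of the ICMP header. A small sketch with a hand-built (not captured) message:

```python
import struct

# Hand-built ICMP "Destination Unreachable / Fragmentation needed"
# header: type 3, code 4, checksum 0 (not computed here), 2 unused
# bytes, then the 16-bit next-hop MTU field defined by RFC 1191.
icmp_hdr = struct.pack("!BBHHH", 3, 4, 0, 0, 1000)

itype, code, _csum, _unused, nexthop_mtu = struct.unpack("!BBHHH", icmp_hdr)
assert (itype, code) == (3, 4)       # Fragmentation needed and DF set
print("next-hop MTU advertised:", nexthop_mtu)  # -> next-hop MTU advertised: 1000
```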
Have you seen any other error letters? Drop a comment here!
Traceroute is a great tool for discovering the path a packet traverses in the outgoing direction, but if there is an MPLS cloud in the path, you may see some unexpected behavior unless you do some tweaking. First, let’s see how traceroute discovers a path when there isn’t an MPLS cloud.
The network above uses IP to route packets, and we are running traceroute on the GW2 device towards the Debian1 device.
root@GW2> traceroute 22.214.171.124 no-resolve
traceroute to 126.96.36.199 (188.8.131.52), 30 hops max, 40 byte packets
1 184.108.40.206 19.899 ms 20.021 ms 19.920 ms
2 220.127.116.11 19.890 ms 20.093 ms 19.964 ms
We can clearly see the two hops in our traceroute output. The IP addresses displayed in the output belong to the ingress interfaces of our probe packets. For this traceroute I also took a packet capture on the ingress interface of GW1, i.e. the 18.104.22.168 side.
By default, Junos and Linux traceroute use UDP probe packets, and each hop receives three UDP datagrams.
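The probe mechanics are easy to sketch in Python: a plain UDP socket whose IP TTL is raised hop by hop, with destination ports counting up from traceroute’s customary base port of 33434. Receiving the ICMP Time Exceeded / Port Unreachable replies requires a raw socket and root, so only the probe side is shown, and the probes here go to loopback just to have a harmless destination.

```python
import socket

# Sketch of how a UDP traceroute builds its probes: one TTL value
# per hop, three probes per hop, destination ports counting up from
# the customary base port 33434.
BASE_PORT = 33434
probes = []
for ttl in range(1, 4):                              # hops 1..3
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    for _ in range(3):                               # 3 probes per hop
        port = BASE_PORT + len(probes)
        s.sendto(b"", ("127.0.0.1", port))           # loopback stand-in target
        probes.append((ttl, port))
    s.close()

print(probes[0], probes[3], probes[-1])
```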
On this Saturday evening, I have finally completed my work on TCP SACK analysis. This post had been on my mind for some time, but now I have done it, after building my big local Internet at home. You will also find some material about receive segmentation offload, Wireshark tips, etc. Here is the topology used for the TCP SACK tests.
First of all, this big setup isn’t really necessary, but I use it for my BGP tests and have found it suitable for a real-world scenario. What are we testing?
I couldn’t really find a suitable title for this post, but I will try to answer the following questions:
- How can we fragment an IP packet manually in scapy?
- What does a fragmented packet look like, and where does the transport layer (TCP/UDP) header end up?
- How do we forward fragmented packets? Do we reassemble them?
- If we don’t reassemble, can we force reassembly?
First of all, a bit of theory. If an incoming IP packet is to be forwarded to another next hop and the MTU of the new path is smaller than the packet to be transmitted, we must find a way to forward it. If the packet has the DF (Don’t Fragment) bit set, i.e. we are instructed, most probably by the source, not to fragment the packet, then we are expected to send an ICMP “Fragmentation needed” message and pray that on the way back to the source no device blocks all ICMP traffic. The second scenario is when the source lets us fragment the packet. Then we need to fragment it, and the story from now on is about this part of the scenario. The topology we will use is shown below.
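Before reaching for scapy, the arithmetic itself is worth seeing. Here is a sketch of what any fragmenting router (and, under similar rules, scapy’s own fragmentation helper) has to compute: each fragment’s data length must be a multiple of 8 (except the last), the fragment offset is counted in 8-byte units, and the MF (More Fragments) flag is set on every fragment except the last. A 20-byte IP header without options is assumed.

```python
# Fragmentation arithmetic sketch: split a payload so that each
# fragment's data fits the outgoing MTU, with offsets counted in
# 8-byte units and MF set on all fragments but the last.
IP_HDR = 20  # assumed header size, no IP options

def fragment(payload_len: int, mtu: int):
    per_frag = (mtu - IP_HDR) // 8 * 8      # data per fragment, multiple of 8
    frags, offset = [], 0
    while offset < payload_len:
        data = min(per_frag, payload_len - offset)
        more = offset + data < payload_len
        frags.append({"offset_units": offset // 8, "len": data, "MF": more})
        offset += data
    return frags

# A 1400-byte payload crossing a 1000-byte-MTU link:
for f in fragment(1400, 1000):
    print(f)
```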