Have you ever thought how the IP addresses are chosen/selected in icmp time exceeded error messages when you run a traceroute command? Recently I was analyzing an issue and this really made a difference in troubleshooting. I have done the analysis on an SRX firewall and a Linux device and I have got different results. I haven’t really prepared a setup for this but I will try to show this in a sample traceroute output.
root@debian1:~# traceroute -n 18.104.22.168
traceroute to 22.214.171.124 (126.96.36.199), 30 hops max, 60 byte packets
1 192.168.103.1 1.526 ms 1.520 ms 1.531 ms
2 192.168.2.2 9.599 ms 9.584 ms 9.597 ms
This is just a snippet of my traceroute. 192.168.103.1 is actually an SRX device and this IP address is tied to vlan.103 interface to which my packet (UDP traceroute) has entered. What if packet returned to me for example via vlan.104 interface. Would I still see the 192.168.103.1 IP address in the output? Answer is yes according to my tests. What I have noticed is the following on an SRX box if there is asymmetric routing;
- In flow mode SRX: device is sending icmp time exceeded message with source address of vlan.103 through the interface that the packet entered. Regardless of the routing error is sent via the incoming path
- In packet mode SRX: device is sending icmp time exceeded message with the source address of vlan.103 through the vlan.104
As you can see even if the device can follow a different path to you, ICMP time exceeded error message is sourced from the incoming interface. This makes troubleshooting a bit difficult actually as you don’t really understand how the remote device is returning the traffic back to you unless you take packet capture.
What about Linux? Linux’s behaviour is also different. If we put Linux instead of SRX on this topology and packet enters via vlan.103 and return via vlan.104, Linux will send ICMP time exceeded packet through the vlan.104 interface with source address of vlan.104. This means if your UDP packet (traceroute) is forwarded through a Linux device and there is an asymmetric route, you can notice immediately if you know the interface IP assignments as traceroute will display a different IP address then you expect.
To be honest, I have searched several RFCs in order to find if there is any RFC requirement in ICMP time exceeded source address selection but couldn’t find anything. If you know anything, please leave a comment:)
Update: Later I have found the nanog traceroute presentation in which I have found the answer for this mystery.