ARP, GARP and IPv6 neighbor discovery
I would like to focus more on IPv6 in upcoming posts, and I think the best topic to start with is the discovery phase. Before delving into IPv6, though, I need to write about how address resolution works in the IPv4 world. I read a couple of RFCs as well, so you may find something that you didn’t know before. I also touch on GARP and share my test results. Let’s start with an outline of what we will see in this post.
- How does ARP work?
- What is GARP and under what conditions do we send this packet?
- How does IPv6 neighbor discovery work?
We will use the following topology in these tests.
I have used a virtual SRX and an Ubuntu Linux host, details of which are below.
vSRX
{primary:node0}
root@CO-A-1> show version
node0:
--------------------------------------------------------------------------
Hostname: CO-A-1
Model: firefly-perimeter
JUNOS Software Release [12.1X47-D20.7]
Linux
root@vhost3:~# uname -a
Linux vhost3 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
How does ARP work?
IP addresses are too high-level for network interface cards to communicate with directly. Once routing has taken place and the egress interface is determined, we still need to find out which link-layer address the frame should be handed to. For example, if we ping the SRX IPv4 address 192.168.36.1, we know which IP to reach and our route table gives us the interface, but we can’t send the frame without knowing the Layer 2 address of the destination device. To find this MAC address, the sender sends an ARP request, and the device that has the IP 192.168.36.1 assigned responds to it. Let’s do a test without much talking.
First we check if we have any entry in our ARP cache.
root@LAB1020-PC1:~# arp -n
We have no entry. Now send an ICMP echo towards vSRX.
root@LAB1020-PC1:~# ping 192.168.36.1 -c 1
PING 192.168.36.1 (192.168.36.1) 56(84) bytes of data.
64 bytes from 192.168.36.1: icmp_seq=1 ttl=64 time=1.55 ms

--- 192.168.36.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.558/1.558/1.558/0.000 ms
Now check the ARP cache again.
root@LAB1020-PC1:~# arp -n
Address          HWtype  HWaddress           Flags Mask  Iface
192.168.36.1     ether   00:10:db:ff:30:01   C           vlan1125
Yes, the ARP cache is now populated with the SRX’s MAC address.
Now check the ARP cache on the vSRX.
{primary:node0}
root@CO-A-1> show arp no-resolve expiration-time | match 192.168.36.2
00:0c:29:42:66:60 192.168.36.2 reth1.1125 none 1232
Yes, it is also populated. Our target device (the vSRX) also updates its ARP cache with the sender’s MAC address that it sees in the ARP request. Take a look at the packet capture of this ARP request.
The destination MAC address is the broadcast address, as we don’t know which MAC address the target device has, but the response is sent to us directly from the destination.
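To make the layout of such a request concrete, here is a minimal Python sketch that builds the bytes of a broadcast ARP request for 192.168.36.1. It only constructs the frame in memory and sends nothing; the helper names `mac_to_bytes` and `build_arp_request` are my own, and the zeroed target MAC in the request is an assumption about what a typical host sends.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying a broadcast ARP request (hypothetical helper)."""
    broadcast = b"\xff" * 6
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP)
    eth_header = broadcast + mac_to_bytes(sender_mac) + struct.pack("!H", 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address size
        4,                            # protocol address size
        1,                            # opcode: request
        mac_to_bytes(sender_mac),     # sender MAC
        socket.inet_aton(sender_ip),  # sender IP
        b"\x00" * 6,                  # target MAC: unknown, assumed all zeros
        socket.inet_aton(target_ip),  # target IP
    )
    return eth_header + arp_payload


frame = build_arp_request("00:0c:29:42:66:60", "192.168.36.2", "192.168.36.1")
print(len(frame), frame[:6].hex(":"))
```

The protocol type and size fields in the payload are what would, in principle, let ARP carry addresses other than IPv4.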
Now the question is:
Are ARP requests sent to the broadcast address all the time?
The answer is no. When I repeated this test some time later, I saw that the Linux host was sending an ARP request directly to the unicast MAC address instead of the broadcast address. The explanation for this behavior is in RFC 1122, “Requirements for Internet Hosts”, section 2.3.2.1. The RFC states that “The ARP specification suggests but does not require a timeout mechanism to invalidate cache entries when hosts change their Ethernet addresses”. For this reason, one of the cache validation mechanisms it describes is the Unicast Poll method, quoted below.
(2) Unicast Poll -- Actively poll the remote host by periodically sending a point-to-point ARP Request to it, and delete the entry if no ARP Reply is received from N successive polls. Again, the timeout should be on the order of a minute, and typically N is 2.
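The Unicast Poll idea can be sketched roughly as below. This is only an illustration of the logic in the RFC text, not how the Linux kernel implements it: the `ArpCache` class and the `send_unicast_request` callback are hypothetical, and no real packets are sent.

```python
from typing import Callable, Dict


class ArpCache:
    """Toy ARP cache that drops an entry after N unanswered unicast polls."""

    def __init__(self, max_failed_polls: int = 2) -> None:
        self.max_failed_polls = max_failed_polls  # "typically N is 2"
        self.entries: Dict[str, str] = {}         # IP -> MAC
        self.failed_polls: Dict[str, int] = {}    # IP -> consecutive failures

    def learn(self, ip: str, mac: str) -> None:
        self.entries[ip] = mac
        self.failed_polls[ip] = 0

    def poll(self, ip: str, send_unicast_request: Callable[[str, str], bool]) -> None:
        """Poll one entry; the callback returns True if an ARP reply came back."""
        mac = self.entries.get(ip)
        if mac is None:
            return
        if send_unicast_request(ip, mac):
            self.failed_polls[ip] = 0
            return
        self.failed_polls[ip] += 1
        if self.failed_polls[ip] >= self.max_failed_polls:
            # N successive polls went unanswered: invalidate the entry
            del self.entries[ip]
            del self.failed_polls[ip]


cache = ArpCache()
cache.learn("192.168.36.1", "00:10:db:ff:30:01")
silent_host = lambda ip, mac: False  # pretend the peer never answers
cache.poll("192.168.36.1", silent_host)
still_cached_after_one = "192.168.36.1" in cache.entries
cache.poll("192.168.36.1", silent_host)
still_cached_after_two = "192.168.36.1" in cache.entries
print(still_cached_after_one, still_cached_after_two)
```

In a real host the poll would be driven by a timer on the order of a minute, as the RFC suggests.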
Now I am doing another test. I have cleared all ARP caches, i.e. the hosts have no MAC entries anymore. I will ping PC2 from PC1: PC1 will first issue a broadcast ARP request, and both PC2 and the vSRX will receive it. But will the vSRX populate its ARP cache?
root@LAB1020-PC2:~# arp -n
Address          HWtype  HWaddress           Flags Mask  Iface
192.168.36.2     ether   00:0c:29:42:66:60   C           vlan1125

{primary:node0}
root@CO-A-1> show arp no-resolve expiration-time | match 192.168.36.2

{primary:node0}
What do we make of this? The vSRX did receive the ARP request but didn’t update its cache. This is normal behavior, but if you want to change it, there is a knob called “passive-learning”. If you clear the caches and set the following command on the vSRX, you will see that it updates its ARP cache even with this so-called third-party ARP.
set system arp passive-learning
What is GARP and under what conditions do we send this packet?
GARP stands for Gratuitous ARP, but to me the name doesn’t make sense. When you look up the meaning of “gratuitous”, you find “without reason, unjustified”; however, GARP has a very good reason :) The best example is high availability. Taking an SRX chassis cluster as an example, if your rethX interface fails over from one node to the other, your virtual MAC, e.g. “00:10:db:ff:30:01”, is physically no longer reachable on the former port. You have to inform the connected Layer 2 devices that your MAC has moved to the new node; otherwise you will blackhole the traffic.
You can see an example of a GARP packet below.
This GARP message doesn’t come from a redundancy group failover, though, but from an IP address modification. I will come to that, but first we need to understand that GARP is an ARP packet in which the Sender and Target IP address fields are both set to the sender device’s IP, since all we want is to inform folks in the segment that we are here with this IP and MAC, nothing else.
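That definition is easy to express in code. The following Python sketch builds an ARP payload with identical sender and target IP fields and checks for that property. The function names are hypothetical, and I am assuming a GARP sent as an ARP request with a zeroed target MAC; implementations differ here, and some send it as a reply instead.

```python
import socket
import struct

# htype, ptype, hlen, plen, opcode, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"


def build_garp_payload(mac: bytes, ip: str) -> bytes:
    """Build a 28-byte ARP payload announcing ip/mac (assumed request form)."""
    addr = socket.inet_aton(ip)
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, 1, mac, addr, b"\x00" * 6, addr)


def is_gratuitous(arp_payload: bytes) -> bool:
    """True if the sender and target IP fields carry the same address."""
    fields = struct.unpack(ARP_FORMAT, arp_payload[:28])
    sender_ip, target_ip = fields[6], fields[8]
    return sender_ip == target_ip


garp = build_garp_payload(bytes.fromhex("0010dbff3001"), "192.168.36.1")
print(is_gratuitous(garp))
```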
Now, under which conditions do we send a GARP? I made several modifications to the interface configuration and found several of them. As it is a virtual SRX, it is very easy to do a packet capture and see what happens on the wire. Here is the list, you will like it!
- IP address is modified or added
- family inet is deactivated/activated
- When vlan-id is modified, GARP is sent on the new vlan 5 times
- When an interface is disabled/enabled, GARP is sent 4 times. Likewise if the node is rebooted, once the interface is UP, 4 GARP packets are sent
- When a redundancy group fails over to the other node, 5 GARP packets are sent on every vlan. (This is also true if an unexpected failover occurs e.g node1 is down)
How does IPv6 neighbor discovery work?
The designers of IPv6 have taken a slightly different approach to address resolution. When you look at an ARP packet, you can see that there are protocol type and protocol size fields, i.e. IPv6->MAC resolution could have been done via ARP, but in IPv6 we use ICMPv6 instead of ARP.
First, assign our IPv6 addresses on PC1 and the vSRX.
Make sure IPv6 flow mode is enabled. This requires a reboot.
set security forwarding-options family inet6 mode flow-based
set interfaces reth1 unit 1125 family inet6 address 2015:1020::1/64
Needless to say, the interface must be assigned to a security zone.
Setting the IPv6 address on PC1:
ip addr add 2015:1020::2/64 dev vlan1125
Now it is ping time.
root@LAB1020-PC1:~# ping6 2015:1020::1 -c 1
PING 2015:1020::1(2015:1020::1) 56 data bytes
64 bytes from 2015:1020::1: icmp_seq=1 ttl=64 time=3.48 ms

--- 2015:1020::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.486/3.486/3.486/0.000 ms
Heyyy, we pinged our IPv6 destination. This time we check the IPv6 neighbor table.
root@LAB1020-PC1:~# ip -6 neigh
2015:1020::1 dev vlan1125 lladdr 00:10:db:ff:30:01 router STALE <---MAC we have learned
We see that we have learned the MAC address. Now let's look at what happens under the hood.
In the IPv6 world, this request is called a "Neighbor Solicitation". It is ICMPv6 Type 135. In the request we put the sender's link-layer address (carried in the Source Link-Layer Address option) and the Target IPv6 address. We can focus on two more addresses here too.
Ethernet Destination address: 33:33:ff:00:00:01
IPv6 Destination address: ff02::1:ff00:1
If you check http://tools.ietf.org/html/rfc7042#section-2.3.1, you will see that MAC addresses starting with 33:33 are used for IPv6 multicast. To see how they are derived, see RFC 2464.
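As a quick check of that mapping, here is a small Python sketch that takes an IPv6 multicast address and produces the Ethernet destination by putting 33:33 in front of the last four bytes of the address. The function name `multicast_mac` is my own.

```python
import ipaddress


def multicast_mac(ipv6_multicast: str) -> str:
    """Map an IPv6 multicast address to its Ethernet multicast MAC."""
    addr = ipaddress.IPv6Address(ipv6_multicast)
    if not addr.is_multicast:
        raise ValueError(f"{ipv6_multicast} is not an IPv6 multicast address")
    low32 = addr.packed[-4:]  # last four bytes of the 16-byte address
    return ":".join(f"{b:02x}" for b in b"\x33\x33" + low32)


print(multicast_mac("ff02::1:ff00:1"))  # 33:33:ff:00:00:01
```

This matches the Ethernet destination address seen in the Neighbor Solicitation above.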
So our IPv6 destination address is a multicast address. It is called the Solicited-Node Multicast Address. To see how it is derived see wiki
By the way, you can see the multicast addresses the Linux host has joined as below.
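The derivation itself is short: take the low 24 bits of the unicast address and append them to the prefix ff02::1:ff00:0/104. A minimal Python sketch, with a function name of my own choosing:

```python
import ipaddress

# Solicited-node prefix ff02::1:ff00:0/104 as an integer
SOLICITED_NODE_PREFIX = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node(unicast: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_PREFIX | low24)


print(solicited_node("2015:1020::1"))  # ff02::1:ff00:1
print(solicited_node("2015:1020::2"))  # ff02::1:ff00:2
```

The first result is the IPv6 destination of our Neighbor Solicitation for the vSRX, and the second is the group PC1 has to listen on for its own address 2015:1020::2.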
root@LAB1020-PC1:~# ip -6 maddr
1: lo
    inet6 ff02::1
    inet6 ff01::1
25: vlan1125
    inet6 ff02::1:ff42:6660
    inet6 ff02::1:ff00:2 <-----
    inet6 ff02::1
    inet6 ff01::1
It isn't very difficult to understand, I believe. It is slightly different from an ARP packet.
Now it is time to look at the response, i.e. the Neighbor Advertisement.
Here we can see that the Target's link-layer address has been sent to the Sender device. I have also marked the flags as they deserve a word too. The response is received with a couple of flags turned on. You can check all of their meanings in Neighbor Discovery for IP version 6 (IPv6) (RFC 4861).
I am just copying from the RFC what the Router flag is:
R Router flag. When set, the R-bit indicates that the sender is a router. The R-bit is used by Neighbor Unreachability Detection to detect a router that changes to a host.
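To show where these flags sit in the packet, here is a small Python sketch that reads the R, S and O bits from the first 32-bit word after the ICMPv6 header of a Neighbor Advertisement (Type 136). The sample message is hand-built for illustration, not taken from my capture, and the function name is hypothetical.

```python
import struct

NEIGHBOR_ADVERTISEMENT = 136


def parse_na_flags(icmpv6_message: bytes) -> dict:
    """Extract Router/Solicited/Override flags from a Neighbor Advertisement."""
    if icmpv6_message[0] != NEIGHBOR_ADVERTISEMENT:
        raise ValueError("not a Neighbor Advertisement")
    # Bytes 0-3: type, code, checksum. Bytes 4-7: R, S, O bits + reserved.
    (flags_word,) = struct.unpack("!I", icmpv6_message[4:8])
    return {
        "router": bool(flags_word & 0x80000000),
        "solicited": bool(flags_word & 0x40000000),
        "override": bool(flags_word & 0x20000000),
    }


# Hand-built sample: type 136, code 0, zero checksum, R+S+O set, zeroed target
sample = bytes([136, 0, 0, 0]) + struct.pack("!I", 0xE0000000) + bytes(16)
print(parse_na_flags(sample))
```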
Actually, not only this RFC but also the RFCs that update it (5942, 6980, 7048, 7527, 7559) should be read. I hope I will read them too.
Last but not least, when you assign an IPv6 address to an interface, the address may not be operational after the DAD (Duplicate Address Detection) mechanism has run. To make sure the IP is operational,
check the following output.
{primary:node0}
root@CO-A-1> show interfaces reth1.1125
  Logical interface reth1.1125 (Index 121) (SNMP ifIndex 590)
    Description: LAB1020
    Flags: SNMP-Traps 0x4000 VLAN-Tag [ 0x8100.1125 ] Encapsulation: ENET2
    Statistics        Packets        pps         Bytes          bps
    Bundle:
        Input :            31          0          2108            0
        Output:            26          0          2056            0
    Security: Zone: LAB1020-LAN
    Allowed host-inbound traffic : bootp dns dhcp finger ftp tftp ident-reset http https ike netconf ping reverse-telnet reverse-ssh rlogin rpm rsh snmp snmp-trap ssh telnet traceroute xnm-clear-text xnm-ssl lsping ntp sip dhcpv6 r2cp
    Protocol inet, MTU: 1500
      Flags: Sendbcast-pkt-to-re, Is-Primary
      Addresses, Flags: Is-Default Is-Preferred Is-Primary
        Destination: 192.168.36/24, Local: 192.168.36.1, Broadcast: 192.168.36.255
    Protocol inet6, MTU: 1500
      Flags: Is-Primary
      Addresses, Flags: Is-Default Is-Preferred Is-Primary <-----Preferred
        Destination: 2015:1020::/64, Local: 2015:1020::1
      Addresses, Flags: Is-Preferred
        Destination: fe80::/64, Local: fe80::210:db04:65ff:3001
If you don't see this Preferred flag but Tentative instead, then there is something wrong. But what is tentative?
Here is the quote from RFC 2462:
tentative address - an address whose uniqueness on a link is being verified, prior to its assignment to an interface. A tentative address is not considered assigned to an interface in the usual sense. An interface discards received packets addressed to a tentative address, but accepts Neighbor Discovery packets related to Duplicate Address Detection for the tentative address.
If you are sure that there isn't any duplicate address, you can also try disabling DAD as below.
set interfaces reth1.1125 family inet6 dad-disable
It was really a long post but I had to write this. I will also write about DNS64 and NAT64 when I get round to it. This is also something I would like to document. As my little one is on vacation now, I believe I will have more time to write:-)
Genco.