JNCIP-SEC [ 4 – High Availability ]
Today’s post is about high availability, which is one of the topics of the JNCIP-SEC exam. It doesn’t cover everything, though, as it only reflects my own studies. Let’s get started.
Test Topology
Test Platform: 2 x SRX 210 with JunOS 10.4R6.5
Before configuring my SRX210s as a cluster, I must remove some configuration items to avoid errors once the cluster is enabled. On each SRX, run the following:
delete system host-name
delete vlans
delete interfaces vlan
delete security
delete interfaces ge-0/0/1 unit 0 family ethernet-switching
delete interfaces fe-0/0/2 unit 0 family ethernet-switching
delete interfaces fe-0/0/3 unit 0 family ethernet-switching
delete interfaces fe-0/0/4 unit 0 family ethernet-switching
delete interfaces fe-0/0/5 unit 0 family ethernet-switching
delete interfaces fe-0/0/6
delete interfaces fe-0/0/7
The fe-0/0/6 interface becomes the management interface (fxp0) in the cluster, so its configuration must be removed.
The fe-0/0/7 interface becomes the control interface (fxp1), so its configuration must also be removed.
After this operation make sure there is no ethernet-switching left:
root@srx1# show | match ethernet-switching | count
Count: 0 lines

[edit]
root@srx1#
Looks good so far!
It is time to set cluster and reboot:
root@srx1> set chassis cluster cluster-id 1 node 0 reboot
root@srx2> set chassis cluster cluster-id 1 node 1 reboot
After the reboot, if you watch the prompt of srx1, you will see it change through the following states:
{hold:node0}
root@srx1>

{secondary:node0}
root@srx1>

{primary:node0}
root@srx1>
Check cluster status:
root@srx1> show chassis cluster status
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 1
    node0                   1           primary        no       no
    node1                   1           secondary      no       no
Configure management interfaces on the first node only (srx1):
set groups node0 system host-name srx1
set groups node0 interfaces fxp0 unit 0 family inet address 10.1.1.1/24
set groups node1 system host-name srx2
set groups node1 interfaces fxp0 unit 0 family inet address 10.1.1.2/24
set apply-groups "${node}"
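The last line is what makes a single shared configuration produce different settings on each member: "${node}" expands to node0 on the first node and node1 on the second, so each node inherits only its own group. Here is a minimal Python sketch of that idea. It is only a model for illustration, not how Junos implements it, and the dictionary layout and function name are my own inventions.

```python
# Simplified model of apply-groups "${node}": each cluster member
# inherits only the group whose name matches its own node name.
GROUPS = {
    "node0": {"host-name": "srx1", "fxp0": "10.1.1.1/24"},
    "node1": {"host-name": "srx2", "fxp0": "10.1.1.2/24"},
}


def effective_settings(node_id: int, groups: dict = GROUPS) -> dict:
    """Return the settings a given cluster member would inherit."""
    group_name = f"node{node_id}"  # what "${node}" expands to on that member
    return dict(groups[group_name])


if __name__ == "__main__":
    for node_id in (0, 1):
        print(node_id, effective_settings(node_id))
```

Both nodes carry both groups in their configuration, but each one only applies the group that matches its own name.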
Configure fabric links on the first node only (srx1):
delete interfaces fe-0/0/5.0
set interfaces fab0 fabric-options member-interfaces fe-0/0/5
set interfaces fab1 fabric-options member-interfaces fe-2/0/5
commit
Pay attention: on SRX210 models, fe-0/0/5 is port 5 of srx1 (node 0) and fe-2/0/5 is port 5 of srx2 (node 1).
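To keep the renumbering straight, here is a small Python sketch that maps a standalone interface name to the name it gets in the cluster. The offset of 2 is taken from this post (SRX210 only); other SRX models use different offsets, so treat the constant as an assumption to change for your platform. The function name is my own.

```python
import re

# From this post: on an SRX210 cluster, fe-0/0/5 of node 1 shows up as fe-2/0/5.
# Other platforms use a different offset, so this value is SRX210-specific.
SRX210_NODE1_SLOT_OFFSET = 2


def cluster_interface_name(local_name: str, node_id: int,
                           slot_offset: int = SRX210_NODE1_SLOT_OFFSET) -> str:
    """Translate a standalone name such as fe-0/0/5 to its cluster-wide name."""
    match = re.fullmatch(r"([a-z]+)-(\d+)/(\d+)/(\d+)", local_name)
    if match is None:
        raise ValueError(f"unexpected interface name: {local_name!r}")
    media, fpc, pic, port = match.groups()
    # Node 0 keeps its numbering; node 1 has the offset added to the first number.
    new_fpc = int(fpc) + node_id * slot_offset
    return f"{media}-{new_fpc}/{pic}/{port}"


if __name__ == "__main__":
    print(cluster_interface_name("fe-0/0/5", node_id=0))
    print(cluster_interface_name("fe-0/0/5", node_id=1))
```

The same rule explains the ge-2/0/0 and ge-2/0/1 names used for the redundant interfaces later in this post.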
After commit, config should sync into srx2 node as well.
Now check cluster interfaces status:
root@srx1> show chassis cluster interfaces
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name    Child-interface    Status
fab0    fe-0/0/5           up
fab0
fab1    fe-2/0/5           up
fab1
Fabric link status: Up
REDUNDANCY GROUPS
A cluster without an RG is useless. Let’s create redundancy groups and test them.
RG0 is used for control plane and RG1 and RG2 will be our service RGs.
{primary:node0}[edit chassis cluster]
root@srx1# show
reth-count 2;
redundancy-group 0 {
    node 0 priority 100;
    node 1 priority 99;
}
redundancy-group 1 {
    node 0 priority 100;
    node 1 priority 99;
    preempt;
    interface-monitor {
        ge-0/0/0 weight 255;
        ge-0/0/1 weight 255;
    }
}
redundancy-group 2 {
    node 0 priority 100;
    node 1 priority 99;
    preempt;
    interface-monitor {
        ge-0/0/0 weight 255;
        ge-0/0/1 weight 255;
    }
}
RG1 has node 0 as its primary node because node 0 has the higher priority. We enable preempt so that, once the condition that caused a failover to the secondary node (node 1) is resolved, RG1 fails back to node 0. We also monitor ge-0/0/0 and ge-0/0/1 with a weight of 255 each, so if either of these links fails, RG1 fails over to node 1. RG2 is configured the same way.
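The weights work like a budget: each redundancy group starts with a threshold of 255, the weight of every failed monitored interface is subtracted from it, and the group fails over when the threshold reaches zero. The Python sketch below is a minimal model of that arithmetic, not Junos code. The function names are mine, and the 128/128 weights in the second example are hypothetical values that do not appear in my lab.

```python
RG_THRESHOLD = 255  # starting threshold of a redundancy group


def remaining_threshold(monitored: dict, down: set) -> int:
    """Subtract the weight of every failed monitored interface from the threshold."""
    lost = sum(weight for name, weight in monitored.items() if name in down)
    return max(RG_THRESHOLD - lost, 0)


def should_fail_over(monitored: dict, down: set) -> bool:
    """The group fails over once the remaining threshold hits zero."""
    return remaining_threshold(monitored, down) == 0


# Weights from this post: a single failed link is enough.
LAB_WEIGHTS = {"ge-0/0/0": 255, "ge-0/0/1": 255}

# Hypothetical weights: both links must fail before the group moves.
HALF_WEIGHTS = {"ge-0/0/0": 128, "ge-0/0/1": 128}

if __name__ == "__main__":
    print(should_fail_over(LAB_WEIGHTS, {"ge-0/0/0"}))
    print(should_fail_over(HALF_WEIGHTS, {"ge-0/0/0"}))
    print(should_fail_over(HALF_WEIGHTS, {"ge-0/0/0", "ge-0/0/1"}))
```

With a weight of 255 on each interface, as in my configuration, losing any one monitored link uses up the whole budget.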
{primary:node0}[edit interfaces]
root@srx1#
ge-0/0/0 {
    gigether-options {
        redundant-parent reth0;
    }
}
ge-2/0/0 {
    gigether-options {
        redundant-parent reth0;
    }
}
reth0 {
    redundant-ether-options {
        redundancy-group 1;
    }
    unit 0 {
        family inet {
            address 10.2.2.1/24;
        }
    }
}
{primary:node0}[edit interfaces]
root@srx1#
ge-0/0/1 {
    gigether-options {
        redundant-parent reth1;
    }
}
ge-2/0/1 {
    gigether-options {
        redundant-parent reth1;
    }
}
reth1 {
    redundant-ether-options {
        redundancy-group 2;
    }
    unit 0 {
        family inet {
            address 192.168.0.55/24;
        }
    }
}
Let’s see interface status once again:
{primary:node0}
root@srx1> show chassis cluster interfaces
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name    Child-interface    Status
fab0    fe-0/0/5           up
fab0
fab1    fe-2/0/5           up
fab1
Fabric link status: Up

Redundant-ethernet Information:
Name         Status      Redundancy-group
reth0        Up          1
reth1        Up          2

Interface Monitoring:
Interface         Weight    Status    Redundancy-group
ge-0/0/1          255       Up        1
ge-0/0/0          255       Up        1
ge-0/0/1          255       Up        2
ge-0/0/0          255       Up        2
I tested this by unplugging the cable connected to the ge-0/0/0 interface, and I immediately saw gratuitous ARP packets.
Manual Failover of RG:
If you need to do a manual failover, you can use the following:
root@srx1> request chassis cluster failover redundancy-group 1 node 1
node1:
--------------------------------------------------------------------------
Initiated manual failover for redundancy group 1

{primary:node0}
root@srx1> show chassis cluster status redundancy-group 1
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 1 , Failover count: 4
    node0                   100         secondary      yes      yes
    node1                   255         primary        yes      yes
The request command raises the priority of node1 to 255, so node1 becomes the primary node for RG1.
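Here is a minimal Python sketch of why that works. It is a deliberately simplified model that assumes preempt is enabled (as on RG1), so the node with the highest effective priority is primary. It ignores tie-breaking and timers, and the function names are my own.

```python
from typing import Dict, Optional

# Priority shown for node1 in the status output after the manual failover.
MANUAL_FAILOVER_PRIORITY = 255


def effective_priorities(configured: Dict[str, int],
                         manual_target: Optional[str] = None) -> Dict[str, int]:
    """Return per-node priorities, overriding the manual failover target."""
    result = dict(configured)
    if manual_target is not None:
        result[manual_target] = MANUAL_FAILOVER_PRIORITY
    return result


def elect_primary(priorities: Dict[str, int]) -> str:
    """With preempt enabled, the node with the highest priority is primary."""
    return max(priorities, key=lambda node: priorities[node])


CONFIGURED = {"node0": 100, "node1": 99}  # priorities from the RG1 config

if __name__ == "__main__":
    print(elect_primary(effective_priorities(CONFIGURED)))
    print(elect_primary(effective_priorities(CONFIGURED, manual_target="node1")))
```

Because 255 beats any configured priority, node1 stays primary for as long as the manual failover flag is set. In Junos the flag is cleared with "request chassis cluster failover reset redundancy-group 1", after which the configured priorities apply again.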
SESSION FLOW TEST
After I set up my lab, I started an HTTP session from PC1 towards the host 163.1.2.224 and displayed the session table:
{primary:node0}
root@srx1> show security flow session application http
node0:
--------------------------------------------------------------------------

Session ID: 468, Policy name: trust-permit/4, State: Active, Timeout: 1798, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 224, Bytes: 11840
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 477, Bytes: 674608
Total sessions: 1

node1:
--------------------------------------------------------------------------

Session ID: 138, Policy name: trust-permit/4, State: Backup, Timeout: 14394, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 0, Bytes: 0
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 0, Bytes: 0
Total sessions: 1
The session is synchronized between both nodes: it is Active on node0 and Backup on node1. Packets enter through the redundant interface reth0.0 and leave through reth1.0. Let’s check reth0.0.
root@srx1> show interfaces reth0.0 detail
  Logical interface reth0.0 (Index 67) (SNMP ifIndex 551) (Generation 132)
    Flags: SNMP-Traps 0x0 Encapsulation: ENET2
    Statistics        Packets        pps         Bytes          bps
    Bundle:
        Input :          5139         12        509791         5168
        Output:          8820         29      10986286       339776
    Link:
      ge-0/0/0.0
        Input :          4949         12        490939         5168
        Output:          8820         29      10986286       339776
      ge-2/0/0.0
        Input :           190          0         18852            0
        Output:             0          0             0            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      ge-0/0/0.0                 0           0            0            0
      ge-2/0/0.0                 0           0            0            0
You can immediately see that traffic is flowing through the child interface ge-0/0/0.0.
Let’s unplug the cable from ge-0/0/0 and look at the cluster status and the session table:
{primary:node0}
root@srx1> show chassis cluster status redundancy-group 1
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 1 , Failover count: 2
    node0                   0           secondary      yes      no
    node1                   99          primary        yes      no

{primary:node0}
root@srx1> show chassis cluster status redundancy-group 2
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 2 , Failover count: 2
    node0                   0           secondary      yes      no
    node1                   99          primary        yes      no

{primary:node0}
root@srx1> show security flow session application http
node0:
--------------------------------------------------------------------------

Session ID: 468, Policy name: trust-permit/4, State: Backup, Timeout: 1726, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 4705, Bytes: 244948
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 10749, Bytes: 15258112
Total sessions: 1

node1:
--------------------------------------------------------------------------

Session ID: 138, Policy name: trust-permit/4, State: Active, Timeout: 1800, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 926, Bytes: 48164
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 2238, Bytes: 3177960
Total sessions: 1
Unplugging the cable moved both redundancy groups to node1, because both RGs monitor ge-0/0/0. This is exactly what I wanted, since we don’t want asymmetric routing. The priority of node0 dropped to 0 in both groups, and the session table shows that node1 now holds the active session.
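If you want to check this condition from a script, here is a rough Python sketch that reads the text of "show chassis cluster status" and confirms that RG1 and RG2 are primary on the same node. It assumes the five-column row layout shown in this post (Junos 10.4); other releases may print additional columns, and the function names are my own.

```python
import re

# Node rows copied from the status output in this post (after the cable pull).
STATUS_OUTPUT = """\
Redundancy group: 1 , Failover count: 2
    node0                   0           secondary      yes      no
    node1                   99          primary        yes      no
Redundancy group: 2 , Failover count: 2
    node0                   0           secondary      yes      no
    node1                   99          primary        yes      no
"""


def primaries_by_group(status_text: str) -> dict:
    """Map each redundancy group number to the name of its primary node."""
    primaries = {}
    group = None
    for line in status_text.splitlines():
        header = re.match(r"\s*Redundancy group:\s*(\d+)", line)
        if header:
            group = int(header.group(1))
            continue
        fields = line.split()
        # Node rows look like: node1 99 primary yes no
        if (group is not None and len(fields) >= 5
                and fields[0].startswith("node") and fields[2] == "primary"):
            primaries[group] = fields[0]
    return primaries


def same_primary(primaries: dict, groups=(1, 2)) -> bool:
    """True when all listed redundancy groups are primary on one node."""
    return len({primaries[g] for g in groups}) == 1


if __name__ == "__main__":
    found = primaries_by_group(STATUS_OUTPUT)
    print(found, same_primary(found))
```

When the two groups end up on different nodes, traffic entering on reth0 and leaving on reth1 has to cross the fabric link, which is the situation I wanted to avoid.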
Redundant Interfaces
One thing to mention is the MAC addresses of redundant interfaces.
{primary:node0}
root@srx1> show interfaces reth0 | match Hardware
  Current address: 00:10:db:ff:10:00, Hardware address: 00:10:db:ff:10:00

{primary:node0}
root@srx1> show interfaces reth1 | match Hardware
  Current address: 00:10:db:ff:10:01, Hardware address: 00:10:db:ff:10:01
As you can see, there is a particular pattern in the assignment of MAC addresses: the two addresses differ only in the last octet, which matches the reth number.
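To make the pattern concrete, here is a small Python sketch that rebuilds both addresses. The layout is an assumption I inferred only from the two addresses above with cluster-id 1: a fixed 00:10:db:ff prefix, the cluster ID in the upper four bits of the fifth octet, and the reth number in the last octet. I have not verified what the lower four bits of the fifth octet are used for, so do not treat this as the official format.

```python
def reth_mac(cluster_id: int, reth_index: int) -> str:
    """Build a reth MAC address under the layout assumed in this post."""
    if not 0 <= cluster_id <= 15:
        raise ValueError("cluster_id must fit in four bits for this layout")
    if not 0 <= reth_index <= 255:
        raise ValueError("reth_index must fit in one octet")
    # Assumption: cluster ID sits in the high nibble, low nibble left at zero.
    fifth_octet = cluster_id << 4
    return f"00:10:db:ff:{fifth_octet:02x}:{reth_index:02x}"


if __name__ == "__main__":
    print(reth_mac(cluster_id=1, reth_index=0))  # reth0 in this lab
    print(reth_mac(cluster_id=1, reth_index=1))  # reth1 in this lab
```

If this layout holds, two clusters on the same Layer 2 segment with the same cluster-id would end up with identical reth MAC addresses, so it is worth giving each cluster a unique ID.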
Disable SRX Cluster
If you want to disable/remove the SRX cluster once you are done with it, here is how to do it:
{primary:node0}
root@srx1> set chassis cluster disable reboot
Successfully disabled chassis cluster. Going to reboot now
Great website, very informative, and it helped me with some tests before putting this SRX650 cluster into production. Cheers!
I have configured the cluster and did all the tests OK.
Now, looking at the network topology diagram, I would like to know: to create a second trust zone, would it be necessary to create a new redundancy group?
That’s awesome, man. Great work, thanks.
After reading your article, I really found the Junos Security book and CBT Nuggets more helpful.
Cheers and Thanks from Pakistan
Thank you, I am glad to hear this!
Could you please create a new topic about Quality of Service (QoS) on SRX?
Hi,
To be honest I really don’t like QoS:) It is also one of my weak areas. Maybe in the future but it isn’t in my agenda at the moment.
Love your topics!
How can I edit the secondary node through ssh? I tried to search everywhere but I couldn’t find anything!! I need to reboot a remote secondary node 1
Layla,
You should be able to connect to node1 via the address configured on its fxp0, e.g. the 10.1.1.2/24 address in my post. Or you can connect to the second node via the command
>request routing-engine login node 1
from the primary node.
If you want to connect via SSH, check my other post for backup-router config :
http://rtoodtoo.net/2013/02/18/error-the-routing-subsystem-is-not-running/
Cheers.
Excellent !
I have tried the simulator lab on virtual device
But when I use the command
root> set chassis cluster cluster-id 1 node 0 reboot
Later
root> configure
warning: Clustering enabled; using private edit
error: shared configuration database modified
Please temporarily use ‘configure shared’ to commit
outstanding changes in the shared database, exit,
and return to configuration mode using ‘configure’
Do you have suggestions for this case ?
Thank you !
Strange that you ran into this issue. As the output advises, you can try the “configure shared” command, then commit and see what happens.
I have not used it, because it is not available on the system. Does the simulator not support this topic?
Just out of curiosity, how are the Gateway device interfaces configured? On which interface is the gateway IP address 192.168.0.1 configured?
The gateway is just a normal default gateway config, David, e.g. “set routing-options static route 0/0 next-hop 192.168.0.1”.
You don’t configure it on the interface.