JNCIP-SEC [ 4 – High Availability ]

rtoodtoo jncip-sec September 9, 2011

Today’s post is about high availability, which is one of the topics of the JNCIP-SEC exam. It doesn’t cover everything, as it only reflects my own self-study. Let’s get started.
Test Topology

Test Platform: 2 x SRX 210 with JunOS 10.4R6.5
Before configuring my SRX210s for clustering, I must remove some configuration items to avoid post-configuration errors. On each SRX, do the following:

delete system host-name
delete vlans
delete interfaces vlan
delete security
delete interfaces ge-0/0/1 unit 0 family ethernet-switching
delete interfaces fe-0/0/2 unit 0 family ethernet-switching
delete interfaces fe-0/0/3 unit 0 family ethernet-switching
delete interfaces fe-0/0/4 unit 0 family ethernet-switching
delete interfaces fe-0/0/5 unit 0 family ethernet-switching
delete interfaces fe-0/0/6
delete interfaces fe-0/0/7

fe-0/0/6 becomes the management interface (fxp0) in the cluster, so its configuration must be removed.
fe-0/0/7 becomes the control link interface (fxp1), so its configuration must be removed as well.
After this operation make sure there is no ethernet-switching left:

root@srx1# show | match ethernet-switching | count
Count: 0 lines
[edit]
root@srx1#

Looks good so far!
Now enable clustering from operational mode and reboot each node:

root@srx1> set chassis cluster cluster-id 1 node 0 reboot
root@srx2> set chassis cluster cluster-id 1 node 1 reboot

After the reboot, the prompt on srx1 changes as the node moves through the cluster states:

{hold:node0}
root@srx1>
{secondary:node0}
root@srx1>
{primary:node0}
root@srx1>

Check cluster status:

root@srx1> show chassis cluster status
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 1
    node0                   1           primary        no       no
    node1                   1           secondary      no       no

Configure management interfaces on the first node only (srx1):

set groups node0 system host-name srx1
set groups node0 interfaces fxp0 unit 0 family inet address 10.1.1.1/24
set groups node1 system host-name srx2
set groups node1 interfaces fxp0 unit 0 family inet address 10.1.1.2/24
set apply-groups "${node}"

Configure the fabric links on the first node only (srx1):

delete interfaces fe-0/0/5.0
set interfaces fab0 fabric-options member-interfaces fe-0/0/5
set interfaces fab1 fabric-options member-interfaces fe-2/0/5
commit

Pay attention: on SRX210 models, the interfaces of node 1 are renumbered starting at FPC slot 2 once the cluster is formed. fe-0/0/5 is therefore port 5 on srx1, and fe-2/0/5 is port 5 on srx2.
After the commit, the configuration should sync to the srx2 node as well.
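
The renumbering can be sketched in a few lines of Python. This is only an illustration of the naming rule described above: the function name cluster_name and the constant NODE1_FPC_OFFSET are made up for this example, and the offset of 2 is simply what this SRX210 pair shows. Other SRX models use different offsets.

```python
import re

# Assumption: node 1 ports are shifted by 2 FPC slots, as observed on the
# SRX210 pair in this post (fe-0/0/5 on srx2 appears as fe-2/0/5).
NODE1_FPC_OFFSET = 2


def cluster_name(local_name: str, node: int, offset: int = NODE1_FPC_OFFSET) -> str:
    """Return the cluster-wide name of a port, given its standalone name."""
    match = re.fullmatch(r"([a-z]+)-(\d+)/(\d+)/(\d+)", local_name)
    if match is None:
        raise ValueError(f"not a physical interface name: {local_name!r}")
    if node not in (0, 1):
        raise ValueError("node must be 0 or 1")
    media, fpc, pic, port = match.groups()
    # Node 0 keeps its numbering; node 1 has the offset added to its FPC slot.
    new_fpc = int(fpc) + (offset if node == 1 else 0)
    return f"{media}-{new_fpc}/{pic}/{port}"


print(cluster_name("fe-0/0/5", node=0))  # fe-0/0/5
print(cluster_name("fe-0/0/5", node=1))  # fe-2/0/5
```

The same rule explains why the reth child interfaces later in this post are ge-0/0/0 on node 0 and ge-2/0/0 on node 1.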
Now check cluster interfaces status:

root@srx1> show chassis cluster interfaces
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
    Name    Child-interface    Status
    fab0    fe-0/0/5           up
    fab0
    fab1    fe-2/0/5           up
    fab1
Fabric link status: Up

REDUNDANCY GROUPS
A cluster without a redundancy group (RG) is useless. Let’s create redundancy groups and test them.
RG0 is used for the control plane, and RG1 and RG2 will be our service RGs for the redundant interfaces.


{primary:node0}[edit chassis cluster]
root@srx1# show
reth-count 2;
redundancy-group 0 {
    node 0 priority 100;
    node 1 priority 99;
}
redundancy-group 1 {
    node 0 priority 100;
    node 1 priority 99;
    preempt;
    interface-monitor {
        ge-0/0/0 weight 255;
        ge-0/0/1 weight 255;
    }
}
redundancy-group 2 {
    node 0 priority 100;
    node 1 priority 99;
    preempt;
    interface-monitor {
        ge-0/0/0 weight 255;
        ge-0/0/1 weight 255;
    }
}

RG1 has node 0 as its primary node because node 0 has the higher priority. We enable preempt so that, once the condition that caused a failover to the secondary node (node 1) is resolved, RG1 fails back to node 0. We also monitor ge-0/0/0 and ge-0/0/1 with a weight of 255 each, so if either of these links fails, RG1 fails over to node 1. RG2 is configured the same way, so both groups move together.
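
The interface monitoring logic can be modelled roughly as follows. This is a simplified sketch, not how Junos implements it: it assumes each redundancy group starts with a threshold of 255, that the weight of every failed monitored interface is subtracted from it, and that a node whose threshold reaches 0 gets priority 0. The class RedundancyGroup and its method names are invented for this example, and the election ignores hold-down timers and manual failover.

```python
from dataclasses import dataclass, field

THRESHOLD = 255  # assumed starting threshold of every redundancy group


@dataclass
class RedundancyGroup:
    priorities: dict                          # node id -> configured priority
    monitors: dict                            # node id -> {interface: weight}
    down: set = field(default_factory=set)    # monitored interfaces that are down

    def effective_priority(self, node: int) -> int:
        """Configured priority, or 0 once failed weights use up the threshold."""
        lost = sum(weight
                   for name, weight in self.monitors.get(node, {}).items()
                   if name in self.down)
        return 0 if THRESHOLD - lost <= 0 else self.priorities[node]

    def primary(self) -> int:
        # Simplification: with preempt enabled, the node with the highest
        # effective priority is primary.
        return max(self.priorities, key=self.effective_priority)


# RG1 from the configuration above: both monitored ports belong to node 0.
rg1 = RedundancyGroup(
    priorities={0: 100, 1: 99},
    monitors={0: {"ge-0/0/0": 255, "ge-0/0/1": 255}},
)
print(rg1.primary())                              # 0
rg1.down.add("ge-0/0/0")                          # unplug the cable
print(rg1.effective_priority(0), rg1.primary())   # 0 1
rg1.down.clear()                                  # plug it back in
print(rg1.primary())                              # 0 (preempt)
```

With a weight of 255 a single link failure is enough to trigger failover. Lower weights would require several monitored links to fail before the threshold is used up.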

{primary:node0}[edit interfaces]
root@srx1# show
ge-0/0/0 {
    gigether-options {
        redundant-parent reth0;
    }
}

ge-2/0/0 {
    gigether-options {
        redundant-parent reth0;
    }
}

reth0 {
    redundant-ether-options {
        redundancy-group 1;
    }
    unit 0 {
        family inet {
            address 10.2.2.1/24;
        }
    }
}

{primary:node0}[edit interfaces]
root@srx1# show
ge-0/0/1 {
    gigether-options {
        redundant-parent reth1;
    }
}
ge-2/0/1 {
    gigether-options {
        redundant-parent reth1;
    }
}
reth1 {
    redundant-ether-options {
        redundancy-group 2;
    }
    unit 0 {
        family inet {
            address 192.168.0.55/24;
        }
    }
}

Let’s see interface status once again:

{primary:node0}
root@srx1> show chassis cluster interfaces
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
    Name    Child-interface    Status
    fab0    fe-0/0/5           up
    fab0
    fab1    fe-2/0/5           up
    fab1
Fabric link status: Up

Redundant-ethernet Information:
    Name         Status      Redundancy-group
    reth0        Up          1
    reth1        Up          2

Interface Monitoring:
    Interface         Weight    Status    Redundancy-group
    ge-0/0/1          255       Up        1
    ge-0/0/0          255       Up        1
    ge-0/0/1          255       Up        2
    ge-0/0/0          255       Up        2

As a test, I unplugged the cable connected to the ge-0/0/0 interface and immediately saw gratuitous ARP packets. The new primary node sends them so that neighboring devices learn where the reth interface is now active.

Manual Failover of RG:

If you need to do a manual failover, you can use the following:

root@srx1> request chassis cluster failover redundancy-group 1 node 1
node1:
--------------------------------------------------------------------------
Initiated manual failover for redundancy group 1

{primary:node0}
root@srx1> show chassis cluster status redundancy-group 1
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 1 , Failover count: 4
    node0                   100         secondary      yes      yes
    node1                   255         primary        yes      yes

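The request command raises the priority of node1 to 255, so node1 becomes the primary node for RG1. The Manual failover column stays at yes until you clear it with “request chassis cluster failover reset redundancy-group 1”.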
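
If you want to check this state from a script, the status output is easy to parse. The sketch below works on the text shown above, pasted in as a string. The function parse_cluster_status and the dictionary layout are my own invention for this example, and the regular expressions only assume the column layout seen in this post (JunOS 10.4); other releases may print extra columns.

```python
import re

SAMPLE = """\
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 1 , Failover count: 4
    node0                   100         secondary      yes      yes
    node1                   255         primary        yes      yes
"""

RG_LINE = re.compile(r"Redundancy group:\s*(\d+)\s*,\s*Failover count:\s*(\d+)")
NODE_LINE = re.compile(r"\s+(node\d+)\s+(\d+)\s+(\S+)\s+(yes|no)\s+(yes|no)\s*$")


def parse_cluster_status(text: str) -> dict:
    """Return {rg_id: {"failover_count": int, "nodes": {name: {...}}}}."""
    groups = {}
    current = None
    for line in text.splitlines():
        rg = RG_LINE.match(line)
        if rg:
            current = {"failover_count": int(rg.group(2)), "nodes": {}}
            groups[int(rg.group(1))] = current
            continue
        node = NODE_LINE.match(line)
        if node and current is not None:
            name, priority, status, preempt, manual = node.groups()
            current["nodes"][name] = {
                "priority": int(priority),
                "status": status,
                "preempt": preempt == "yes",
                "manual_failover": manual == "yes",
            }
    return groups


status = parse_cluster_status(SAMPLE)
rg1 = status[1]
primary = [n for n, info in rg1["nodes"].items() if info["status"] == "primary"]
print(primary, rg1["nodes"]["node1"]["priority"])  # ['node1'] 255
```

A priority of 255 together with manual_failover set to True is a quick way to spot a redundancy group that was failed over by hand and not yet reset.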

SESSION FLOW TEST

After I set up my lab, I started an HTTP session from PC1 towards the host 163.1.2.224 and displayed the session table:

{primary:node0}
root@srx1> show security flow session application http
node0:
--------------------------------------------------------------------------

Session ID: 468, Policy name: trust-permit/4, State: Active, Timeout: 1798, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 224, Bytes: 11840
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 477, Bytes: 674608
Total sessions: 1

node1:
--------------------------------------------------------------------------

Session ID: 138, Policy name: trust-permit/4, State: Backup, Timeout: 14394, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 0, Bytes: 0
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 0, Bytes: 0
Total sessions: 1

Sessions are synchronized between both nodes, and the active one is on node0. Packets enter through the redundant interface reth0.0 and leave through reth1.0. Let’s check these interfaces.
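
The same check can be done programmatically on the session output above. This is a rough sketch only: flows_by_node is a name I made up, the sample is the output from this post pasted in as a string, and the parsing assumes the 10.4 output format shown here.

```python
import re

SAMPLE = """\
node0:
--------------------------------------------------------------------------

Session ID: 468, Policy name: trust-permit/4, State: Active, Timeout: 1798, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 224, Bytes: 11840
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 477, Bytes: 674608
Total sessions: 1

node1:
--------------------------------------------------------------------------

Session ID: 138, Policy name: trust-permit/4, State: Backup, Timeout: 14394, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 0, Bytes: 0
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 0, Bytes: 0
Total sessions: 1
"""


def flows_by_node(text: str) -> dict:
    """Map node name -> {forward flow ("src --> dst"): session state}."""
    result = {}
    node = None
    state = None
    for line in text.splitlines():
        node_match = re.fullmatch(r"(node\d+):", line.strip())
        if node_match:
            node = node_match.group(1)
            result[node] = {}
            continue
        state_match = re.search(r"State: (\w+)", line)
        if state_match:
            state = state_match.group(1)
            continue
        flow_match = re.match(r"\s+In: (\S+ --> \S+?);", line)
        if flow_match and node is not None:
            result[node][flow_match.group(1)] = state
    return result


flows = flows_by_node(SAMPLE)
for flow, state in flows["node0"].items():
    # A synchronized session shows up on the peer with the opposite role.
    print(flow, state, flows["node1"].get(flow))
```

For the sample above this reports the flow as Active on node0 and Backup on node1. A flow that is missing on the peer (None) would point to a session that has not been synchronized.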

root@srx1> show interfaces reth0.0 detail
  Logical interface reth0.0 (Index 67) (SNMP ifIndex 551) (Generation 132)
    Flags: SNMP-Traps 0x0 Encapsulation: ENET2
    Statistics        Packets        pps         Bytes          bps
    Bundle:
        Input :          5139         12        509791         5168
        Output:          8820         29      10986286       339776
    Link:
      ge-0/0/0.0
        Input :          4949         12        490939         5168
        Output:          8820         29      10986286       339776
      ge-2/0/0.0
        Input :           190          0         18852            0
        Output:             0          0             0            0
    Marker Statistics:   Marker Rx     Resp Tx   Unknown Rx   Illegal Rx
      ge-0/0/0.0                 0           0            0            0
      ge-2/0/0.0                 0           0            0            0

You can immediately see that traffic is flowing through the child interface ge-0/0/0.0.
Let’s unplug the cable from ge-0/0/0 and look at the cluster status and the session table:

{primary:node0}
root@srx1> show chassis cluster status redundancy-group 1
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 1 , Failover count: 2
    node0                   0           secondary      yes      no
    node1                   99          primary        yes      no

{primary:node0}
root@srx1> show chassis cluster status redundancy-group 2
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 2 , Failover count: 2
    node0                   0           secondary      yes      no
    node1                   99          primary        yes      no

{primary:node0}
root@srx1> show security flow session application http
node0:
--------------------------------------------------------------------------

Session ID: 468, Policy name: trust-permit/4, State: Backup, Timeout: 1726, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 4705, Bytes: 244948
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 10749, Bytes: 15258112
Total sessions: 1

node1:
--------------------------------------------------------------------------

Session ID: 138, Policy name: trust-permit/4, State: Active, Timeout: 1800, Valid
  In: 10.2.2.100/53437 --> 163.1.2.224/80;tcp, If: reth0.0, Pkts: 926, Bytes: 48164
  Out: 163.1.2.224/80 --> 192.168.0.55/61056;tcp, If: reth1.0, Pkts: 2238, Bytes: 3177960
Total sessions: 1

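Unplugging the cable moved both redundancy groups to node1, because both RGs monitor ge-0/0/0 with a weight of 255 and the priority of node0 dropped to 0 in each of them. This is exactly what I wanted, as we don’t want traffic to enter on one node and leave on the other. From the session table we can see that node1 now holds the active session.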

Redundant Interfaces

One thing to mention is the MAC addresses of redundant interfaces.

{primary:node0}
root@srx1> show interfaces reth0 | match Hardware
  Current address: 00:10:db:ff:10:00, Hardware address: 00:10:db:ff:10:00

{primary:node0}
root@srx1> show interfaces reth1 | match Hardware
  Current address: 00:10:db:ff:10:01, Hardware address: 00:10:db:ff:10:01

As can be seen, the MAC addresses follow a particular pattern: the two addresses differ only in the last octet, which matches the reth index (00 for reth0, 01 for reth1).
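
The pattern can be written down as a small function. Treat this strictly as a guess fitted to the two addresses above: I assume the fifth octet carries the cluster ID in its upper four bits (10 for cluster-id 1) and the sixth octet is the reth index. The function name reth_mac is invented, and the cluster ID encoding is not verified against other cluster IDs in this post.

```python
def reth_mac(cluster_id: int, reth_index: int) -> str:
    """Guess the virtual MAC of a reth interface from the observed pattern.

    Assumption (not confirmed by this post): fifth octet = cluster ID in
    the high nibble with a zero low nibble, sixth octet = reth index.
    """
    if not 1 <= cluster_id <= 15:
        raise ValueError("cluster_id must be between 1 and 15")
    if not 0 <= reth_index <= 255:
        raise ValueError("reth_index must fit in one octet")
    return "00:10:db:ff:%02x:%02x" % (cluster_id << 4, reth_index)


print(reth_mac(1, 0))  # 00:10:db:ff:10:00, as seen on reth0
print(reth_mac(1, 1))  # 00:10:db:ff:10:01, as seen on reth1
```

If the assumption holds, two clusters with the same cluster ID on one broadcast domain would end up with identical reth MAC addresses, which is a good reason to pick a different cluster ID per cluster.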

Disable SRX Cluster
If you want to disable the SRX cluster once you are done with it, here is how to do it:

{primary:node0}
root@srx1> set chassis cluster disable reboot
Successfully disabled chassis cluster. Going to reboot now

About: rtoodtoo

Worked for more than 10 years as a Network/Support Engineer and also interested in Python, Linux, Security and SD-WAN // JNCIE-SEC #223 / RHCE / PCNSE

14 thoughts on “JNCIP-SEC [ 4 – High Availability ]”

Derrick S says:

2011/11/28 at 1:04 am

Great website, very informative and helped me with some test before putting this srx650 cluster into productions. Cheers!

andres says:

2012/02/17 at 12:18 am

I have configured cluster, did all the tests ok.
Now, looking at the network topology diagram, I want to know: to create a second trust zone, would it be necessary to create a new redundancy group?

muhdsid says:

2013/03/26 at 5:23 pm

Thats awesomme man. Great work, Thanks

zeb says:

2013/07/23 at 11:08 pm

After reading your article , i really found junos security book and CBT nugges more helpful

Cheers and Thanks from Pakistan

1. rtoodtoo says:
  
  2013/07/24 at 9:53 am
  
  Thank you, I am glad to hear this!
  
Arifyanto says:

2014/01/10 at 9:16 am

Could you help to Create new topic about Quality of Service (QOS) on SRX, Please.

1. rtoodtoo says:
  
  2014/01/10 at 5:58 pm
  
  Hi,
  To be honest I really don’t like QoS:) It is also one of my weak areas. Maybe in the future but it isn’t in my agenda at the moment.
  
layla says:

2014/03/12 at 1:38 pm

Love your topics!

How can I edit the secondary node through ssh? I tried to search everywhere but I couldn’t find anything!! I need to reboot a remote secondary node 1

1. rtoodtoo says:
  
  2014/03/12 at 1:50 pm
  
  Layla,
  You should be able to connect to node1 via its configured fxp0 address, e.g. the 10.1.1.2/24 address in my post. Or you can connect to the second node via the command
  
  >request routing-engine login node 1
  
  from the primary node.
  
  If you want to connect via SSH, check my other post for backup-router config :
  
  http://rtoodtoo.net/2013/02/18/error-the-routing-subsystem-is-not-running/
  
  Cheers.
  
hiepnh says:

2015/01/31 at 3:59 pm

Excellent !
I have tried the simulator lab on virtual device
But when I use the command
root> set chassis cluster cluster-id 1 node 0 reboot
Later

root> configure
warning: Clustering enabled; using private edit
error: shared configuration database modified

Please temporarily use ‘configure shared’ to commit
outstanding changes in the shared database, exit,
and return to configuration mode using ‘configure’

Do you have suggestions for this case ?
Thank you !

1. rtoodtoo says:
  
  2015/01/31 at 7:31 pm
  
  Strange that you ran into this issue. As the output advises, you can try the “configure shared” command, then commit and see what happens.
  
  1. hiepnh says:
    
    2015/01/31 at 11:10 pm
    
    I have not used it, because it is not available on the system. simulator will not be with this topic ?
    
David says:

2015/03/25 at 8:40 am

Just out of curiosity, how are the Gateway device interfaces configured? On which interface is the gateway IP address 192.168.0.1 configured?

1. rtoodtoo says:
  
  2015/03/26 at 9:48 pm
  
  The gateway is just a normal default gateway configuration, David, e.g. “set routing-options static route 0/0 next-hop 192.168.0.1”.
  You don’t configure it on the interface.
  