Juniper Firefly (vSRX) 12.1X47 chassis cluster under VMware ESXi 5.5
vSRX cluster - Introduction
FireFly Perimeter, or vSRX, is the virtualized version of the Juniper SRX branch firewalls. The first version released by Juniper was based on Junos 12.1X46 and had limited feature support. At the time of this writing, the FireFly evaluation version is based on 12.1X47-D10.
Some of the features supported by Juniper vSRX in X47 are: ALGs, IPsec, MPLS, BGP, OSPF, IS-IS, IDP, AppFW and others. LACP in chassis cluster is not yet supported.
For a full list of features, see page 18 of the Firefly Perimeter Getting Started Guide for VMware at http://www.juniper.net/techpubs/en_US/firefly12.1x47/information-products/pathway-pages/security-virtual-perimeter-vmware-gs-guide-pwp.pdf.
In this post, we will configure a vSRX chassis cluster made of two 12.1X47-D10 FireFly virtual machines running on a VMware ESXi 5.5 hypervisor. Both will run on the same hypervisor, but be advised that this is not best practice: Juniper recommends deploying the two FireFly virtual machines on separate physical hosts, for obvious reasons.
Before we begin, you should know that building a cluster of vSRX firewalls requires you to understand and combine the Juniper chassis cluster reth interface concept with the functionality of the VMware standard vSwitch security policies.
Juniper chassis cluster - redundant ethernet (reth) interface concept
A redundant ethernet (reth) interface is a cluster-specific virtual interface assigned to a cluster redundancy group (only a one-to-one mapping is allowed). A reth interface normally has at least two members: a ge- or xe- interface from each of the cluster nodes. Any redundancy group in an SRX chassis cluster (vSRX or physical firewalls) can be active on only one node at a time, so for a given reth interface only the physical ge/xe member on the active node forwards traffic.
The reth interface does not use the MAC address of any of its member physical interfaces; it has its own virtual MAC address derived from the chassis cluster ID, the interface number and a few other parameters. For more details on the reth interface MAC address format in an SRX HA cluster, visit http://kb.juniper.net/InfoCenter/index?page=content&id=KB15911. The following image shows how a reth interface looks in a network:

So how does the SRX/FireFly cluster notify the neighboring switches which cluster node is active? When a redundancy group fails over from one node to the other, the new active node sends a number of gratuitous ARP messages on the reth member interfaces so the switches can update their MAC tables (the count is configurable per redundancy group).
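For example, assuming redundancy group 1 (configured later in this post), the number of gratuitous ARPs sent on a failover can be tuned like this (4 is, if I recall correctly, the default):
Code:
set chassis cluster redundancy-group 1 gratuitous-arp-count 4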
VMware standard vSwitch security and settings
Understanding the reth concept is key in the VMware ESXi world because, by default, ESXi allocates a MAC address to each VM vmnic and does not allow traffic coming from a VM with a different source MAC address. To change this behavior and successfully bring up a virtual SRX cluster, two settings need to be changed on the vSwitches involved in the setup:
MAC Address Changes: Accept
Forged Transmits: Accept
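If you prefer the ESXi shell over the vSphere client, a sketch of the equivalent esxcli command would be (the vSwitch name vSwitch1 is an assumption; use the name of your own vSwitch):
Code:
# accept MAC address changes and forged transmits on the vSwitch
esxcli network vswitch standard policy security set -v vSwitch1 --allow-mac-change true --allow-forged-transmits true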
SRX Cluster - other special interfaces
The control and fabric links are special devices in any SRX-based chassis cluster and have special requirements: an MTU of 9000 bytes and no interfering/garbage traffic, so they need their own private VLANs to ensure proper cluster operation. The control link carries the direct control plane communication between the two cluster nodes, whereas the fabric link carries the data plane communication. Juniper explains these at
http://www.juniper.net/documentation/en_US/junos11.4/topics/concept/chassis-cluster-fabric-links-understanding.html.
I will list below the steps required (in my case) to deploy the vSRX cluster in ESXi 5.5.
Step 1 - Setting up the control/fabric vSwitch in ESXi
The official guide uses two separate vSwitches for the control and fabric links. For testing, I am using only one, with special care: the vSwitch has an increased MTU of 9000 bytes and two port groups assigned to two different VLANs (10 and 11) to keep the links separated. Here is my ESXi vSwitch configuration:

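For reference, a rough esxcli equivalent of the above, assuming the vSwitch is named vSwitch1 (the FF-Control and FF-Fab port group names are the ones used later in this post):
Code:
# raise the vSwitch MTU for the control/fabric traffic
esxcli network vswitch standard set -v vSwitch1 -m 9000
# create the control and fabric port groups in their own VLANs
esxcli network vswitch standard portgroup add -p FF-Control -v vSwitch1
esxcli network vswitch standard portgroup set -p FF-Control --vlan-id 10
esxcli network vswitch standard portgroup add -p FF-Fab -v vSwitch1
esxcli network vswitch standard portgroup set -p FF-Fab --vlan-id 11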
Step 2 - Deploy the FireFly virtual machines using the OVF Template
In your vsphere client, go to File->Deploy OVF Template->Browse for the image (junos-vsrx-12.1X47-D10.4-domestic.ova) and hit NEXT.
You will have to choose a name for each FireFly VM. I will use “VM10-vSRX-1” and “VM11-vSRX-2”. The rest of the steps need no changes; just use the preset choices.
Step 3 - Setting up each vSRX's initial networking configuration
The following configuration requires a reboot of the VMs.
As with any branch SRX firewall, after the cluster is enabled and the box is rebooted, it stops behaving like an individual firewall and starts behaving like a cluster node. For FireFly this means that:
ge-0/0/0 - will be renamed to fxp0 - out of band management - mapped to the routing engine / control plane and does not go through flow (and security settings don’t apply to it).
ge-0/0/1 - will be renamed to fxp1 - cluster control link.
ge-0/0/2 - (optional) will be used as cluster fabric link.

The figure shows the following virtual machine network interfaces and their mapping to the vSRX ge- interfaces:
Network adapter 1 - ge-0/0/0 - fxp0 - connects to “VM Network” for the management LAN.
Network adapter 2 - ge-0/0/1 - fxp1 - connects to “FF-Control” port on vlan10.
Network adapter 3 - ge-0/0/2 - fab0/fab1 (fab0 on node0, fab1 on node1) - connects to “FF-Fab” port on vlan11.
Another thing to note is that, AFTER the cluster has been rebooted, node0 interfaces will be numbered on FPC 0 of the cluster (ge-0/0/x) and node1 interfaces on FPC 7 (ge-7/0/x).
Step 4 - vSRX: enabling the cluster and initial configuration
This step has to be performed at the vSRX console in the ESXi client.
Since two ge- interfaces will be renamed and remapped at Junos boot time, it is mandatory that the configuration DOES NOT contain ge-0/0/0, ge-0/0/1 or ge-0/0/2 (ge-0/0/2 will be used in my case as the fabric link, but another interface can be chosen).
Activate the root password (Junos does not allow any commits if the root user does not have a password)
From the console, log in as “root”; no password will be required. Start the “cli”, then go to configuration mode with “conf”.
Code:
[edit]
# set system root-authentication plain-text-password

Note that the root password has to be at least 6 characters in length.
Preparing the nodes - enable the chassis cluster
Warning: Console is still required.
From the ESXi console, run the following commands on each node individually (the vSRX-1 sequence assumes you are still in configuration mode):
vSRX-1:
Code:
root@# delete security
root@# delete interfaces
root@# commit and-quit
> set chassis cluster cluster-id 1 node 0 reboot
vSRX-2:
Code:
> conf
root@# delete security
root@# delete interfaces
root@# commit and-quit
> set chassis cluster cluster-id 1 node 1 reboot
Note that any reference to ge-0/0/0 or ge-0/0/1 left in the configuration will prevent the vSRX from acting as a cluster node after the reboot, because the PFE will not be initialized and the following error will be displayed:
Code:
root> show chassis fpc pic-status
error: Could not connect to node0 : No route to host
Step 5 - initial chassis cluster configuration (post reboot)
After the reboot, log in to both nodes and start the CLI; the prompts will be “{primary:node0}” and “{secondary:node1}”. This means they have negotiated mastership and are acting as one chassis.
It also means that the rest of the configuration can be deployed from one node only (preferably the primary, node0 in my case) and the changes will be committed to the other node as well.
From vSRX-1 console, commit the following configuration:
Code:
root@# set groups node0 system host-name vSRX-1
root@# set groups node0 interfaces fxp0 unit 0 family inet address 10.1.1.10/24
root@# set groups node1 system host-name vSRX-2
root@# set groups node1 interfaces fxp0 unit 0 family inet address 10.1.1.11/24
root@# set apply-groups ${node}
root@# set system services ssh protocol-version v2
root@# commit and-quit
The above creates a configuration group for each node, sets the hostname and the management interface IP address, instructs each node to use the group assigned to it, and commits.
Note: If any node ends up with a wrong hostname, check that the hostname is not also configured globally (a global “system host-name” statement). Remove it if it is.
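A quick sketch of that check and cleanup from configuration mode on the primary node:
Code:
root@# show system host-name
root@# delete system host-name
root@# commit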
Step 6 - Setting up the fabric interfaces
Code:
{primary:node0}[edit]
root@vSRX-1# set interfaces fab0 fabric-options member-interfaces ge-0/0/2
{primary:node0}[edit]
root@vSRX-1# set interfaces fab1 fabric-options member-interfaces ge-7/0/2
{primary:node0}[edit]
root@vSRX-1# commit
Step 7 - Checking the health state of the cluster interfaces so far
I’m interested in the control and fabric link status in the “> show chassis cluster interfaces” output. This output does not indicate physical connectivity, but rather liveliness between the nodes (keepalives and heartbeats being sent and received correctly by both nodes).
Code:
root@vSRX-1# run show chassis cluster interfaces
Control link status: Up
Control interfaces:
Index Interface Monitored-Status Internal-SA
0 fxp1 Up Disabled
Fabric link status: Up
Fabric interfaces:
Name Child-interface Status
(Physical/Monitored)
fab0 ge-0/0/2 Up / Up
fab0
fab1 ge-7/0/2 Up / Up
fab1
Redundant-pseudo-interface Information:
Name Status Redundancy-group
lo0 Up 0
Good. It is time for the rest of the article. Before continuing, let’s add one more port, ge-0/0/3, to each FireFly instance; it will be a trunk and will have access to the physical network, as its vSwitch will also contain a physical network card. !!! It is important that both virtual machines have IDENTICAL network configurations.
Step 8 - Add “Network adapter 4” or “ge-0/0/3”/“ge-7/0/3” on both virtual machines
Before starting, power off both FireFly VMs in order to add the extra interface.
On an ESXi vSwitch that is ready to accommodate the vSRX cluster forwarding interfaces and that has a physical NIC added, add one more port group. In my case this will be a trunk, as I will use subinterfaces (IFLs) that are VLAN-tagged.
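On a standard vSwitch, a port group configured with VLAN ID 4095 passes all VLANs (virtual guest tagging), which is what a trunk towards the vSRX needs. A rough esxcli sketch, assuming a data vSwitch named vSwitch2 and a port group named FF-Trunk (both names are my own):
Code:
# create a trunk port group that passes all VLANs to the vSRX
esxcli network vswitch standard portgroup add -p FF-Trunk -v vSwitch2
esxcli network vswitch standard portgroup set -p FF-Trunk --vlan-id 4095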


Power on both VMs.
Step 9 - Enable redundant ethernet interfaces, add redundancy group and wrap up
It is time to set the “reth-count” to reflect our deployment. I will set it to 2, but in fact I only use one reth interface (with ge-0/0/3 and ge-7/0/3 members).
I will create two redundancy groups, RG0 (routing engine) and RG1 (data plane), and will assign the reth interface to RG1.
Enable usage of reth interfaces in the vSRX cluster and configure RG0 and RG1
Code:
set chassis cluster reth-count 2
set chassis cluster redundancy-group 0 node 0 priority 200
set chassis cluster redundancy-group 0 node 1 priority 100
set chassis cluster redundancy-group 1 node 0 priority 200
set chassis cluster redundancy-group 1 node 1 priority 100
Configure reth0 with ge-0/0/3 and ge-7/0/3 as members
Code:
{primary:node0}[edit]
root@vSRX-1# show interfaces | display set | match reth0
set interfaces ge-0/0/3 gigether-options redundant-parent reth0
set interfaces ge-7/0/3 gigether-options redundant-parent reth0
set interfaces reth0 vlan-tagging
set interfaces reth0 redundant-ether-options redundancy-group 1
set interfaces reth0 unit 10 vlan-id 10
set interfaces reth0 unit 10 family inet address 172.16.10.1/24
{primary:node0}[edit]
root@vSRX-1# set security zones security-zone LAN1 interfaces reth0.10 host-inbound-traffic protocols ospf
{primary:node0}[edit]
root@vSRX-1# set security zones security-zone LAN1 interfaces reth0.10 host-inbound-traffic system-services ping
{primary:node0}[edit]
root@vSRX-1# commit
The reth0 interface will now terminate a trunk port (all VLANs) and reth0.10 will operate on VLAN 10 with an IP address of 172.16.10.1.
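Since reth0 is vlan-tagged, further subinterfaces can be stacked on it in the same way; a hypothetical example for a VLAN 20 segment (the VLAN ID and address are just placeholders):
Code:
set interfaces reth0 unit 20 vlan-id 20
set interfaces reth0 unit 20 family inet address 172.16.20.1/24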
Step 10 - Final checkups
Code:
{primary:node0}[edit]
root@vSRX-1# run show chassis cluster status
Monitor Failure codes:
CS Cold Sync monitoring FL Fabric Connection monitoring
GR GRES monitoring HW Hardware monitoring
IF Interface monitoring IP IP monitoring
LB Loopback monitoring MB Mbuf monitoring
NH Nexthop monitoring NP NPC monitoring
SP SPU monitoring SM Schedule monitoring
CF Config Sync monitoring
Cluster ID: 1
Node Priority Status Preempt Manual Monitor-failures
Redundancy group: 0 , Failover count: 1
node0 200 primary no no None
node1 100 secondary no no None
Redundancy group: 1 , Failover count: 1
node0 200 primary no no None
node1 100 secondary no no None
{primary:node0}[edit]
root@vSRX-1# run show chassis cluster interfaces
Control link status: Up
Control interfaces:
Index Interface Monitored-Status Internal-SA
0 fxp1 Up Disabled
Fabric link status: Up
Fabric interfaces:
Name Child-interface Status
(Physical/Monitored)
fab0 ge-0/0/2 Up / Up
fab0
fab1 ge-7/0/2 Up / Up
fab1
Redundant-ethernet Information:
Name Status Redundancy-group
reth0 Up 1
reth1 Down Not configured
Redundant-pseudo-interface Information:
Name Status Redundancy-group
lo0 Up 0
{primary:node0}[edit]
root@vSRX-1# run show interfaces reth0 terse
Interface Admin Link Proto Local Remote
reth0 up up
reth0.10 up up inet 172.16.10.1/24
reth0.32767 up up
{primary:node0}[edit]
root@vSRX-1# run ping 172.16.10.2
PING 172.16.10.2 (172.16.10.2): 56 data bytes
64 bytes from 172.16.10.2: icmp_seq=0 ttl=128 time=0.916 ms
64 bytes from 172.16.10.2: icmp_seq=1 ttl=128 time=5.446 ms
^C
--- 172.16.10.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.916/3.181/5.446/2.265 ms
{primary:node0}[edit]
root@vSRX-1# run show arp no-resolve
MAC Address Address Interface Flags
00:0c:29:f1:81:6a 10.1.1.11 fxp0.0 none
00:0c:29:2a:32:b0 10.1.1.54 fxp0.0 none
4c:96:14:12:02:30 30.17.0.2 fab0.0 permanent
4c:96:14:11:02:30 30.18.0.1 fab1.0 permanent
00:0c:29:f1:81:74 130.16.0.1 fxp1.0 none
00:0c:29:2a:32:ba 172.16.10.2 reth0.10 none
The above shows that the control and fabric links are up, that we can manage both nodes through the fxp0 management interface, and that the redundant interface (the one that forwards traffic) is up and can ping a host in its subnet (and learn its MAC address).
I will not enable interface monitoring for the reth0 member interfaces ge-0/0/3 and ge-7/0/3, as it does not make much sense in a virtual environment. However, ip-monitoring would make a lot of sense.
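A minimal sketch of what chassis cluster ip-monitoring could look like here, assuming 172.16.10.254 is a gateway worth tracking via reth0.10 and 172.16.10.3 is a free address the secondary node can use to source its probes (both addresses are hypothetical):
Code:
set chassis cluster redundancy-group 1 ip-monitoring global-weight 255
set chassis cluster redundancy-group 1 ip-monitoring global-threshold 100
set chassis cluster redundancy-group 1 ip-monitoring family inet 172.16.10.254 weight 100 interface reth0.10 secondary-ip-address 172.16.10.3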
Since it’s 2:30AM, I will let this rest and add further notes if they hit my head.