This blog was co-authored by Palanivenkatesan Murugan
Since the introduction of Virtual SAN Stretched Cluster, many current and potential Virtual SAN users have grappled with how to validate it, given the high cost and difficulty of setting up multiple networks across geographically distributed sites without impacting production environments. The purpose of this blog is to address these concerns.
Within the Virtual SAN Product Enablement team, we test and validate many different customer scenarios using various Business Critical Applications (BCAs) as well as industry standard test tools. We share these results through our Reference Architectures. In our testing, we do not normally have the luxury of setting up cross-site Virtual SAN Stretched Clusters. Instead, we create a simulated WAN environment where a Xorp Router is used to connect different VLANs and network latency is generated in a controlled manner. We would like to share our methodology for testing Virtual SAN Stretched Cluster.
Please note, this is only recommended for test and/or demonstration environments. We do NOT recommend using the Xorp Router in a production environment. These instructions can be used on an existing lab/test Virtual SAN or a new Virtual SAN. However, if you are using an existing Virtual SAN, please plan for some downtime during configuration.
Here is an example of our network diagram:
Steps:
- Install ESXi 6.0u1 on hosts
- Create four VLANs on the physical Ethernet switch: one for VMs and one each for Sites A, B and C, with multicast/IGMP enabled on the "Site" VLANs. The sites are simulated.
E.g. –
VMs – VLAN 100
Site A – VLAN 102 (Multicast/IGMP enabled)
Site B – VLAN 105 (Multicast/IGMP enabled)
Site C – VLAN 106 (Multicast/IGMP enabled)
- Create a VMware vSphere Distributed Switch (dvSwitch), then create four Distributed Port Groups using the corresponding VLANs: VMs, Site A, Site B and Site C. We use a dvSwitch rather than a standard switch because of its management features and single management point. Below is an example of our topology.
- Install a Linux VM with 2 vCPU, 4GB RAM, 20GB Disk, 4 NICs.
- Configure the Linux VM with the following NIC settings –
- Eth0 – dynamic or static IP, management NIC
- Eth1 – static IP, gateway address for VLAN Site A e.g. 192.168.102.253
- Eth2 – static IP, gateway address for VLAN Site B e.g. 192.168.105.253
- Eth3 – static IP, gateway address for VLAN Site C e.g. 192.168.106.253
- Download the Xorp Router software and SCons:
- http://www.xorp.org click on current release and download
- http://scons.org/ click on download (via SourceForge)
- Extract and install SCons, e.g. on RHEL/CentOS:
rpm -Uvh scons-2.4.1-1.noarch.rpm
For more information: http://xorp.run.montefiore.ulg.ac.be/latex2wiki/getting_started
- Extract and install Xorp, running the Xorp install with superuser/root privileges
e.g.: sudo scons install
5. Edit the config file multicast-config-v1.boot, entering the corresponding IPs for each eth"x" and vif"x".
Please see our example config file below.
6. Enable IPv4 forwarding. E.g. in /etc/sysctl.conf, change the "net.ipv4.ip_forward" value from 0 to 1
7. Add a xorp group and add the root user to it:
e.g. groupadd xorp
usermod -a -G xorp root
8. Disable Security-Enhanced Linux (SELinux) by editing the file "/etc/selinux/config", making sure that the SELINUX flag is set as follows: SELINUX=disabled
9. Disable firewall:
e.g. chkconfig iptables off
chkconfig ip6tables off
service iptables stop
service ip6tables stop
10. Reboot the Linux VM
11. After the reboot, start the Xorp process from the path where Xorp is installed:
e.g.: ./xorp_rtrmgr -b multicast-config-v1.boot -d
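Steps 6 and 8 above both come down to setting one key in a config file. As a minimal sketch, assuming a POSIX shell on the Linux VM, the helper below rewrites an existing entry or appends it if it is missing. The name set_key is made up for this example and is not part of Xorp or the distribution. It takes the file path as an argument so that it can be tried on a copy first.

```shell
# set_key FILE KEY LINE: replace the "KEY = ..." entry in FILE with LINE,
# or append LINE if FILE has no such entry. (Hypothetical helper name.)
set_key() {
    file=$1 key=$2 line=$3
    tmp=$(mktemp) || return 1
    # Escape dots in the key so they are matched literally.
    pat=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pat}[[:space:]]*=" "$file"; then
        # Rewrite via a temp file, then copy back so the original file's
        # ownership and permissions are kept.
        sed -E "s|^[[:space:]]*${pat}[[:space:]]*=.*|${line}|" "$file" > "$tmp" \
            && cat "$tmp" > "$file"
    else
        echo "$line" >> "$file"
    fi
    rm -f "$tmp"
}
```

Run as root, the two edits would then look like: set_key /etc/sysctl.conf net.ipv4.ip_forward 'net.ipv4.ip_forward = 1' and set_key /etc/selinux/config SELINUX 'SELINUX=disabled'. After the reboot in step 10, running sysctl net.ipv4.ip_forward should report the new value.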
- On each ESXi host, create static routes that use the Xorp IP of the host's own site as the gateway for the Virtual SAN VMkernel NIC. The gateway is used to route to the other sites.
Site A:
E.g. Host 1 SiteA
[root@ESXHOSTA1:~] esxcli network ip route ipv4 add --network 192.168.105.0/24 --gateway 192.168.102.253
[root@ESXHOSTA1:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.102.253
E.g. Host 2 SiteA
[root@ESXHOSTA2:~] esxcli network ip route ipv4 add --network 192.168.105.0/24 --gateway 192.168.102.253
[root@ESXHOSTA2:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.102.253
Site B:
E.g. Host 1 SiteB
[root@ESXHOSTB1:~] esxcli network ip route ipv4 add --network 192.168.102.0/24 --gateway 192.168.105.253
[root@ESXHOSTB1:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.105.253
E.g. Host 2 SiteB
[root@ESXHOSTB2:~] esxcli network ip route ipv4 add --network 192.168.102.0/24 --gateway 192.168.105.253
[root@ESXHOSTB2:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.105.253
Site C:
E.g. Witness Host SiteC
[root@WITNESS:~] esxcli network ip route ipv4 add --network 192.168.102.0/24 --gateway 192.168.106.253
[root@WITNESS:~] esxcli network ip route ipv4 add --network 192.168.105.0/24 --gateway 192.168.106.253
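The per-host commands above all follow one pattern: each host routes to the other two site subnets through the Xorp address on its own site. As a rough sketch, the function below prints the commands for a given site VLAN. It assumes the example addressing used in this post (192.168.<vlan>.0/24 per site, router at .253), and routes_for_site is an illustrative name only.

```shell
# Site VLAN IDs from the example in this post; adjust to your lab.
SITE_VLANS="102 105 106"

# routes_for_site VLAN: print the esxcli static-route commands for a host
# whose Virtual SAN VMkernel port is in the given site VLAN.
routes_for_site() {
    local_vlan=$1
    for vlan in $SITE_VLANS; do
        # A host needs no static route to its own subnet.
        [ "$vlan" = "$local_vlan" ] && continue
        echo "esxcli network ip route ipv4 add --network 192.168.${vlan}.0/24 --gateway 192.168.${local_vlan}.253"
    done
}
```

For example, routes_for_site 102 prints the two commands to paste on a Site A host. Afterwards, esxcli network ip route ipv4 list on that host shows the resulting routing table.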
- Validate network connectivity of the Virtual SAN Cluster:
Run vmkping from each host to the Virtual SAN VMkernel port of every other host.
E.g.: Site A host to Site B host
E.g. Site B host to Site A host
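With more than a couple of hosts, the connectivity check can be looped. The sketch below is one possible way to do it from the ESXi shell. The function name check_peers and the PING_CMD override are inventions of this example, and the interface name and peer addresses in the usage note are placeholders for your own values.

```shell
# check_peers VMK IP...: ping each peer Virtual SAN VMkernel address from VMK.
# PING_CMD defaults to vmkping (ESXi shell); override it to try the loop elsewhere.
check_peers() {
    vmk=$1; shift
    failed=0
    for ip in "$@"; do
        if ${PING_CMD:-vmkping} -I "$vmk" -c 3 "$ip" >/dev/null 2>&1; then
            echo "OK   $ip"
        else
            echo "FAIL $ip"
            failed=1
        fi
    done
    # Non-zero exit status if any peer was unreachable.
    return $failed
}
```

On a Site A host this might be called as check_peers vmk1 192.168.105.11 192.168.106.11, where vmk1 and the two addresses are hypothetical. Any FAIL line usually points back to a missing static route or to the router VM.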
- Configure Virtual SAN Stretched Cluster between the (3) sites.
- Run a Virtual SAN Health Check on the cluster:
There is an expected warning that can be ignored:
Now that the simulated cross-site network configuration for the Virtual SAN Stretched Cluster is all set, you can use the Linux native netem functionality to introduce any desired level of network latency.
To add latency, run the command: tc qdisc add dev ethX root netem delay XXXms
E.g.: [root@Netem-Xorp ~]# tc qdisc add dev eth3 root netem delay 200ms
To remove the latency, run the command: tc qdisc del dev ethX root netem
E.g.: [root@Netem-Xorp ~]# tc qdisc del dev eth3 root netem
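When stepping through several latency values, it can be convenient to wrap the two commands in one helper. This is only a sketch: set_latency and the TC override are names made up for this example. It uses tc qdisc replace rather than add, so the delay can be changed without deleting the qdisc first.

```shell
# set_latency DEV MS: apply MS milliseconds of netem delay to DEV (0 removes it).
# TC defaults to tc; set TC=echo to print the command instead of running it.
set_latency() {
    dev=$1 ms=$2
    if [ "$ms" -eq 0 ]; then
        ${TC:-tc} qdisc del dev "$dev" root netem
    else
        # "replace" adds the qdisc if absent and updates the delay if present.
        ${TC:-tc} qdisc replace dev "$dev" root netem delay "${ms}ms"
    fi
}
```

For instance, set_latency eth3 200 followed later by set_latency eth3 0. The active setting can be checked with tc qdisc show dev eth3. Keep in mind that netem delays packets leaving the interface, so a delay on eth3 is added once per round trip to that site.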
We have shared a way of simulating a multi-site WAN with deterministic latency control, so that Virtual SAN Stretched Cluster can be evaluated quickly and cost-effectively. You can study the performance impact under different latency settings, as well as validate Virtual SAN resiliency and availability under disk, disk group, host, and site failure conditions. More information can be found in the Virtual SAN Stretched Cluster Guide. Happy testing!
Example config file (multicast-config-v1.boot):
/*XORP Configuration File, v1.0*/
protocols {
fib2mrib {
disable: false
}
igmp {
disable: false
interface eth2 {
vif eth2 {
disable: false
version: 2
enable-ip-router-alert-option-check: false
query-interval: 125
query-last-member-interval: 1
query-response-interval: 10
robust-count: 2
}
}
interface eth3 {
vif eth3 {
disable: false
version: 2
enable-ip-router-alert-option-check: false
query-interval: 125
query-last-member-interval: 1
query-response-interval: 10
robust-count: 2
}
}
interface eth4 {
vif eth4 {
disable: false
version: 2
enable-ip-router-alert-option-check: false
query-interval: 125
query-last-member-interval: 1
query-response-interval: 10
robust-count: 2
}
}
}
pimsm4 {
disable: false
interface eth2 {
vif eth2 {
disable: false
dr-priority: 1
hello-period: 30
hello-triggered-delay: 5
}
}
interface eth3 {
vif eth3 {
disable: false
dr-priority: 1
hello-period: 30
hello-triggered-delay: 5
}
}
interface eth4 {
vif eth4 {
disable: false
dr-priority: 1
hello-period: 30
hello-triggered-delay: 5
}
}
interface "register_vif" {
vif "register_vif" {
disable: false
dr-priority: 1
hello-period: 30
hello-triggered-delay: 5
}
}
static-rps {
rp 203.0.113.1 {
group-prefix 224.0.0.0/4 {
rp-priority: 192
hash-mask-len: 30
}
}
}
bootstrap {
disable: false
}
}
}
fea {
unicast-forwarding4 {
disable: false
forwarding-entries {
retain-on-startup: true
retain-on-shutdown: true
}
}
}
interfaces {
interface lo {
description: "Loopback"
disable: false
discard: false
unreachable: false
management: false
default-system-config {
}
}
interface eth2 {
description: "vlan102"
disable: false
discard: false
unreachable: false
management: false
vif eth2 {
disable: false
address 192.168.102.253 {
prefix-length: 24
disable: false
}
}
}
interface eth3 {
description: "vlan105"
disable: false
discard: false
unreachable: false
management: false
vif eth3 {
disable: false
address 192.168.105.253 {
prefix-length: 24
disable: false
}
}
}
interface eth4 {
description: "vlan106"
disable: false
discard: false
unreachable: false
management: false
vif eth4 {
disable: false
address 192.168.106.253 {
prefix-length: 24
disable: false
}
}
}
}
plumbing {
mfea4 {
disable: false
interface eth2 {
vif eth2 {
disable: false
}
}
interface eth3 {
vif eth3 {
disable: false
}
}
interface eth4 {
vif eth4 {
disable: false
}
}
interface "register_vif" {
vif "register_vif" {
disable: false
}
}
}
}
rtrmgr {
config-directory: "/home/xorp/"
load-file-command: "fetch"
load-file-command-args: "-o"
load-ftp-command: "fetch"
load-ftp-command-args: "-o"
load-http-command: "fetch"
load-http-command-args: "-o"
load-tftp-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
load-tftp-command-args: ""
save-file-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
save-file-command-args: ""
save-ftp-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
save-ftp-command-args: ""
save-http-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
save-http-command-args: ""
save-tftp-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
save-tftp-command-args: ""
}