
Simulate a Delay to Test and Validate VSAN Stretched Cluster

This blog was co-authored by Palanivenkatesan Murugan


Since the inception of Virtual SAN Stretched Cluster, many current and potential Virtual SAN users have struggled to validate it. Setting up multiple networks across geographically distributed sites is costly and difficult, and it is hard to do without impacting production environments. The purpose of this blog is to address these concerns.

Within the Virtual SAN Product Enablement team, we test and validate many different customer scenarios using various Business Critical Applications (BCAs) as well as industry-standard test tools. We share these results through our Reference Architectures. In our testing, we do not normally have the luxury of setting up cross-site Virtual SAN Stretched Clusters. Instead, we create a simulated WAN environment in which a Linux VM running the Xorp router connects the different VLANs and network latency is generated in a controlled manner. We would like to share our methodology for testing Virtual SAN Stretched Cluster.

Please note that this is only recommended for test and/or demonstration environments. We do NOT recommend using the Xorp router in a production environment. These instructions can be used on an existing lab/test Virtual SAN or a new Virtual SAN. However, if you are using an existing Virtual SAN, please plan for some downtime during configuration.

Here is an example of our network diagram:

[Figure: network architecture diagram]

Steps:

  1. Install ESXi 6.0u1 on hosts
  2. Create four VLANs on the physical Ethernet switch: one for VMs and one each for the simulated Sites A, B, and C. Enable multicast/IGMP on the three site VLANs.

E.g. –

VMs – VLAN 100

Site A – VLAN 102 (Multicast/IGMP enabled)

Site B – VLAN 105 (Multicast/IGMP enabled)

Site C – VLAN 106 (Multicast/IGMP enabled)

  3. Create a VMware vSphere Distributed Switch (dvSwitch) with four Distributed Port Groups, one for each VLAN: VMs, Site A, Site B, and Site C. We use a dvSwitch rather than a standard switch for its management features and single point of management. Below is an example of our topology.

[Figure: dvSwitch summary]

[Figure: dvSwitch topology]

  4. Install a Linux VM with 2 vCPUs, 4 GB RAM, a 20 GB disk, and 4 NICs.
  5. Configure the Linux VM with the following NIC settings:
    1. eth0 – dynamic or static IP, management NIC
    2. eth1 – static IP, gateway address for the Site A VLAN, e.g. 192.168.102.253
    3. eth2 – static IP, gateway address for the Site B VLAN, e.g. 192.168.105.253
    4. eth3 – static IP, gateway address for the Site C VLAN, e.g. 192.168.106.253

[Figure: ifconfig output on the Linux VM]
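On RHEL/CentOS-style systems, static addresses like these are typically kept in /etc/sysconfig/network-scripts/ifcfg-ethX files. The sketch below only prints what such a file might contain for each site-facing NIC; it does not write anything. The helper name make_ifcfg is made up for illustration, and the file location and the /24 netmask are assumptions about your distribution and lab addressing. No GATEWAY line is emitted because the VM itself acts as the gateway on these VLANs.

```shell
#!/bin/sh
# Print an ifcfg-style file body for one router interface with a static IP.
# Usage: make_ifcfg DEVICE IPADDR [NETMASK]   (NETMASK defaults to a /24)
make_ifcfg() {
    printf 'DEVICE=%s\n' "$1"
    printf 'BOOTPROTO=none\n'      # static addressing, no DHCP
    printf 'ONBOOT=yes\n'          # bring the interface up at boot
    printf 'IPADDR=%s\n' "$2"
    printf 'NETMASK=%s\n' "${3:-255.255.255.0}"
}

# Preview the three site-facing interfaces using the example addresses.
for pair in eth1:192.168.102.253 eth2:192.168.105.253 eth3:192.168.106.253; do
    echo "# ifcfg-${pair%%:*}"
    make_ifcfg "${pair%%:*}" "${pair#*:}"
    echo
done
```

To apply it, the output for each interface could be redirected into the matching ifcfg file as root and the network service restarted.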

  6. Download the Xorp router software and SCons:
    1. http://www.xorp.org (click on the current release and download)
    2. http://scons.org/ (click on download, via SourceForge)
    3. Extract and install SCons, e.g. on RHEL/CentOS: rpm -Uvh scons-2.4.1-1.noarch.rpm
       For more information: http://xorp.run.montefiore.ulg.ac.be/latex2wiki/getting_started
    4. Extract and install Xorp. Run the Xorp install with superuser/root privileges:

   e.g.: sudo scons install

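 7. Edit the config file multicast-config-v1.boot, entering the corresponding IP address for each ethX interface and vifX.

   Please see our example config file below. Note that the example names its interfaces eth2, eth3, and eth4; substitute the interface names used on your VM.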

 8. Enable IPv4 forwarding, e.g. in /etc/sysctl.conf change the value of net.ipv4.ip_forward from 0 to 1.
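For readers who prefer to script this change, here is a minimal sketch. The function name enable_ip_forward is made up for illustration, and the demo runs against a scratch file rather than the live /etc/sysctl.conf. It rewrites an existing net.ipv4.ip_forward entry or appends one if the key is missing.

```shell
#!/bin/sh
# enable_ip_forward FILE: set net.ipv4.ip_forward to 1 in a sysctl-style file.
enable_ip_forward() {
    conf=$1
    if grep -q '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=' "$conf"; then
        # Key already present: rewrite its value, keeping every other line.
        sed 's/^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=.*/net.ipv4.ip_forward = 1/' \
            "$conf" > "$conf.tmp" && cat "$conf.tmp" > "$conf" && rm -f "$conf.tmp"
    else
        # Key absent: append it.
        echo 'net.ipv4.ip_forward = 1' >> "$conf"
    fi
}

# Demonstrate on a scratch file.
scratch=$(mktemp)
printf 'net.ipv4.ip_forward = 0\n' > "$scratch"
enable_ip_forward "$scratch"
cat "$scratch"
rm -f "$scratch"
```

On the router VM this would be run as root with /etc/sysctl.conf as the argument. Running sysctl -p afterwards loads the file immediately, and cat /proc/sys/net/ipv4/ip_forward should then print 1.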

 9. Add a xorp group and add the root user to it:

    e.g.: groupadd xorp
          usermod -a -G xorp root

 10. Disable Security-Enhanced Linux (SELinux) by editing the file /etc/selinux/config and making sure that the SELINUX flag is set as follows: SELINUX=disabled

 11. Disable the firewall:

    e.g.: chkconfig iptables off
          chkconfig ip6tables off
          service iptables stop
          service ip6tables stop

 12. Reboot the Linux VM.

 13. After the reboot, start the Xorp process from the path where Xorp is installed:

    e.g.: ./xorp_rtrmgr -b multicast-config-v1.boot -d

  14. On each ESXi host, create static routes for the Virtual SAN VMkernel NIC, using the Xorp router IP on the host's own site VLAN as the gateway. The gateway is used to route traffic to the other sites.

Site A:

E.g. Host 1, Site A

[root@ESXHOSTA1:~] esxcli network ip route ipv4 add --network 192.168.105.0/24 --gateway 192.168.102.253
[root@ESXHOSTA1:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.102.253

E.g. Host 2, Site A

[root@ESXHOSTA2:~] esxcli network ip route ipv4 add --network 192.168.105.0/24 --gateway 192.168.102.253
[root@ESXHOSTA2:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.102.253

Site B:

E.g. Host 1, Site B

[root@ESXHOSTB1:~] esxcli network ip route ipv4 add --network 192.168.102.0/24 --gateway 192.168.105.253
[root@ESXHOSTB1:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.105.253

E.g. Host 2, Site B

[root@ESXHOSTB2:~] esxcli network ip route ipv4 add --network 192.168.102.0/24 --gateway 192.168.105.253
[root@ESXHOSTB2:~] esxcli network ip route ipv4 add --network 192.168.106.0/24 --gateway 192.168.105.253

Site C:

E.g. Witness Host, Site C

[root@WITNESS:~] esxcli network ip route ipv4 add --network 192.168.102.0/24 --gateway 192.168.106.253
[root@WITNESS:~] esxcli network ip route ipv4 add --network 192.168.105.0/24 --gateway 192.168.106.253
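Since every host in a site gets the same pair of routes, the commands can be generated rather than typed by hand. The sketch below is only a convenience: it prints the esxcli lines for a site given its gateway and the remote networks, using the example addresses from this post. The helper name site_routes is made up for illustration.

```shell
#!/bin/sh
# Print the esxcli commands that add static routes from one site to the others.
# Usage: site_routes GATEWAY REMOTE_NET [REMOTE_NET ...]
site_routes() {
    gw=$1
    shift
    for net in "$@"; do
        echo "esxcli network ip route ipv4 add --network $net --gateway $gw"
    done
}

echo "# Site A hosts"
site_routes 192.168.102.253 192.168.105.0/24 192.168.106.0/24
echo "# Site B hosts"
site_routes 192.168.105.253 192.168.102.0/24 192.168.106.0/24
echo "# Site C (witness)"
site_routes 192.168.106.253 192.168.102.0/24 192.168.105.0/24
```

The printed lines can be pasted into each host's shell. Afterwards, esxcli network ip route ipv4 list on the host shows the resulting routing table.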

  15. Validate network connectivity of the Virtual SAN cluster:

Run vmkping from each host to the Virtual SAN VMkernel port of every other host.

E.g.: Site A host to Site B host

[Figure: vmkping output from a Site A host]

E.g.: Site B host to Site A host

[Figure: vmkping output from a Site B host]
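With several hosts, checking every pair by hand gets tedious. A small loop along these lines could be run in each host's shell. This is a sketch under assumptions: the function name check_peers, the VMkernel interface name vmk1, and the peer addresses in the usage comment are all hypothetical and need to match your lab. The PING_CMD variable exists only so the loop can be dry-run on a machine without vmkping.

```shell
#!/bin/sh
# check_peers VMK PEER_IP...: ping every peer through the given VMkernel
# interface and report which ones fail. Returns the number of failed peers.
# PING_CMD defaults to vmkping; override it to dry-run elsewhere.
check_peers() {
    vmk=$1
    shift
    failed=0
    for peer in "$@"; do
        if ${PING_CMD:-vmkping} -I "$vmk" -c 3 "$peer" >/dev/null 2>&1; then
            echo "ok   $peer"
        else
            echo "FAIL $peer"
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}

# Example call from a Site A host (interface and addresses are hypothetical):
#   check_peers vmk1 192.168.105.11 192.168.105.12 192.168.106.11
```

A FAIL line for a remote site usually points back at the static routes on that host or at forwarding on the router VM.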

  16. Configure the Virtual SAN Stretched Cluster across the three sites.
  17. Run a Virtual SAN Health Check on the cluster:

There is an expected warning that can be ignored:

[Figure: Virtual SAN Health Check result showing the expected warning]

Now that the simulated cross-site network configuration for the Virtual SAN Stretched Cluster is all set, you can use the native Linux netem functionality to introduce any desired level of network latency.

To add latency, run the command: tc qdisc add dev ethX root netem delay XXXms

E.g.: [root@Netem-Xorp ~]# tc qdisc add dev eth3 root netem delay 200ms

[Figure: output after adding the netem delay]

To remove the latency, run the command: tc qdisc del dev ethX root netem

E.g.: [root@Netem-Xorp ~]# tc qdisc del dev eth3 root netem

[Figure: output after removing the netem delay]
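One detail worth keeping in mind when choosing values: a netem qdisc attached this way delays only packets leaving that interface. A round trip between two sites therefore picks up the delay configured on both site-facing router interfaces, one in each direction. The sketch below splits a target added round-trip time evenly across two interfaces and prints the tc commands without running them. The helper name rtt_cmds is made up for illustration, and the mapping of eth1 to Site A and eth2 to Site B follows the NIC list in the steps above, so adjust it if your VM names its interfaces differently.

```shell
#!/bin/sh
# rtt_cmds RTT_MS DEV_A DEV_B: print tc commands that add RTT_MS of round-trip
# latency between the two sites behind DEV_A and DEV_B, half in each direction.
# "replace" creates the qdisc if it is missing and updates it otherwise.
rtt_cmds() {
    half=$(( $1 / 2 ))
    for dev in "$2" "$3"; do
        echo "tc qdisc replace dev $dev root netem delay ${half}ms"
    done
}

# 100 ms of added round-trip time between Site A (eth1) and Site B (eth2).
rtt_cmds 100 eth1 eth2
```

Note that a delay on a site's interface applies to all traffic routed out toward that site, including traffic coming from the witness. The active settings can be inspected with tc qdisc show dev ethX.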

We have shared a way of simulating a multi-site WAN with deterministic latency control, so that Virtual SAN Stretched Cluster can be evaluated quickly and cost-effectively. You can study the performance impact under different latency settings, as well as validate Virtual SAN resiliency and availability under disk, disk group, host, and site failure conditions. More information can be found in the Virtual SAN Stretched Cluster Guide. Happy testing!

 

Example multicast boot config file (multicast-config-v1.boot):

/*XORP Configuration File, v1.0*/
protocols {
    fib2mrib {
        disable: false
    }
    igmp {
        disable: false
        interface eth2 {
            vif eth2 {
                disable: false
                version: 2
                enable-ip-router-alert-option-check: false
                query-interval: 125
                query-last-member-interval: 1
                query-response-interval: 10
                robust-count: 2
            }
        }
        interface eth3 {
            vif eth3 {
                disable: false
                version: 2
                enable-ip-router-alert-option-check: false
                query-interval: 125
                query-last-member-interval: 1
                query-response-interval: 10
                robust-count: 2
            }
        }
        interface eth4 {
            vif eth4 {
                disable: false
                version: 2
                enable-ip-router-alert-option-check: false
                query-interval: 125
                query-last-member-interval: 1
                query-response-interval: 10
                robust-count: 2
            }
        }
    }
    pimsm4 {
        disable: false
        interface eth2 {
            vif eth2 {
                disable: false
                dr-priority: 1
                hello-period: 30
                hello-triggered-delay: 5
            }
        }
        interface eth3 {
            vif eth3 {
                disable: false
                dr-priority: 1
                hello-period: 30
                hello-triggered-delay: 5
            }
        }
        interface eth4 {
            vif eth4 {
                disable: false
                dr-priority: 1
                hello-period: 30
                hello-triggered-delay: 5
            }
        }
        interface "register_vif" {
            vif "register_vif" {
                disable: false
                dr-priority: 1
                hello-period: 30
                hello-triggered-delay: 5
            }
        }
        static-rps {
            rp 203.0.113.1 {
                group-prefix 224.0.0.0/4 {
                    rp-priority: 192
                    hash-mask-len: 30
                }
            }
        }
        bootstrap {
            disable: false
        }
    }
}
fea {
    unicast-forwarding4 {
        disable: false
        forwarding-entries {
            retain-on-startup: true
            retain-on-shutdown: true
        }
    }
}
interfaces {
    interface lo {
        description: "Loopback"
        disable: false
        discard: false
        unreachable: false
        management: false
        default-system-config {
        }
    }
    interface eth2 {
        description: "vlan102"
        disable: false
        discard: false
        unreachable: false
        management: false
        vif eth2 {
            disable: false
            address 192.168.102.253 {
                prefix-length: 24
                disable: false
            }
        }
    }
    interface eth3 {
        description: "vlan105"
        disable: false
        discard: false
        unreachable: false
        management: false
        vif eth3 {
            disable: false
            address 192.168.105.253 {
                prefix-length: 24
                disable: false
            }
        }
    }
    interface eth4 {
        description: "vlan106"
        disable: false
        discard: false
        unreachable: false
        management: false
        vif eth4 {
            disable: false
            address 192.168.106.253 {
                prefix-length: 24
                disable: false
            }
        }
    }
}
plumbing {
    mfea4 {
        disable: false
        interface eth2 {
            vif eth2 {
                disable: false
            }
        }
        interface eth3 {
            vif eth3 {
                disable: false
            }
        }
        interface eth4 {
            vif eth4 {
                disable: false
            }
        }
        interface "register_vif" {
            vif "register_vif" {
                disable: false
            }
        }
    }
}
rtrmgr {
    config-directory: "/home/xorp/"
    load-file-command: "fetch"
    load-file-command-args: "-o"
    load-ftp-command: "fetch"
    load-ftp-command-args: "-o"
    load-http-command: "fetch"
    load-http-command-args: "-o"
    load-tftp-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
    load-tftp-command-args: ""
    save-file-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
    save-file-command-args: ""
    save-ftp-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
    save-ftp-command-args: ""
    save-http-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
    save-http-command-args: ""
    save-tftp-command: "sh -c 'echo Not implemented 1>&2 && exit 1'"
    save-tftp-command-args: ""
}
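The example file above names its interfaces eth2, eth3, and eth4, while the NIC list in the steps uses eth1, eth2, and eth3 for Sites A, B, and C. If your VM follows the latter naming, a filter along these lines could shift the names in one pass instead of editing every stanza by hand. This is a sketch: the function name rename_ifaces is made up for illustration, and it touches only lines that open an interface or vif block.

```shell
#!/bin/sh
# rename_ifaces: read a XORP config on stdin and shift eth2/eth3/eth4 down to
# eth1/eth2/eth3 on "interface" and "vif" lines. The rules run in order on
# each line, so eth2 is renamed before any eth3 line becomes eth2.
rename_ifaces() {
    sed -E \
        -e 's/^([[:space:]]*(interface|vif)) eth2 \{/\1 eth1 {/' \
        -e 's/^([[:space:]]*(interface|vif)) eth3 \{/\1 eth2 {/' \
        -e 's/^([[:space:]]*(interface|vif)) eth4 \{/\1 eth3 {/'
}

# Small demonstration on three sample lines.
printf 'interface eth2 {\n    vif eth2 {\ninterface eth4 {\n' | rename_ifaces
```

Applied to the real file, this might look like rename_ifaces < multicast-config-v1.boot > renamed.boot, where renamed.boot is a hypothetical output name. The description and address lines are left alone, so the VLAN labels and gateway IPs stay attached to the same stanzas.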