The Cross-VC NSX feature, introduced in VMware NSX 6.2, allows for NSX logical networking and security support across multiple vCenters. Logical switches (LS), distributed logical routers (DLR), and the distributed firewall (DFW) can now be deployed across multiple vCenter domains. These Cross-VC NSX objects are called universal objects. Universal objects are similar to their local-scope counterparts except that they have global, or universal, scope, meaning they can span multiple vCenter instances. With Cross-VC NSX functionality, in addition to the prior local-scope single-vCenter objects, users can implement Universal Logical Switches (ULS), Universal Distributed Logical Routers (UDLR), and Universal DFW (UDFW) across a multi-vCenter environment, whether within a single data center site or across multiple data center sites. In this post we’ll take a look at how this is done.
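
To make the universal-object idea concrete, the sketch below creates a ULS through the NSX Manager REST API. It is a minimal illustration, not a definitive procedure: the hostname and credentials are placeholders, the universal transport zone scope ID (shown here as universalvdnscope) should be verified against your environment, and in Cross-VC NSX the call must be made against the Primary NSX Manager.

```python
# Minimal sketch: creating a Universal Logical Switch (ULS) via the NSX-v
# REST API. Hostname, credentials, and the universal transport zone scope
# ID are placeholder assumptions; verify them in your own environment.
import requests

NSX_MGR = "https://nsxmgr-paloalto.example.com"  # Primary NSX Manager

uls_spec = """
<virtualWireCreateSpec>
  <name>Universal-Web-ULS</name>
  <tenantId>tenant-1</tenantId>
</virtualWireCreateSpec>
"""

resp = requests.post(
    f"{NSX_MGR}/api/2.0/vdn/scopes/universalvdnscope/virtualwires",
    auth=("admin", "example-password"),
    headers={"Content-Type": "application/xml"},
    data=uls_spec,
    verify=False,  # lab convenience only; use valid certificates in production
)
resp.raise_for_status()
print("Created logical switch:", resp.text)  # response body is the new virtualwire ID
```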

The benefits of supporting NSX networking and security across multiple vCenter domains, as shown in Figure 1 below, become immediately clear. Logical networking and security can be enabled for application workloads that span multiple vCenter domains or physical locations. For instance, VMs can now vMotion across vCenter boundaries with consistent security policy enforcement and without manual modification or re-provisioning of networking and security services. In essence, NSX control and automation expand across vCenter boundaries, whether within or across data centers.

Figure 1: Cross-VC NSX Deployed Across Three Sites

As before, NSX 6.2 maintains a 1:1 relationship between NSX Manager and vCenter Server. With Cross-VC NSX, multiple vCenter Servers are supported, but each still pairs 1:1 with its own NSX Manager: one NSX Manager takes the primary role and the rest take a secondary role. After the primary role is assigned to the first NSX Manager, additional NSX Managers can be registered as secondary. Up to eight NSX Manager/vCenter pairs are supported, with one NSX Manager being primary.
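
For those driving this through the API rather than the vSphere Web Client, the sketch below queries and sets the Cross-VC role of an NSX Manager. Treat the endpoint paths, hostnames, and credentials as assumptions to verify against the NSX for vSphere API guide for your release.

```python
# Minimal sketch: querying and assigning the Cross-VC NSX role of an NSX
# Manager via its REST API. Endpoints and credentials are assumptions to
# verify against the NSX for vSphere API guide for your release.
import requests

AUTH = ("admin", "example-password")

def get_role(nsx_mgr: str) -> str:
    """Return the manager's role, e.g. PRIMARY, SECONDARY, or STANDALONE."""
    r = requests.get(f"{nsx_mgr}/api/2.0/universalsync/configuration/role",
                     auth=AUTH, verify=False)
    r.raise_for_status()
    return r.text

def set_as_primary(nsx_mgr: str) -> None:
    """Promote a standalone NSX Manager to the primary role."""
    r = requests.post(f"{nsx_mgr}/api/2.0/universalsync/configuration/role",
                      params={"action": "set-as-primary"},
                      auth=AUTH, verify=False)
    r.raise_for_status()

set_as_primary("https://nsxmgr-paloalto.example.com")
print(get_role("https://nsxmgr-paloalto.example.com"))  # expect PRIMARY
```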

The Primary NSX Manager is used to deploy the Universal Controller Cluster (UCC) into its local vCenter inventory, providing the control plane for the Cross-VC NSX environment. The Secondary NSX Managers do not deploy their own controller clusters; instead, each vCenter domain/site and its respective Secondary NSX Manager use the UCC at the primary site as the control plane for both local and universal objects. Up to three controllers are supported, and the UCC must be deployed into the inventory of the vCenter managed by the Primary NSX Manager.
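
As an illustration of the UCC deployment step, the sketch below posts one controller spec to the Primary NSX Manager; repeated three times with unique names, it yields the three-node cluster. All of the vSphere object IDs (IP pool, cluster, datastore, port group) are placeholders from a hypothetical environment; look the real ones up in your inventory.

```python
# Minimal sketch: deploying one node of the Universal Controller Cluster via
# POST /api/2.0/vdn/controller on the Primary NSX Manager. Every ID below is
# a placeholder; look the real ones up in your vCenter/NSX inventory.
import requests

NSX_MGR = "https://nsxmgr-paloalto.example.com"  # must be the Primary NSX Manager

controller_spec = """
<controllerSpec>
  <name>ucc-node-1</name>
  <ipPoolId>ipaddresspool-1</ipPoolId>
  <resourcePoolId>domain-c7</resourcePoolId>
  <datastoreId>datastore-11</datastoreId>
  <networkId>dvportgroup-20</networkId>
  <password>Controller-Example-Passw0rd</password>
</controllerSpec>
"""

resp = requests.post(
    f"{NSX_MGR}/api/2.0/vdn/controller",
    auth=("admin", "example-password"),
    headers={"Content-Type": "application/xml"},
    data=controller_spec,
    verify=False,  # lab convenience only
)
resp.raise_for_status()
print("Deployment job:", resp.text)  # returns a job ID to poll for completion
```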

As shown below in Figure 2, the Primary NSX Manager uses the Universal Synchronization Service (USS) to replicate only the universal objects to the Secondary NSX Managers. Note that the UCC resides only at the site of the Primary NSX Manager.

Figure 2: USS on Primary NSX Manager Replicates Universal Objects to Secondary NSX Managers

Cross-VC NSX also enhances NSX multi-site deployments. As shown in the example in Figure 3 below, NSX, leveraging Cross-VC NSX functionality, is deployed across two sites. A separate vCenter domain is used for management, where the respective vCenters and NSX Managers for site 1 and site 2 are deployed; additionally, each site has its own vCenter deployed locally. Also note that in this design a single Universal Control VM is deployed at site 1, and all workloads at both sites use site 1 for egress, or North/South, traffic.

iBGP is used between the Universal Distributed Logical Router (UDLR) and the Edge Services Gateways (ESGs), and eBGP is used between the ESGs and the ToR switches/routers; OSPF could have been used instead. In this design, routing metrics are used to control ingress/egress traffic: setting BGP weight on the UDLR influences which route workload traffic takes.
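
To make the weight mechanics concrete, here is a toy model (not NSX code) of how BGP weight drives path selection: weight is local to the router that sets it, higher values win, and it is evaluated before any other BGP attribute.

```python
# Toy model of BGP weight-based path selection on the UDLR. Weight is local
# to the router, higher wins, and it is compared before attributes such as
# AS-path length; this is standard BGP behavior, not NSX-specific code.
from dataclasses import dataclass

@dataclass
class BgpPath:
    next_hop: str  # the advertising ESG
    weight: int    # locally configured preference

def best_path(paths: list[BgpPath]) -> BgpPath:
    # Highest weight wins; a tie would fall through to later BGP attributes.
    return max(paths, key=lambda p: p.weight)

paths = [
    BgpPath(next_hop="ESG-1 (site 1)", weight=200),  # preferred egress path
    BgpPath(next_hop="ESG-2 (site 2)", weight=100),  # used only after failover
]
print(best_path(paths).next_hop)  # -> ESG-1 (site 1); ESG-2 wins once ESG-1's route is withdrawn
```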

The Local Egress feature was also introduced in NSX 6.2 and can be used to implement localized egress or North/South at each site; this alternative deployment model can be useful for specific scenarios and will be discussed in a later post.

In this example, for simplicity of demonstration, only one ESG per site is used with both ESGs doing ECMP northbound. In a production environment multiple ESGs should be deployed at each site for additional resiliency.
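
For intuition on what ECMP buys here, the sketch below shows the general idea of flow-based hashing: each flow is pinned to one of the equal-cost next hops so its packets never reorder. The hash NSX actually computes is internal to the platform; this is only an illustration.

```python
# Illustrative flow-based ECMP hashing: a hash over the source/destination
# IP pair pins each flow to one equal-cost next hop. The actual hash NSX
# uses internally may differ; this only demonstrates the concept.
import zlib

def pick_next_hop(src_ip: str, dst_ip: str, next_hops: list[str]) -> str:
    flow_key = f"{src_ip}->{dst_ip}".encode()
    return next_hops[zlib.crc32(flow_key) % len(next_hops)]

esgs = ["ESG-1", "ESG-2"]  # equal-cost northbound next hops
print(pick_next_hop("10.10.10.5", "192.168.100.7", esgs))  # same flow, same ESG
print(pick_next_hop("10.10.10.6", "192.168.100.7", esgs))  # may land on the other ESG
```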

Figure 3: Example Cross-VC NSX Deployment

Also deployed, but not shown in Figure 3 above, is an external Platform Services Controller (PSC), introduced in vSphere 6.0; the PSC decouples infrastructure services such as Single Sign-On (SSO) from vCenter. Both vCenters connect to the external PSC, which enables Enhanced Linked Mode, allowing multiple vCenters to be managed from one GUI as shown below in Figure 4; it also allows easy vMotion via the GUI from one vCenter domain to the other. For more information on PSC deployment, see the VMware vCenter Server 6.0 Deployment Guide.

Figure 4: PSC Introduced in vSphere 6.0 Allows for Enhanced Linked Mode

Figure 5 below displays both NSX Managers in the setup: the NSX Manager at site 1, Palo Alto, is the Primary NSX Manager, and the NSX Manager at site 2, San Jose, is the Secondary NSX Manager. In the NSX Controller nodes section, the IP addresses show that only three NSX Controllers exist, all managed by the site 1 NSX Manager; the same controller configuration appears for both sites because both use the same Universal Controller Cluster.

Figure 5: Cross-VC NSX Deployed Across Two Sites with Primary and Secondary NSX Manager Configured

Figure 6 demonstrates the vMotion of a VM, in this case the ‘Web Universal’ VM, from site 1 to site 2 across vCenter boundaries.

Figure 6: vMotioning ‘Web Universal’ VM from Site 1, Palo Alto, to Site 2, San Jose

Figure 7 below shows the final confirmation before the vMotion of the ‘Web Universal’ VM from site 1 to site 2.

Figure 7: Confirming vMotion of ‘Web Universal’ VM from Site 1, Palo Alto, to Site 2, San Jose
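
For completeness, the same cross-vCenter vMotion can be driven programmatically; the hedged pyVmomi sketch below mirrors what the Web Client wizard in Figures 6 and 7 does. Hostnames, credentials, object names, and the SSL thumbprint are placeholders, and because the VM stays attached to the Universal Web ULS at both sites, no network re-mapping is included.

```python
# Hedged sketch: cross-vCenter vMotion with pyVmomi. All names, credentials,
# and the thumbprint below are placeholders for a hypothetical environment.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab convenience only

si_src = SmartConnect(host="vc-paloalto.example.com", user="administrator@vsphere.local",
                      pwd="example-password", sslContext=ctx)
si_dst = SmartConnect(host="vc-sanjose.example.com", user="administrator@vsphere.local",
                      pwd="example-password", sslContext=ctx)

def find(si, vimtype, name):
    """Look up a managed object by name in a vCenter inventory."""
    view = si.content.viewManager.CreateContainerView(si.content.rootFolder, [vimtype], True)
    try:
        return next(obj for obj in view.view if obj.name == name)
    finally:
        view.Destroy()

vm = find(si_src, vim.VirtualMachine, "Web Universal")

# The service locator tells the source vCenter where the destination vCenter lives.
spec = vim.vm.RelocateSpec(
    host=find(si_dst, vim.HostSystem, "esxi-sj-01.example.com"),
    pool=find(si_dst, vim.ClusterComputeResource, "SJ-Compute").resourcePool,
    datastore=find(si_dst, vim.Datastore, "sj-datastore-01"),
    folder=find(si_dst, vim.Folder, "vm"),
    service=vim.ServiceLocator(
        url="https://vc-sanjose.example.com",
        instanceUuid=si_dst.content.about.instanceUuid,
        sslThumbprint="AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD",
        credential=vim.ServiceLocatorNamePassword(
            username="administrator@vsphere.local", password="example-password"),
    ),
)

# No NIC re-mapping needed: the Universal Web ULS port group exists at both sites.
task = vm.RelocateVM_Task(spec=spec, priority=vim.VirtualMachine.MovePriority.defaultPriority)
```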

As can be seen in the screenshot below, the ‘Web Universal’ VM has been vMotioned across vCenter boundaries from site 1, Palo Alto, to site 2, San Jose. Note that the VM’s network adapter is attached to the Universal Web ULS with VNI 90000, which spans both sites; thus, the networking configuration remains consistent and neither East/West nor North/South connectivity is affected.

Figure 8: ‘Web Universal’ VM Has Been Successfully vMotioned from Site 1, Palo Alto, to Site 2, San Jose

In this deployment model, dynamic routing and respective routing metrics were used to control ingress/egress (appropriate metrics on the physical network also need to be configured). Figure 9 below shows BGP weight configured on the UDLR to prefer the route through ESG 1 at site 1.

Figure 9: BGP Weight Attribute Used to Prefer Routes to ESG 1 at Site 1

As shown below, running the tracert command on the Windows VM after it has been vMotioned to site 2 confirms that the site 1 ESG is still being used for egress, or North/South, traffic. The destination is a VM on the physical network attached to a VLAN-backed port group. The result is as expected, since we set the BGP weight so the site 1 ESG is always used for egress until failover.

Figure 10: ‘Universal Web’ VM ‘tracert’ Command Shows Route Through ESG 1 at Site 1 is Being Used for Egress

The screenshot below shows that ESG 1 at site 1 has been manually shut down. Further below, we can see that the path from the ‘Universal Web’ VM to the destination VM on the physical network has switched to ESG 2 at site 2, as expected.

Figure 11: ESG 1 at Site 1 Has Been Manually Shut Down to Test Failover to ESG 2 at Site 2

As shown below, running the tracert command on the Windows VM after ESG 1 at site 1 has been shut down reveals that the new route from the VM is through the site 2 ESG for egress, or North/South, traffic. This is expected: we set the BGP weight higher for ESG 1 at site 1, so ESG 2 at site 2 is used only upon failure of ESG 1.

Figure 12: ‘Universal Web’ VM ‘tracert’ Command Shows Route Through ESG 2 at Site 2 is Being Used for Egress

In this post, a deployment model using a single Universal Control VM deployed at site 1 and leveraging routing metrics for failover was used to demonstrate a Cross-VC NSX deployment across two sites, with active workloads at both sites and Active/Passive North/South traffic. In a later post we’ll discuss an alternative deployment model with a similar topology that uses Local Egress to achieve Active/Active North/South. Additionally, Cross-VC NSX supports recovery of NSX components upon site failure; this will also be covered in a later post.