Selecting a migration strategy

As a consultant within the NSX PSO practice, I often discuss with customers how NSX enables migration from a legacy datacentre to an NSX-managed datacentre. This was the case recently with a customer who was looking to move out of a datacentre that was scheduled to be decommissioned. The problem was that the customer workloads needed to be migrated to a Logical Switch within the new datacentre without changing IP addressing, and with minimal downtime.

There are four approaches available to us with NSX for vSphere that might help solve this problem:

  • Universal Logical Switching – we could deploy NSX to the remote site and extend L2 networks using Cross-vCenter NSX and Universal Logical Switches, then migrate the workload
  • Native L2 Bridging – within the same datacentre we could use the NSX Distributed Logical Router native functionality to create a Layer 2 Bridge between a VLAN and a Logical Switch
  • Hardware VTEP – using a compatible hardware device from a VMware Partner that acts as a VXLAN Tunnel Endpoint and can bridge between a VLAN and a Logical Switch
  • Layer 2 VPN – using an NSX-managed Edge, or a Standalone NSX Edge L2VPN Client, in the remote datacentre we can bridge the remote VLAN to a local Logical Switch

In my customer’s case it did not make sense to deploy NSX to the remote datacentre, as the bridging requirement was not long-term and the level of work required to prepare the legacy site would have been significant, including investing in new hardware. The native bridging functionality with the Distributed Logical Router would need the VLAN to be extended across datacentres. Using a Hardware VTEP would again require investment into a datacentre that is scheduled to be decommissioned.

The solution selected, L2VPN, works without requiring investment into the retiring datacentre. Layer 2 VPN, as the name suggests, works by creating a VPN tunnel between two NSX Edge devices and bridging Layer 2 traffic between interfaces on each side. L2VPN is not the preferred long-term solution for stretching networks; however, as a solution for migrating workloads in and out of a datacentre, it works well.
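The elimination above can be written down as a small filter. This is only a sketch of my reasoning for this particular customer: the two boolean flags are my own shorthand for the constraints described in the previous paragraphs, not properties taken from any NSX API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    # Hypothetical flags summarising the constraints discussed above.
    needs_legacy_site_investment: bool   # NSX deployment or new hardware in the retiring site
    needs_vlan_stretched_between_sites: bool


OPTIONS = [
    Option("Universal Logical Switching", True, False),
    Option("Native L2 Bridging", False, True),
    Option("Hardware VTEP", True, False),
    Option("Layer 2 VPN", False, False),
]


def viable(options):
    """Keep options that need neither legacy-site investment nor a stretched VLAN."""
    return [
        o.name
        for o in options
        if not o.needs_legacy_site_investment
        and not o.needs_vlan_stretched_between_sites
    ]


print(viable(OPTIONS))  # ['Layer 2 VPN']
```

A different customer with different constraints (for example, a long-term stretched network) would set these flags differently and likely land on another option.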

Migrating from a legacy datacentre

The high-level process for migrating out of the datacentre follows these steps:

  • Deploy an NSX Edge as a Layer 2 VPN Server in the NSX Managed Site
  • Deploy an NSX Standalone Edge as a Layer 2 VPN Client in the Standalone Site
  • Migrate workloads from the Standalone Site to the NSX Managed Site
  • Migrate the gateway and routing from the Standalone Site to the NSX Managed Site

The initial configuration of the two sites is shown below – the NSX Managed Site is set up with a typical Provider Logical Router (PLR) that provides North/South routing for the datacentre, with a Distributed Logical Router (DLR) beneath it, to which the Logical Switches are connected. On the Standalone Site the Customer Workloads are deployed onto VLAN 20 with the Default Gateway provided by a physical router.

Initial site configuration

Deploy the NSX Edge Layer 2 VPN Server

Into the NSX Managed Site we deploy a Logical Switch (to be bridged with the VLAN), and the NSX Edge with Layer 2 VPN Server configured. In this example it’s been deployed with an uplink on a “Transit” Logical Switch, but in reality, it just needs an IP that is routable and available from the Standalone Site. The internal interface is connected to a trunk port group, and a sub-interface is connected to the Logical Switch. The sub-interface is assigned a free IP address from the VLAN that will be bridged.

The Layer 2 VPN Server has been deployed

Deploy the NSX Standalone Edge L2VPN Client

The NSX Standalone Edge Client is deployed into the Standalone Site and has an uplink connected to VLAN 10 (a routable network). Its internal interface is again configured to a trunk port group and a sub-interface is attached to VLAN 20.

At this point, nothing has changed for the two VMs in the Standalone Site. They are still Layer 2 adjacent to each other and are sending their routed traffic over the VLAN to the default gateway on the interface of the physical router. However, the Layer 2 VPN Client has established a tunnel to the Layer 2 VPN Server, and the VMs are now Layer 2 adjacent to the interface on the Logical Switch in the NSX Managed Site.

The Layer 2 VPN Client has been deployed, and the L2VPN established

Migrating Workloads

Now, using VMware Site Recovery Manager, stretched storage with Cross vCenter vMotion, or a similar tool, the workloads can begin to be migrated to the NSX Managed Site. The network’s default gateway still resides in the Standalone Site, so the route that the VM’s north/south traffic takes is not particularly efficient; however, connectivity is maintained, as is Layer 2 adjacency with the other workloads on VLAN 20.

The diagram below traces a packet from the VM in the NSX Managed Site to both the VM in the Standalone Site (blue arrows), and northbound (green arrows). In both cases, the packet traverses the Layer 2 VPN (red arrows):

  • The packet is received by the L2VPN-Server, encapsulated and sent on through the PLR
  • The PLR forwards the encapsulated packet to the physical router
  • The physical router forwards the packet across the WAN to the physical router on the Standalone Site
  • The router on the Standalone Site forwards the packet to the L2VPN-Client
  • The L2VPN-Client de-encapsulates the packet
  • If the packet is intended for a VM on VLAN 20, it’s forwarded to the VM’s interface (blue arrow)
  • Or if the packet is to be routed, it’s forwarded back up to the physical router’s interface on VLAN 20 (green arrow)

Packets traversing the Layer 2 VPN
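The blue/green split in the last two steps is really decided by the sending VM, which picks the Layer 2 destination before the frame ever reaches the tunnel. Here is a minimal sketch of that choice using Python’s standard `ipaddress` module. The subnet and gateway addresses are hypothetical, as the addressing for VLAN 20 isn’t given in this post.

```python
import ipaddress

# Hypothetical addressing for VLAN 20 and the bridged Logical Switch.
SUBNET = ipaddress.ip_network("192.168.20.0/24")
GATEWAY = ipaddress.ip_address("192.168.20.1")  # still on the physical router


def l2_next_hop(destination: str) -> ipaddress.IPv4Address:
    """Return the IP whose MAC address the sending VM resolves for this destination."""
    dst = ipaddress.ip_address(destination)
    # On-subnet: the frame is addressed directly to the peer VM (blue arrows).
    # Off-subnet: the frame is addressed to the default gateway (green arrows).
    return dst if dst in SUBNET else GATEWAY


print(l2_next_hop("192.168.20.12"))  # peer VM on VLAN 20
print(l2_next_hop("10.0.0.5"))       # routed traffic, sent to the gateway
```

Either way the frame crosses the Layer 2 VPN while the gateway remains in the Standalone Site. Only the final hop after de-encapsulation differs.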

Migrating the default gateway

As more workloads are migrated to the NSX Managed Site the amount of traffic traversing the L2VPN will increase. There is a “tipping point” when it’s more efficient to relocate the gateway to the Distributed Logical Router.
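One rough way to reason about the tipping point is to compare how much routed traffic would cross the tunnel for each gateway location. The sketch below is an illustration rather than a sizing method. The per-VM traffic figures are made up, and it ignores VM-to-VM traffic within the subnet, which crosses the L2VPN regardless of where the gateway sits.

```python
def tunnel_load(vms, gateway_site):
    """Sum routed traffic from VMs that are not in the same site as the gateway."""
    return sum(mbps for site, mbps in vms if site != gateway_site)


def best_gateway_site(vms):
    """Pick the gateway location that sends less routed traffic over the L2VPN."""
    standalone = tunnel_load(vms, "standalone")
    managed = tunnel_load(vms, "managed")
    return "managed" if managed < standalone else "standalone"


# (site, routed Mbps) per VM; the numbers are invented for illustration.
early = [("standalone", 40), ("standalone", 30), ("managed", 20)]
late = [("standalone", 40), ("managed", 30), ("managed", 20)]

print(best_gateway_site(early))  # standalone
print(best_gateway_site(late))   # managed
```

In practice the decision also depends on change windows and on when the physical router can be reconfigured, so treat the comparison as one input rather than the answer.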

At this point the interface for the VLAN is removed from the physical router in the Standalone Site. The Logical Switch in the NSX Managed Site is attached to the Distributed Logical Router, and the IP address of the default gateway is configured on the Distributed Logical Router interface. The route tables of the physical routers can be manually updated or updated using dynamic routing protocols to ensure traffic is now routed to the Distributed Logical Router for that subnet.

Traffic from the VMs in the NSX Managed Site now uses the Distributed Logical Router as its default gateway, and traffic from VMs in the Standalone Site now traverses the Layer 2 VPN to reach its default gateway, or other VMs located on the Logical Switch.

Routing and default gateway are migrated to the NSX Managed Site

Migration completed

When all the VMs have been migrated to the NSX Managed Site, the Standalone Edge Client can be shut down and deleted from the Standalone Site, and the Layer 2 VPN Server can be removed from the NSX Managed Site.

All workloads have been migrated, and the L2 VPN removed

Design considerations

There are some design considerations to bear in mind when looking at using a Layer 2 VPN. It’s not what I’d recommend as a long-term solution for L2 bridging. If you need a more permanent or higher-performance option, take a look at one of the other solutions listed at the top of this blog post.

  • Performance – the performance of this solution is limited by the Edge devices, which are encapsulating, encrypting, decrypting and de-encapsulating data and maintaining all the information required to bridge the VLAN and Logical Switch. Many variables affect the throughput of an L2VPN, from the type of traffic being sent to the networks being traversed and the hosts themselves.
  • Management – although the set-up of the L2VPN is relatively simple, there are some configuration requirements (such as configuring a sink port) which are not. This adds complexity to the solution, and the effects of such configuration need to be clearly understood.
  • Availability – both ends of the L2VPN solution can be configured with NSX Edge HA mode, with an Active/Passive Edge in case of host failure. I would strongly recommend configuring HA mode, as the loss of a single host in one site could isolate VMs on another site.

In my next post I will be building the Layer 2 VPN in my lab environment and demonstrating the migration of VMs from the Standalone Site across to the Logical Switch in the NSX Managed Site.