SD-WAN Migration – The Ugly Truth

Software-defined WAN (SD-WAN) is moving beyond the realm of a new technology and becoming something that IT organizations are ready to deploy. Over the past year, the conversations with many of our customers have shifted from what SD-WAN can do to how best to migrate their sites to SD-WAN.

Having the right migration strategy is important. Otherwise your rollout could get stuck, and instead of enjoying the benefits that SD-WAN provides you could end up fixing issues and dealing with complaints. After all, SD-WAN is supposed to make things simpler, isn’t it?

One element of a sound SD-WAN migration strategy is to move forward in a stepwise manner. No organization moves all of its sites at the same time. Instead, it runs a mix of SD-WAN and non-SD-WAN sites until every site has been cut over to SD-WAN. Depending on the size of the enterprise, this can take months or years.

Now let’s dive into the technical view. A question that I get asked a lot is, “How do my SD-WAN sites communicate with my non-SD-WAN (or legacy) sites?” Before answering this question, let’s first understand the problem of building overlay tunnels between sites. The problem is the same whether the tunnels are SD-WAN, IPsec, or even GRE. You are creating an additional overlay network, which makes routing more complicated.

A typical MPLS site has one way in and one way out of the branch. When you start building an overlay on top of the WAN transport connecting all the sites together, you are creating multiple entrances and exits to the prefixes owned by any branch. Just imagine moving from one door in and out of your home to having tens, hundreds, or even thousands of doors. How can you track which ones get used, let alone control and apply any enforcement to them?

There are really two options to handle migration. I assume the SD-WAN sites are also hybrid WAN (which means they still have MPLS links).

Option 1: This is simple but comes with a drawback. Pick one or more transit sites to connect between your SD-WAN and legacy sites. You don’t even need to run any routing protocol on the WAN side of the SD-WAN site. Because of this, the only way SD-WAN sites can reach the legacy sites is through one or more SD-WAN sites designated as transit sites, generally regional hubs or data centers. Even though the SD-WAN branch has an MPLS link, the branch cannot just use it because all the routes point through the overlay tunnel. The nice thing about this approach is you only need to run BGP with the Service Provider PE routers at the designated transit sites. Fewer exits mean fewer headaches and routing loops to worry about.

Option 2: Depending on how far away your SD-WAN transit sites are from the SD-WAN branch and the type of applications you have, sending all traffic to transit sites may not make sense due to additional latency. Option 1 is simply unacceptable for some of my customers. In that case you may still have to run BGP on the WAN side of the SD-WAN branch to exchange routes directly with the SP PE routers. The important part is to avoid unknowingly making the branch a transit site for other SD-WAN sites. That is hard to do without complicated route maps and route leaking, which is why, I am told, some SD-WAN vendors do not recommend this option.

Optimal Path

With VMware NSX SD-WAN by VeloCloud, we make this migration very simple to implement and support. As long as you flag the BGP neighbor as the entrance and exit point to MPLS, the NSX SD-WAN Edge by default stops the redistribution of routes learned from this neighbor into the overlay, and vice versa. The redistribution rule is relaxed only for sites designated as transit.
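The default rule can be modeled in a few lines of Python. This is a minimal sketch of the behavior described above, not the NSX SD-WAN Edge implementation; the `Route` type, the `may_redistribute` function, and the prefixes are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

UNDERLAY = "underlay"  # learned from the BGP neighbor flagged as the MPLS entrance/exit
OVERLAY = "overlay"    # learned across the SD-WAN overlay


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_from: str  # UNDERLAY or OVERLAY


def may_redistribute(route: Route, target: str, site_is_transit: bool) -> bool:
    """Return True if the edge may advertise `route` into `target`."""
    crosses_boundary = route.learned_from != target
    if not crosses_boundary:
        # Assumption: the rule in the text only concerns routes that cross
        # between the MPLS underlay and the overlay.
        return True
    # Crossing the underlay/overlay boundary is allowed only at transit sites.
    return site_is_transit


legacy = Route("10.1.0.0/16", learned_from=UNDERLAY)
print(may_redistribute(legacy, OVERLAY, site_is_transit=False))  # ordinary branch
print(may_redistribute(legacy, OVERLAY, site_is_transit=True))   # transit site
```

In this model an ordinary branch never leaks a legacy prefix into the overlay, so it cannot accidentally become a transit site for other SD-WAN sites.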

Why is this important? The SD-WAN branch learns the legacy site prefixes with a shorter AS path from the MPLS underlay. By default, it prefers the routes learned through the MPLS underlay, so the branch reaches the legacy sites directly over MPLS. The SD-WAN branch also learns the same legacy site prefixes through the overlay, but that route is less preferred by default. In addition, its AS path is longer because we carry the AS path and other BGP attributes across our overlay. This makes the transit sites eligible as a backup path in case the MPLS link at the branch fails. The traffic is then backhauled to the transit site and connectivity is maintained.
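To make the preference and failover behavior concrete, here is a rough sketch in Python. It is a simplified model, not the product's actual best-path algorithm: the `Path` type, the `best_path` function, and the AS numbers (taken from the documentation range) are hypothetical, and the ordering "underlay first, then shorter AS path" is an assumption drawn from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

UNDERLAY = "underlay"
OVERLAY = "overlay"


@dataclass(frozen=True)
class Path:
    via: str                  # UNDERLAY (direct MPLS) or OVERLAY (through a transit site)
    as_path: Tuple[int, ...]  # AS path is carried across the overlay, so it can be compared
    available: bool = True    # False once the link carrying this path is down


def best_path(paths: Iterable[Path]) -> Optional[Path]:
    """Pick the preferred usable path to a legacy prefix, or None if none is left."""
    candidates = [p for p in paths if p.available]
    if not candidates:
        return None
    # Sort key: underlay before overlay (False sorts before True), then shorter AS path.
    return min(candidates, key=lambda p: (p.via != UNDERLAY, len(p.as_path)))


direct = Path(UNDERLAY, as_path=(64500, 64510))
backup = Path(OVERLAY, as_path=(64501, 64500, 64510))

print(best_path([direct, backup]).via)  # MPLS link healthy

mpls_down = Path(UNDERLAY, as_path=(64500, 64510), available=False)
print(best_path([mpls_down, backup]).via)  # MPLS link failed, traffic is backhauled
```

With both paths up the model selects the direct MPLS path; once the MPLS link is marked unavailable, the overlay path through the transit site is the only candidate left and takes over.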

Now consider this: Without these features I would have to create a complicated route-map to prevent the redistribution and also would have to adjust administrative distance. With NSX SD-WAN, I can accomplish all this with just one click on the checkbox and then a click to save. That allows me to avoid a lot of headaches and to save time as well.

Interested in learning more and easily migrating your sites to SD-WAN without the traditional IT headache? Reach out to us at sales@vmware.com.
