A number of articles have already explained how to set up and configure VMware vCloud® Air™ Disaster Recovery. These range from deploying the vSphere Replication Appliance through to initiating a full failover test. However, deploying a disaster recovery solution also means deciding how to protect your infrastructure virtual machines, and how to do so easily and without significant cost.
Let us begin by looking at what vCloud Air Disaster Recovery is.
vCloud Air Disaster Recovery is a cost-effective, cloud-based disaster recovery as a service offering for vSphere virtual machines, built on vSphere Replication. It provides asynchronous software replication and failover with self-service Recovery Point Objectives (RPO) of 15 minutes to 24 hours.
The Supporting Infrastructure Dilemma
Today, vCloud Air Disaster Recovery uses a virtual private cloud (VPC) specifically designated for hosting warm standby capacity. Within this virtual private cloud you cannot run real-time virtual machines, only placeholders, which are designated Disaster Recovery VMs.
So what about Active Directory and other critical infrastructure components?
Infrastructure components like Active Directory use their own replication technologies. Replicating Active Directory servers using vSphere Replication is not recommended. You can read more about the limitations to protect and recover virtual machines here.
Active Directory is extremely time sensitive. Every user and computer receives tokens from the domain controllers, and if any of them are out of sync, authentication is refused. With replication, it is very difficult to guarantee that all the virtual machines replicate at the same moment. If the deltas for VM A are replicated at 10:03 a.m. and VM B, an Active Directory domain controller, replicates its deltas at 10:07 a.m., then after a recovery VM A has a different timestamp (10:03 a.m.) than VM B expects (10:07 a.m.).
For these types of workloads there are alternatives to replicating with Disaster Recovery. Chris Colotti gives some great examples of how you can extend your Active Directory infrastructure in this video tutorial.
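To make the timing problem concrete, here is a minimal Python sketch that computes the gap between the recovery points of a set of replicated VMs. The VM names and the calendar date are hypothetical; only the 10:03 and 10:07 times come from the example above.

```python
from datetime import datetime, timedelta

def recovery_point_skew(last_sync: dict) -> timedelta:
    """Return the gap between the oldest and newest recovery points."""
    times = list(last_sync.values())
    return max(times) - min(times)

# Hypothetical recovery points mirroring the example in the text.
last_sync = {
    "vm-a": datetime(2015, 1, 1, 10, 3),
    "vm-b-domain-controller": datetime(2015, 1, 1, 10, 7),
}

skew = recovery_point_skew(last_sync)
print(f"Recovery points differ by {skew.total_seconds() / 60:.0f} minutes")
```

Any non-zero skew means the recovered VMs represent different points in time, which is exactly what a time-sensitive service such as Active Directory struggles with.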
Using Virtual Private Cloud OnDemand
With the recent release of the VMware vCloud® Air™ Virtual Private Cloud OnDemand, this scenario has now become a lot simpler to deploy.
Previously, with the subscription-based cloud, you had to buy a specific amount of capacity. If all you wanted to deploy was two Active Directory servers, that was more capacity than you needed. With Virtual Private Cloud OnDemand, you simply spin up a VM when you need it, consume exactly the resources you need, and pay only for the resources you are consuming. This makes it simpler to decide how to build out your infrastructure for Disaster Recovery.
Read more about Virtual Private Cloud OnDemand and how it works.
Let's take a look at how we can build out our pilot light VMs using Virtual Private Cloud OnDemand.
In the example above, there is one physical data center (San Francisco), which has its workloads protected by replicating to vCloud Air Disaster Recovery in Virginia. We also have Virtual Private Cloud OnDemand where the Active Directory infrastructure is running.
You will notice that we have multiple IPsec VPN tunnels configured. The first VPN runs from the on-premises San Francisco data center to the on-demand VPC. We configure Active Directory with multiple sites, which lets us use Active Directory's built-in replication to keep the environment in both locations in sync.
As you can see in the screenshot above, you simply set up Active Directory Sites and Services in the same manner you would for a physical location.
Once Active Directory is deployed, we configure the second VPN, which runs between the on-demand VPC and the Disaster Recovery VPC. With this VPN in place, any time we fail over our workloads to the cloud they continue to run as expected, because authentication and lookup services remain available even when a disaster takes out our on-premises data center.
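As a rough illustration of how sites and subnets relate, the sketch below maps an IP address to a site by picking the most specific matching subnet, which is broadly how Active Directory associates a client with a site. The site names and subnets are hypothetical placeholders, not the values from the screenshot.

```python
import ipaddress

# Hypothetical site-to-subnet mapping; replace with your own values.
SITE_SUBNETS = {
    "SanFrancisco": ["192.168.10.0/24"],
    "vCloudAir-OnDemand": ["192.168.110.0/24"],
}

def site_for_address(address, site_subnets=SITE_SUBNETS):
    """Return the site whose most specific subnet contains the address."""
    ip = ipaddress.ip_address(address)
    best = None  # (site name, matching network)
    for site, subnets in site_subnets.items():
        for subnet in subnets:
            network = ipaddress.ip_network(subnet)
            if ip in network and (best is None or network.prefixlen > best[1].prefixlen):
                best = (site, network)
    return best[0] if best else None

print(site_for_address("192.168.110.25"))
```

An address that falls outside every configured subnet returns `None`, which corresponds to a machine that has no site assignment.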
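One way to confirm that authentication and lookup services are reachable from a recovered workload is a simple TCP probe against a domain controller. The following is a minimal sketch using only the Python standard library; the port selection is a subset of the TCP ports in the Active Directory port list in this article, and any host address you pass in is your own assumption.

```python
import socket

# A few TCP ports from the Active Directory port list in this article.
AD_TCP_PORTS = {"DNS": 53, "RPC endpoint mapper": 135, "LDAP": 389, "SMB": 445}

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_services(host, ports=AD_TCP_PORTS, timeout=3.0):
    """Probe each named port on the host and report which ones accepted a connection."""
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}
```

For example, `check_services("192.168.110.10")` (a hypothetical domain controller address in the on-demand VPC) returns a dictionary of service names mapped to `True` or `False`. A successful TCP connection only shows that the port is reachable across the VPN; it does not prove that the service behind it is healthy.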
Edge Gateway configuration
The final piece to this setup is configuring the VPN and firewall rules in vCloud Air.
Before we show how to configure VPNs and firewalls, we recommend reading the vCloud Air Networking Guide, which explains how all the networking components fit together. You can download the guide here.
In vCloud Air we can create firewall and SNAT/DNAT rules. These are extremely important when creating VPN tunnels, because they ensure traffic is routed correctly.
First, we create our networks in vCloud Air. For the purpose of this example, we will not be walking through the steps to create the VPN (there are lots of articles that show these steps).
Once the networks are in place, we would configure our VPN tunnel. The screenshot below shows the basic VPN configuration:
You can see the subnets configured above match the subnets in the Active Directory sites and services settings window. It is important to make sure you configure those networks to match or the Active Directory replication will not work.
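Because a mismatch here silently breaks replication, it can be worth comparing the two subnet lists programmatically. The sketch below is one minimal way to do that in Python; the function name and the idea of supplying both lists by hand are assumptions for illustration, not a vCloud Air or Active Directory API.

```python
import ipaddress

def subnet_mismatches(vpn_subnets, ad_site_subnets):
    """Report subnets that appear in one configuration but not the other."""
    vpn = {ipaddress.ip_network(s) for s in vpn_subnets}
    ad = {ipaddress.ip_network(s) for s in ad_site_subnets}
    return {
        "missing_from_vpn": sorted(str(n) for n in ad - vpn),
        "missing_from_ad": sorted(str(n) for n in vpn - ad),
    }

# Hypothetical subnets; substitute the values from your own VPN and AD sites.
report = subnet_mismatches(
    vpn_subnets=["192.168.10.0/24", "192.168.110.0/24"],
    ad_site_subnets=["192.168.10.0/24", "192.168.111.0/24"],
)
print(report)
```

Both lists in the report are empty when the VPN configuration and the Active Directory site subnets agree.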
We then put in place our SNAT/DNAT rules:
For the final configuration stage, we configure our firewall rules:
In this example, the firewall rules allow all traffic between the vCloud Air subnet and our on-premises data center. IPsec encapsulates all the Windows RPC traffic, which makes it a very easy way to carry that traffic securely through firewalls.
However, you may want to restrict traffic further, to specific Active Directory ports. The list below shows which ports you would need to open to allow only Active Directory replication.
|Service|Port/protocol|
|RPC endpoint mapper|135/tcp, 135/udp|
|Network basic input/output system (NetBIOS) name service|137/tcp, 137/udp|
|NetBIOS datagram service|138/udp|
|NetBIOS session service|139/tcp|
|RPC dynamic assignment|1024-65535/tcp|
|Server message block (SMB) over IP (Microsoft-DS)|445/tcp, 445/udp|
|Lightweight Directory Access Protocol (LDAP)|389/tcp|
|LDAP over SSL|636/tcp|
|Global catalog LDAP|3268/tcp|
|Global catalog LDAP over SSL|3269/tcp|
|Domain Name Service (DNS)|53/tcp, 53/udp|
Source: Active Directory Replication over Firewalls https://msdn.microsoft.com/en-us/library/bb727063.aspx#ECAA
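If you do restrict traffic to these ports, it can help to keep the list as data and generate one rule per port and protocol. The sketch below does this in Python; the rule dictionary format and the subnets are hypothetical and do not represent the vCloud Air firewall API, so you would still enter the resulting rules through the Edge Gateway yourself.

```python
# Ports and protocols copied from the list above.
AD_REPLICATION_PORTS = [
    ("RPC endpoint mapper", "135", ("tcp", "udp")),
    ("NetBIOS name service", "137", ("tcp", "udp")),
    ("NetBIOS datagram service", "138", ("udp",)),
    ("NetBIOS session service", "139", ("tcp",)),
    ("RPC dynamic assignment", "1024-65535", ("tcp",)),
    ("SMB over IP (Microsoft-DS)", "445", ("tcp", "udp")),
    ("LDAP", "389", ("tcp",)),
    ("LDAP over SSL", "636", ("tcp",)),
    ("Global catalog LDAP", "3268", ("tcp",)),
    ("Global catalog LDAP over SSL", "3269", ("tcp",)),
    ("DNS", "53", ("tcp", "udp")),
]

def build_rules(source_cidr, dest_cidr, ports=AD_REPLICATION_PORTS):
    """Expand the port list into one allow rule per port and protocol."""
    rules = []
    for service, port_range, protocols in ports:
        for protocol in protocols:
            rules.append({
                "name": f"AD {service} ({protocol})",
                "source": source_cidr,
                "destination": dest_cidr,
                "port": port_range,
                "protocol": protocol,
                "action": "allow",
            })
    return rules

# Hypothetical subnets for the on-premises site and the on-demand VPC.
rules = build_rules("192.168.10.0/24", "192.168.110.0/24")
for rule in rules:
    print(rule["name"], rule["port"], rule["protocol"])
```

Replication is bidirectional, so in practice you would likely generate a second set of rules with the source and destination swapped.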
Let’s review what has been discussed in this article.
It is important to understand the architecture of your applications when building a Disaster Recovery environment. Simply replicating the virtual machine that runs an application may not be sufficient to keep the service running in the event of a failure.
We can easily extend our data center infrastructure to the cloud by using the same technologies we use every day in multi-site data center deployments.
Leveraging the cloud makes providing disaster recovery quick and easy, and consuming on-demand services keeps it cost-effective.
If you’re ready to get started with vCloud Air Disaster Recovery, visit vCloud.VMware.com.
One comment has been added so far
Sites and Services not matching your diagram, can you explain what is what? Or is WDC the SF site? Where is Vegas? Very confused.