Most of the time, SRM is deployed as a single-site-to-single-site solution, something it has done well since the first version. Less widely known are the many ways SRM works in multi-site configurations. Deploying SRM in multi-site topologies provides the flexibility to support a wide variety of use cases and business requirements. To learn more about the creative ways SRM can be set up in multi-site arrangements, read on!
SRM and Stretched Clusters
SRM and stretched clusters, either VMware Virtual SAN Stretched Clusters or vSphere Metro Storage Clusters (vMSC), work well together to provide local as well as remote disaster recovery. This solution provides rapid recovery from widespread disasters that affect both sites in a stretched cluster. The stretched cluster protects VMs between its two local sites, while SRM, which treats the stretched cluster sites as a single site, protects VMs running at one or both of those sites to a third site. This solution works especially well when the two stretched storage sites are close enough to be impacted by the same event. An added benefit is support for zero-downtime disaster avoidance between the sites that make up the stretched cluster.
Consider datacenters situated in New York City and across the river in New Jersey configured with a VMware Virtual SAN stretched cluster with the third site located in Arizona. If either the New York or New Jersey site went offline, VMware Virtual SAN and vSphere HA would automatically recover the VMs at the other site. If both New York metro area data centers were affected, SRM could be used to perform a rapid, reliable recovery at the data center in Arizona. The approach provides multiple levels of resiliency and risk reduction.
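The layered behavior in this example can be sketched as a small decision function. This is only an illustration of the reasoning, not anything SRM or vSphere HA exposes: the function name `recovery_action` and the returned strings are hypothetical, and the site names simply come from the example above.

```python
# Hypothetical model of the New York / New Jersey / Arizona example.
# Nothing here calls SRM or vSphere; it only encodes the decision logic.
STRETCHED_SITES = frozenset({"New York", "New Jersey"})
REMOTE_SITE = "Arizona"


def recovery_action(failed_sites):
    """Return which layer would handle recovery for a set of failed sites."""
    failed = set(failed_sites)
    failed_stretched = failed & STRETCHED_SITES

    if not failed_stretched:
        # The stretched cluster is intact, so workloads keep running.
        return "no recovery needed"

    if failed_stretched == STRETCHED_SITES:
        # Both metro sites are down: only the third site can help.
        if REMOTE_SITE in failed:
            return "no recovery site available"
        return f"SRM recovery at {REMOTE_SITE}"

    # Exactly one stretched site is down: the surviving site restarts the VMs.
    survivor = next(iter(STRETCHED_SITES - failed_stretched))
    return f"vSphere HA restart at {survivor}"
```

Calling `recovery_action({"New York"})` picks the vSphere HA path at New Jersey, while passing both metro sites picks the SRM path at Arizona.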
Shared Recovery and Shared Protection Sites
The shared recovery site model has been supported by SRM for a long time and is one of the most common multi-site topologies in use. It provides a good option to protect virtual machines at remote sites that have local vCenters. This model is flexible when it comes to resources at the recovery site. To minimize resource requirements, provide only enough resources at the shared recovery site to recover one or two of the remote sites. Alternatively, if all remote sites must be recoverable at the same time, provide sufficient resources for that, or choose some option in between.
SRM enables precise control over which workloads are failed over. For example, if a disaster impacts only one remote site, SRM can be configured to recover only the VMs at that site while continuing to protect the other sites. If the budget to build out a DR site is limited, it is possible to provision just enough resources to fail over one or two of the remote sites, since it is unlikely that several geographically separate sites will need disaster recovery at the same time. Because there is a vCenter at both the protected and recovery sites, both of these options also support vSphere management operations even if the link between the primary and remote sites is down.
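The sizing trade-off can be made concrete with a rough calculation. The sketch below is an assumption-laden illustration, not an SRM feature: `SiteDemand`, `required_capacity`, and the figures in the usage note are all hypothetical. It estimates the worst-case CPU and memory the shared recovery site would need if a given number of remote sites had to be recovered at once, taking the largest demands for each resource independently, which is deliberately conservative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteDemand:
    """Resources needed to run one remote site's protected VMs (hypothetical)."""
    name: str
    cpu_ghz: float
    memory_gb: float


def required_capacity(sites, concurrent_failures):
    """Worst-case (cpu_ghz, memory_gb) to recover `concurrent_failures` sites.

    Each resource is sized from its own largest consumers, so the result is
    an upper bound rather than an exact figure.
    """
    k = max(0, min(concurrent_failures, len(sites)))
    cpu = sum(sorted((s.cpu_ghz for s in sites), reverse=True)[:k])
    mem = sum(sorted((s.memory_gb for s in sites), reverse=True)[:k])
    return cpu, mem


# Made-up demand figures for three remote sites.
remote_sites = [
    SiteDemand("Site A", cpu_ghz=40.0, memory_gb=256.0),
    SiteDemand("Site B", cpu_ghz=20.0, memory_gb=512.0),
    SiteDemand("Site C", cpu_ghz=10.0, memory_gb=128.0),
]
```

With these made-up figures, sizing for a single failure means covering the largest CPU demand (Site A) and the largest memory demand (Site B), while sizing for all three sites means covering the full totals.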
Shared Recovery Site - Central vCenter Option
There are other ways of providing the same kind of protection offered by the shared recovery site configuration discussed previously. In the configuration shown above, instead of deploying a vCenter and SRM server at each remote site, we deploy the protected site vCenter and SRM server at the recovery site. The advantage to this arrangement is that it simplifies management and reduces the number of vCenters and SRM servers required for remote sites. The downside is that if the link to the remote site is down, the remote site hosts would lose connectivity to their vCenter with the resulting loss of manageability. Note that as with any correct SRM configuration, recovery would still work without issue even if the remote site link is unavailable.
Additional Multi-Site Topologies
In addition to the above topologies, SRM 5.8 and higher supports other arrangements as long as the following requirements are met:
- As seen in the diagrams in this post, SRM requires that it be deployed in pairs
- SRM supports a maximum of 10 SRM server pairs
- As outlined in the diagrams there needs to be a vCenter server at each "site" (in the central vCenter option above we are treating all remote sites as a single protected "site")
- Each VM can only be protected a single time/by a single SRM pair
- SRM only supports point-to-point replication. It does not currently support replication to multiple destinations
These requirements allow for a topology of three (or more) sites where:
- Virtual Machines at Site A are protected at Site B
- Virtual Machines at Site B are protected at Site C
- Virtual Machines at Site C are protected at Site A
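A design can be checked against the requirements listed above before it is built. The following is a minimal sketch under stated assumptions: `validate_topology` and its input format are hypothetical, not part of SRM, and it assumes a site pairing can be modeled as an unordered pair of site names. Only the limit of 10 pairs, the vCenter-per-site rule, and the protect-once rule come from the list above.

```python
from collections import Counter

MAX_SRM_PAIRS = 10  # limit stated in the requirements above


def validate_topology(pairs, sites_with_vcenter, vm_protection):
    """Return a list of requirement violations; an empty list means none found.

    pairs: list of (site, site) tuples, one per SRM server pair
    sites_with_vcenter: set of site names that have a vCenter Server
    vm_protection: list of (vm_name, protected_site, recovery_site) tuples
    """
    problems = []

    if len(pairs) > MAX_SRM_PAIRS:
        problems.append(f"{len(pairs)} SRM pairs exceeds the maximum of {MAX_SRM_PAIRS}")

    # Every site that takes part in a pairing needs its own vCenter Server.
    paired_sites = {site for pair in pairs for site in pair}
    for site in sorted(paired_sites - set(sites_with_vcenter)):
        problems.append(f"site {site} has no vCenter Server")

    # Assumption: a pairing is modeled as unordered, so (A, B) equals (B, A).
    known_pairs = {frozenset(pair) for pair in pairs}
    for vm, src, dst in vm_protection:
        if frozenset((src, dst)) not in known_pairs:
            problems.append(f"VM {vm} uses an undefined pairing {src} -> {dst}")

    # Each VM may be protected only once, by a single SRM pair.
    for vm, count in Counter(vm for vm, _, _ in vm_protection).items():
        if count > 1:
            problems.append(f"VM {vm} is protected {count} times")

    return problems


# The three-site ring described above: A -> B, B -> C, C -> A.
ring_pairs = [("A", "B"), ("B", "C"), ("C", "A")]
ring_vms = [("vm-a1", "A", "B"), ("vm-b1", "B", "C"), ("vm-c1", "C", "A")]
```

Running `validate_topology(ring_pairs, {"A", "B", "C"}, ring_vms)` returns an empty list, while adding a second protection entry for `vm-a1` is reported as a violation of the protect-once rule.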
Additional notes on SRM multi-site
Multi-site topologies also work well for data center consolidation. SRM is already an excellent solution for migrations, and multi-site topologies provide additional flexibility that allows it to handle any number of complex scenarios.
The examples above aren't close to a complete list of possible topologies. As long as the design adheres to the requirements listed above there are numerous other configurations that would work.
With all of the above examples, SRM can orchestrate and automate the reprotection of workloads after failover or migration. This provides the ability not just to fail over but also to fail back.
SRM multi-site management improvements
In the latest releases of SRM (5.8, 6.0 and 6.1) there have also been significant improvements to the usability and manageability of SRM multi-site configurations. First, it is now possible to convert an existing SRM installation to multi-site. Previously, SRM had to be configured for multi-site at initial installation. SRM now supports multi-site configuration regardless of how it was initially installed.
Second, with the migration of SRM to the vSphere Web Client, the management of multiple sites has been vastly improved and simplified. Sites, Protection Groups and Recovery Plans are now organized around the configuration of their pairing. This makes daily operations much easier and more intuitive.
- SRM supports a number of different options when it comes to multi-site configurations
- Pay attention to the limits and requirements for SRM and multi-site
- Take advantage of the recent improvements to SRM around management and usability