
Tag Archives: Disaster Recovery

How NSX Simplifies and Enables True Disaster Recovery with Site Recovery Manager

By Dharma Rajan

VMware NSX is the network virtualization platform for the software-defined datacenter (SDDC). Network virtualization using VMware NSX enables virtual networks to be created as software entities, saved and restored, and deleted on demand without requiring any reconfiguration of the physical network. Logical network entities such as logical switches, logical routers, security objects, logical load balancers, distributed firewall rules and Service Composer rules are created as part of virtualizing the network.

To provide continuity of service from disaster recovery (DR), datacenters are built with capabilities for replicating and recovering workloads between protected and recovery sites. VMware Site Recovery Manager (SRM) helps to fully automate the recovery process.

From a DR point of view, the recovery site has to be in sync with the protected site at all times across compute, storage and networking to enable fast, seamless recovery when the protected site fails due to a disaster. Customers using SRM for DR today face a couple of challenges. From a compute perspective, one needs to prepare the hosts at the recovery site, pre-allocate compute capacity for placeholder virtual machines and create the placeholder virtual machines themselves.

From a storage point of view, the storage for protected applications and virtual machines needs to be replicated and kept in sync. Both of these steps are easy and have been handled by SRM-, vSphere- and/or array-based replication. The challenge today is the networking piece of the puzzle. As illustrated below, depending on the type of networking established between the protected and recovery sites, various networking changes (carving out Layer-2, Layer-3, firewall and load balancer policy in the recovery site, re-mapping networks if IP address spaces overlap, recreating policies, etc.) may have to be made manually to ensure smooth recovery. This adds a lot of time, is subject to human error and can make it impossible to meet internal and external SLAs. As a result, the network becomes the bottleneck that prevents seamless disaster recovery. From a business perspective this can easily translate into millions of dollars in lost business, depending on the criticality of the workloads and services impacted.

DRajan 1

Why Are We Running into the Networking Challenge?

The traditional DR solution is tied tightly to physical infrastructure (physical routers, switches, firewalls, load balancers). The security domains of the protected and recovery sites are completely separate. As networking changes (additions, deletions or updates to, say, IP addresses, Layer-2 extensions or subnets) are made at the protected site, no corresponding automated synchronization happens at the recovery site. Thus one may have to extend Layer-2 to preserve the changes, create and maintain special scripts, manage the tools, and perform manual DR setup and recovery steps across different infrastructure layers and vendors (physical and virtual). From a process point of view, this requires coordination across various teams within your company, good bookkeeping and periodic validation, so that you are always ready to address a DR scenario as quickly as you can.

What is the Solution?

VMware NSX from release 6.2 offers a solution that enables customers to address the above-cited networking challenges. NSX is the network virtualization platform for the SDDC. NSX provides the basic foundation to virtualize networking components in the form of logical switching, distributed logical router, distributed logical firewall, logical load balancer, and logical edge gateways. For a deeper understanding of NSX see more at: http://www.vmware.com/products/nsx

The NSX 6.2 release has been integrated with SRM 6.1 to enable automated replication of networking entities between protected and recovery sites.

DRajan 2

How Does the Solution Work?

NSX 6.2 supports a couple of key concepts that let it recognize that it is logically the same network on both sites. These concepts include:

  1. a) “Universal Logical Switches” (ULS) – This allows for the creation of Layer-2 networks that span vCenter boundaries. This means that when utilizing ULS with NSX there will be a virtual port group at both the protected and recovery site that connects to the same Layer-2 network. When virtual machines are connected to port groups that are backed by ULS, SRM implicitly creates a network mapping, without requiring the admin to configure it. This provides seamless portability and synchronization of network services: on recovery, virtual machines connected to a ULS are automatically reconnected to the same logical switch on the other vCenter.

Figure: NSX 6.2 ULS Integration with SRM 6.1 Automatic Network Mapping

  1. b) Cross vCenter Networking and Security enables key use cases such as:
  • Resource pooling, virtual machine mobility, multi-site and disaster recovery
  • Cross-vCenter NSX eliminates the need for guest customization of IP addresses and management of portgroup mappings, two large SRM pain points today

  • Centralized management of universal objects, reducing administration effort
  • Increased mobility of workloads; virtual machines can be “vMotioned” across vCenter Servers without having to reconfigure the virtual machine or make changes to firewall rules
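
The implicit mapping described under (a) can be pictured with a small model. The sketch below is illustrative only: the `PortGroup` class, the `universal_switch_id` field and the sample names are hypothetical and are not SRM or NSX API objects. It pairs port groups across two sites whenever they are backed by the same universal logical switch, which is the idea behind not having to configure the mapping by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PortGroup:
    """Hypothetical stand-in for a port group in one vCenter inventory."""
    name: str
    site: str
    # ID of the backing universal logical switch, or None for a
    # site-local network that would still need an explicit mapping.
    universal_switch_id: Optional[str] = None


def implicit_network_mappings(protected, recovery):
    """Pair port groups that are backed by the same universal logical switch."""
    recovery_by_uls = {
        pg.universal_switch_id: pg
        for pg in recovery
        if pg.universal_switch_id is not None
    }
    mappings = {}
    for pg in protected:
        if pg.universal_switch_id is None:
            continue  # site-local network: no implicit mapping
        match = recovery_by_uls.get(pg.universal_switch_id)
        if match is not None:
            mappings[pg.name] = match.name
    return mappings


# Made-up inventories for the two sites.
protected_site = [
    PortGroup("pg-web-a", "protected", "uls-web"),
    PortGroup("pg-mgmt-a", "protected"),
]
recovery_site = [
    PortGroup("pg-web-b", "recovery", "uls-web"),
    PortGroup("pg-mgmt-b", "recovery"),
]

print(implicit_network_mappings(protected_site, recovery_site))
# {'pg-web-a': 'pg-web-b'}
```

Only the ULS-backed pair is mapped; the site-local management port groups are left for an explicit mapping.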

The deployment process would ideally be to:

  • Configure the primary NSX Manager at the primary site and a secondary NSX Manager at the recovery site
  • Configure Universal Distributed Logical Router between primary and secondary site
  • Deploy Universal Logical Switch between primary and recovery site and connect it to Universal Distributed Logical Router
  • Deploy the vRealize Orchestrator (vRO) plugin for automation and monitoring
  • Finally map SRM network resources between primary and recovery sites
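
The order of the steps above matters because later steps build on earlier ones. As a rough illustration, the sketch below encodes the five steps as a checklist with prerequisites and reports which steps can be tackled next. The step keys and the dependency edges are assumptions inferred from the list; this is not a VMware tool or API.

```python
# Hypothetical checklist: each step maps to the steps it depends on.
DEPLOYMENT_STEPS = {
    "nsx_managers": [],                     # NSX Managers at both sites
    "universal_dlr": ["nsx_managers"],      # Universal Distributed Logical Router
    "universal_switch": ["universal_dlr"],  # ULS connected to the router
    "vro_plugin": ["nsx_managers"],         # automation and monitoring
    "srm_network_mapping": ["universal_switch", "vro_plugin"],
}


def next_steps(completed):
    """Return the steps whose prerequisites have all been completed."""
    done = set(completed)
    return [
        step
        for step, prereqs in DEPLOYMENT_STEPS.items()
        if step not in done and all(p in done for p in prereqs)
    ]


print(next_steps([]))
print(next_steps(["nsx_managers"]))
```

With nothing completed, only the NSX Manager step is available; once it is done, the router and the orchestrator plugin can proceed in either order.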

Supported Use Cases and Deployment Architectures

The primary use case is full-site disaster recovery, or an unplanned outage, where the primary site goes down due to a disaster and the secondary site takes over immediately to maintain business continuity. The other key use case is planned datacenter migration, where workloads are migrated from one site to another while maintaining the underlying networking and security profiles. The main difference between the two use cases is the frequency of the synchronization runs. In a datacenter migration you can take one datacenter running NSX and reproduce its entire networking configuration on the DR side in a single run of the synchronization workflow, or run it once initially and then a second time to incrementally update the NSX objects before cutover.

DRajan 4

Other supported use cases include partial site outages and preventive failover, for example when you anticipate a datacenter outage from impending events such as hurricanes, floods or forced evacuation.

The standard 1:1 deployment model, with one site as primary and another as secondary, is the most commonly deployed model. In a shared recovery site configuration, such as for branch offices, you install one SRM Server instance and NSX on each protected site. On the recovery site, you install multiple SRM Server instances, one to pair with each SRM Server instance on the protected sites. All of the SRM Server instances on the shared recovery site connect to the same vCenter Server and NSX instance. You can consider the owner of an SRM Server pair to be a customer of the shared recovery site. You can use array-based replication, vSphere Replication or a combination of both when you configure an SRM Server to use a shared recovery site.
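
The shared recovery site layout can be summarized as a simple data model. The sketch below is an illustration only; the class, the site names and the hostnames are invented for the example. It checks the two properties described above: each protected-site SRM instance has its own partner on the recovery site, and all recovery-side instances point at the same vCenter Server and NSX instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrmInstance:
    """Hypothetical record describing one SRM Server instance."""
    name: str
    site: str
    vcenter: str
    nsx_manager: str


def validate_shared_recovery(pairs):
    """pairs: list of (protected_instance, recovery_instance) tuples."""
    recovery = [r for _, r in pairs]
    # One dedicated recovery-side SRM instance per protected-site instance.
    one_to_one = len({r.name for r in recovery}) == len(pairs)
    # All recovery-side instances share one vCenter Server and NSX instance.
    shared_backend = len({(r.vcenter, r.nsx_manager) for r in recovery}) == 1
    return one_to_one and shared_backend


# Two branch offices sharing one recovery site (made-up names).
pairs = [
    (SrmInstance("srm-branch1", "branch1", "vc-branch1", "nsx-branch1"),
     SrmInstance("srm-dr-1", "dr", "vc-dr", "nsx-dr")),
    (SrmInstance("srm-branch2", "branch2", "vc-branch2", "nsx-branch2"),
     SrmInstance("srm-dr-2", "dr", "vc-dr", "nsx-dr")),
]

print(validate_shared_recovery(pairs))  # True
```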

Figure: Logical Disaster Recovery Architecture Using NSX Universal Objects

What Deployment Architecture Will the Solution Support?

This solution applies to both greenfield and brownfield environments. The infrastructure needs to be baselined to vCenter 6.0 or later, ESXi 6.0 or later, vSphere Distributed Switch, and SRM 6.0 or later with NSX 6.2 or later.

SRM can be used for different failover scenarios: Active-Active, Active-Passive, Bidirectional, and Shared Recovery.
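
A quick way to sanity-check an environment against this baseline is to compare version numbers. The sketch below uses the minimums quoted above; the `inventory` dictionary and its keys are hypothetical, and in practice the versions would come from your own inventory tooling.

```python
# Minimum versions taken from the paragraph above.
BASELINE = {
    "vcenter": (6, 0),
    "esxi": (6, 0),
    "srm": (6, 0),
    "nsx": (6, 2),
}


def parse_version(text):
    """Turn '6.2.1' into (6, 2, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def below_baseline(inventory):
    """Return the components that are missing or older than the baseline."""
    failures = []
    for component, minimum in BASELINE.items():
        found = inventory.get(component)
        if found is None:
            failures.append(component)
            continue
        version = parse_version(found)
        # Pad short versions so that "6" is treated as "6.0".
        version += (0,) * (len(minimum) - len(version))
        if version < minimum:
            failures.append(component)
    return failures


# Made-up environment with an ESXi host that is too old.
inventory = {"vcenter": "6.0.0", "esxi": "5.5", "srm": "6.1", "nsx": "6.2.2"}
print(below_baseline(inventory))  # ['esxi']
```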

Integrated Solution Advantages

Disaster recovery planning, maintenance and testing become much simpler, with automation enabling significant operational efficiencies.

  • The ability to create a network that spans vCenter boundaries creates a cross-site Layer-2 network, which means that after failover, it is no longer necessary to re-configure IP addresses. Not having to re-IP recovered virtual machines can further reduce recovery time by up to 40 percent.
  • There is more automation with networking and security objects. Logical switching, logical routing, security policies (such as security groups), firewall settings and edge configurations are also preserved on recovered virtual machines, further decreasing the need for manual configurations post-recovery.
  • Making an isolated test network with the same capabilities as the production environment becomes much easier.

In conclusion, the integration of NSX and SRM greatly simplifies operations, lowers operational expenses, increases testing capabilities and reduces recovery times.

For more information on NSX visit: http://www.vmware.com/products/nsx/

For more information on SRM visit: http://www.vmware.com/products/site-recovery-manager/

For more information on VMware Professional Services visit: http://www.vmware.com/consulting/

 


About the Author:

Dharma Rajan is a Solution Architect in the Professional Services Organization specializing in pre-sales for SDDC and driving NSX technology solutions to the field. His experience spans Enterprise and Carrier Networks. He holds an MS degree in Computer Engineering from NCSU and an M.Tech degree in CAD from IIT.

BCDR Strategy: Three Critical Questions

By Jeremy Carter, VMware Senior Consultant

Organizations in every industry are increasingly dependent on technology, making increased resiliency and decreased downtime a critical priority. In fact, Forrester cites resiliency as the number three overall infrastructure priority this year.

A business continuity solution that utilizes the virtual infrastructure, like the one VMware offers, can greatly simplify the process, though IT still needs to understand how all the pieces of their business continuity and disaster recovery (BCDR) strategy fit together.

I often run up against the expectation of a one-size-fits-all BCDR solution. Instead it’s helpful to understand the three key facets of IT resilience—data protection, local application availability, and site application availability—and how different tools protect each one, for both planned and unplanned downtime (see the diagram below). If you’d like to learn more on that front, there is a free two-part webcast coming up that I recommend you sign up for here.

As important as it is to find the right tool, you only know a tool is “right” if it meets a set of clearly defined business objectives. That’s why I recommend that organizations start their BCDR planning with a few high-level questions to help them assess their business needs.

1. What is truly critical?

Almost everyone’s initial response is that they want to protect everything, but when you look at the trade-off in complexity, you’ll quickly recognize the need to prioritize.

An important (and sometimes overlooked) step in this decision-making process is to check in with the business users who will be affected. They might surprise you. For instance, I was working with a government organization where IT assumed everything was super critical. When we talked to the business users, it turned out they had all of their information on paper forms that would then be entered into the computer. If the computer went down, they would lose almost no data.

On the other hand, the organization’s 911 center’s data was extremely critical and any downtime or loss of data could have catastrophic consequences. Understanding what could be deprioritized allowed us to spend the time (and money) properly protecting the 911 center.

As we move further into cloud computing, another option is emerging: Let the application owners decide at deployment. With tools like vCloud Automation Center (vCAC), we can define resources with differing service levels. An oil company I recently worked with integrated SRM with vCAC so that any applications deployed into Gold or Silver tiers would be protected by SRM.
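
As a rough illustration of that tier-based approach, the sketch below decides at deployment time whether an application should be placed under SRM protection. The Gold and Silver tiers come from the example above; the Bronze tier, the application names, the function and the data layout are hypothetical and do not reflect the vCAC or SRM APIs.

```python
# Tiers whose workloads are protected by SRM (from the example above).
SRM_PROTECTED_TIERS = {"gold", "silver"}


def needs_srm_protection(tier):
    """Return True when an app deployed into this tier should be protected."""
    return tier.strip().lower() in SRM_PROTECTED_TIERS


# Made-up deployments; the application owner picks the tier.
deployments = {"billing-db": "Gold", "intranet-wiki": "Bronze", "crm": "Silver"}
protected = sorted(
    app for app, tier in deployments.items() if needs_srm_protection(tier)
)
print(protected)  # ['billing-db', 'crm']
```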

2. Which failures are you preventing?

Each level of the data center has its preferred method of protection, although all areas also need to work together. If you’re concerned about preventing failures within the data center, maybe you rely on HA and App HA; however, if you want to protect the entire datacenter, you’ll need SRM and vSphere Replication (again, see chart).

3. RTO, RPO, MTD?

Another helpful step in choosing the best BCDR strategy is to define a recovery time objective (RTO), recovery point objective (RPO) and maximum tolerable downtime (MTD) for both critical and non-critical systems.

These objectives are often dictated by a contract or legal regulations that require a certain percentage of uptime. When established internally, they should take many factors into account, including if data exists elsewhere and the repercussions of downtime, especially financial ones.
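
To make the three objectives concrete, the sketch below checks one system against them. All numbers are made up for illustration, and the function is a simplification: it treats the replication interval as the worst-case data loss and assumes the RTO should fit within the MTD.

```python
from dataclasses import dataclass


@dataclass
class Objectives:
    rto_minutes: int  # recovery time objective
    rpo_minutes: int  # recovery point objective
    mtd_minutes: int  # maximum tolerable downtime


def check_objectives(obj, replication_interval_minutes, tested_recovery_minutes):
    """Return a list of problems; an empty list means every objective is met."""
    problems = []
    if obj.rto_minutes > obj.mtd_minutes:
        problems.append("RTO exceeds MTD")
    if replication_interval_minutes > obj.rpo_minutes:
        problems.append("replication interval cannot meet RPO")
    if tested_recovery_minutes > obj.rto_minutes:
        problems.append("tested recovery time misses RTO")
    return problems


# Hypothetical critical system: 1 h RTO, 15 min RPO, 4 h MTD.
critical = Objectives(rto_minutes=60, rpo_minutes=15, mtd_minutes=240)

# Replicating every 30 minutes cannot satisfy a 15-minute RPO.
print(check_objectives(critical, replication_interval_minutes=30,
                       tested_recovery_minutes=45))
```

Running the same check for non-critical systems with looser objectives shows quickly where cheaper protection is good enough.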

The final step in the implementation of any successful IT strategy is not a question, but rather an ongoing diligence. Remember that your BCDR strategy is a living entity—you can’t just set it and forget it. Every time you make a change to the infrastructure or add a new application, you’ll need to work it into the BCDR plans. But I hope that each update will be a little easier now that you know the right questions to ask.


Want to learn more about building out a holistic business continuity and disaster recovery strategy?
Join these two great (free) webcasts that are right around the corner.

Implementing a Holistic BC/DR Strategy with VMware – Part One
Tuesday, February 18 – 10 a.m. PST

Technical Deep Dive – Implementing a Holistic BC/DR Strategy with VMware – Part Two
Tuesday, February 25 – 10 a.m. PST


Jeremy Carter is a VMware Senior Consultant with special expertise in BCDR and cloud automation. Although he joined VMware just three months ago, he has worked in the IT industry for more than 14 years. 

Virtualize SAP – Risky or Not?

By Girish Manmadkar, VMware Professional Services Consultant

In years past, some IT managers were not ready to talk about virtualizing SAP due to technical and political reasons. The picture is very different today, in part because of the increased emphasis on IT as a strategic function on the road to the ‘Software-Defined Data Center’ (SDDC).

Virtualization and the road to SDDC expand the cost and operational benefits of server virtualization to all data center infrastructure: network, security, storage, and management. For example, peak workloads such as running consolidated financial reports are handled much more effectively, thanks to streamlined provisioning. Integrating systems after company acquisitions is more easily managed due to the flexibility offered by virtualized platforms. And finally, customers are leveraging their virtualized SAP environment to add capabilities such as enhanced disaster recovery/business continuity or chargeback systems.

Many customers have been realizing virtualization benefits ever since they moved their SAP production workloads to the VMware platform. As IT budgets continue to shrink, the imperative to lower operating costs becomes more urgent, and virtualization can make a real difference. Server consolidation through virtualization translates directly into lower costs for power, cooling, and space, and boosts the organization’s “green” profile in the bargain.

Organizations Benefit from Virtualizing SAP

The main requirement for any IT manager supporting an SAP environment is to ensure high availability: even a few minutes of downtime can cost money, not to mention angry phone calls from executive management and frustrated users. VMware virtualization takes advantage of SAP’s high-availability features to keep the SAP software running without interruption, and helps keep those phone lines quiet.

Greenfield SAP deployments are a great way to build the environment from the ground up using a building-block approach. You will start seeing the benefits of flexibility, scalability and availability in the newly built environment on VMware.

Upgrades come in two scenarios:

  • A. SAP hardware refresh cycle
  • B. An SAP Application and/or database upgrade

Upgrades are a part of every SAP landscape; they can be complex and require long-term effort. Most of my customers who run SAP upgrades in a standard physical environment spend many man-hours, or even days, on provisioning, and that is if they have the hardware at their disposal. In a virtual environment, however, provisioning is rapid and can be executed in minutes, including the deprovisioning that returns resources to the resource pool, which makes the upgrade process that much more streamlined and efficient. An SAP upgrade is a very time- and cost-sensitive project, so it is important to provide the required resources to the development team in a timely manner.

Time to Move

Let’s say that you’ve decided to virtualize your SAP environment—now the question is timing. I have seen many customers take the SAP upgrade and/or platform or hardware refresh as possible opportunities to move to the virtual platform.

A planned SAP upgrade can be a good time to move. I have seen some customers cash in on the planned move to SAP NetWeaver & other add-ons to virtualize their entire SAP landscapes—with savings of more than half of their capital expenses.

A hardware refresh is a great time to move. Many customers take advantage of the change in hardware to consider a migration to virtualization at the same time. It allows customers to integrate the hardware refresh and virtualization projects to minimize disruptions and combine staff training for new hardware and software.

SAP Requirements: Security, Compliance and Disaster Recovery

Challenges like compliance and security policies often require substantial infrastructure changes that can highlight the inherent inflexibility of the existing traditional hardware platform and persuade top management to invest in infrastructure. Many customers have successfully implemented VMware-provided solutions to ensure the security and compliance of their SAP environment so that they can experience the benefits of virtualization.

Disaster Recovery
A business continuity plan is imperative for many of our SAP customers. A disaster, whether natural or man-made, severely impacts operations, which in turn impacts the bottom line. This, of course, is why executives often order a review of the company’s disaster recovery/business continuity plans. VMware understands this importance and addresses the risk with its Site Recovery Manager product.

So is virtualizing the platform for your SAP environment too risky? All IT projects have risk. Is it so risky that you should pass up the benefits of virtualization? In my opinion, no – not if you follow the advice and methodology offered by my colleagues, David Gallant (Business as usual with Tier 1 Business Critical Applications? – Not!) and Eiad (Knowing Your Applications is Key to Successful Data Center Transformation). I ask you – if you haven’t already virtualized your SAP environment, why not explore it now? There have been so many advances in technology and alliances that you can’t ignore it any longer.

Girish Manmadkar is a veteran VMware SAP Virtualization Architect with extensive knowledge and hands-on experience with various SAP and VMware products, including various databases. He focuses on SAP migrations, architecture designs, and implementation, including disaster recovery.