VMware


12/22/2008

VMware Site Recovery Manager - "From general release to Update1, what have we learnt and what's new?"

In this post we will focus on the activity that has surrounded VMware Site Recovery Manager (SRM) since its launch. According to our own download page, SRM build 97878 has been available since 2008/06/19, so what has been happening with SRM in the field since that date? The short answer is: a lot! SRM has been written about in many places, VMware evangelists (yes, that’s you Mike!) now have books dedicated to SRM administration, and SRM has been a big draw at the VMware stands at the various shows we have put on or attended during 2008.


As one of VMware's technical folks, my main interest is what has been good, bad or plain ugly from an implementation point of view. I spend a lot of my time assisting customers and partners with their SRM deployments, configuration woes and the like, so I wanted to give you all a quick rundown of some of the key gotchas as well as some pointers to useful references for SRM help. Going forward I hope to post further blog updates focusing on specific topics of SRM implementation, including closer looks at networking, storage replication integration, sample architectures and customizing recovery plans, to name a few. With that said, let's get started on some of the common questions that come up.

 

Q.     We have installed SRM but cannot see any SRM screens inside the vCenter Client?

To make the SRM icon and screens available you must download/install the SRM plugin via the vCenter Client “Manage Plugins” menu. Your vCenter userid will also need the appropriate privileges to be able to work with SRM.



Q.     Where should we install SRM (and the SRA)?

o    In a VM?

o    On the same VM as vCenter?

o    Should the SRM database reside alongside the vCenter db?

o    Can the SRM database be of a different type, e.g. Oracle?

It depends on a lot of factors, some of which are listed below. For POC/evaluation and test environments, most customers deploy the two SRM servers (and their databases) alongside their vCenter servers, within the same virtual machines. For production environments, the reality of day-to-day operational processes will probably mean the SRM server and SRA (Storage Replication Adapter) are installed alongside each other in their own virtual machine, which makes tasks such as raising change requests for maintenance or patching more straightforward. Other factors include:

o    Size of the VI environment (number of ESX hosts / number of VMs):

     o    With a small number of hosts and VMs, customers often deploy SRM in the same VM as vCenter, as the vCenter server is typically lightly loaded in this configuration.

     o    With a larger number of hosts and VMs, customers tend to install the SRM components in a separate VM.

o    Type of SRA being used, e.g. does your SRA need access to “admin” LUN(s) to communicate with the storage?

Q.     We download the SRA for our storage platform from vmware.com and install it, with no other checks needed. Is that correct, do we just go ahead and install?

Not quite. Each vendor provides a readme, and you must review it first. Each storage vendor also generally supplies a whitepaper or technote covering best-practice implementation for setting up their adapter (SRA), so make sure you seek these out! Links to documentation from storage vendors can be found on the SRM resources page:  http://www.vmware.com/products/srm/resource.html

 

Q.     Do all of the SRA adapters communicate with their respective storage arrays in the same way?

No. Again, each vendor’s architecture is different. For connectivity, some require the installation of a client-side remote command suite (provided by the storage vendor) and some don’t. Review your storage vendor’s readme and implementation guide and, if you have one, speak to your storage team. Don’t forget the SRAs are supported by the storage vendors, so if you do have issues with the adapters you can raise support requests with your storage vendor, assuming you have a valid support contract.


Q.     Our Storage Replication Adapter (SRA) is installed correctly and all seems OK, however no datastores appear in the datastore groups screen?

If the replicated VMFS datastores are empty, i.e. contain no VMs, then the datastores will not appear. Add VMs to the datastore(s) and use the rescan arrays button to update the view.



Q.     When creating a protection group, SRM prompts for a datastore location to house “Placeholder VMs”. What are these used for?

Placeholder virtual machines are used to identify the location of the recovered VM within the recovery site vCenter inventory. SRM will replace the placeholder VM with the VM registered from the replicated storage during testing / failover.


Q.     During the install process port 80 is defined as the communication port for vCenter, can this be changed?

Even though SRM uses SSL when it communicates to vCenter, it does not use port 443. SRM establishes a TCP connection to port 80, then uses an HTTP CONNECT request to establish a tunnel to the vCenter server, then does an SSL handshake with vCenter over that tunnelled connection. The SRM installation enforces these semantics.
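
For anyone who wants to picture that sequence, here is a minimal Python sketch of the general "plain TCP, then HTTP CONNECT, then SSL handshake" pattern described above. It is an illustration only, not SRM's own code: the function names are hypothetical, and the target named in the CONNECT line (the same host on port 443) is an assumption, since this post does not say what SRM actually sends.

```python
import socket
import ssl


def build_connect_request(host: str, port: int) -> bytes:
    """Return an HTTP CONNECT request asking the listener to open a tunnel."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def open_tunnelled_ssl(server: str, listen_port: int = 80,
                       target_port: int = 443,  # assumption, see lead-in
                       timeout: float = 10.0) -> ssl.SSLSocket:
    """Connect in plain TCP, request a tunnel, then handshake over the tunnel."""
    sock = socket.create_connection((server, listen_port), timeout=timeout)
    try:
        sock.sendall(build_connect_request(server, target_port))

        # Read the reply headers up to the blank line that ends them.
        reply = b""
        while b"\r\n\r\n" not in reply:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("connection closed before tunnel reply")
            reply += chunk

        # A 2xx status means the tunnel is open.
        status_line = reply.split(b"\r\n", 1)[0]
        parts = status_line.split()
        if len(parts) < 2 or not parts[1].startswith(b"2"):
            raise ConnectionError(f"tunnel refused: {status_line!r}")

        # The SSL handshake now runs over the already-open TCP connection.
        context = ssl.create_default_context()
        return context.wrap_socket(sock, server_hostname=server)
    except Exception:
        sock.close()
        raise
```

The point of the sketch is that only one TCP connection is ever opened, and it is opened to port 80; the encryption starts afterwards, inside the tunnel.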

 

Q.     Which datastore should be selected to hold the placeholder VMs? What to consider?

The first recommendation is to keep all of the placeholder virtual machines in the same datastore at the recovery site. That makes them easier to find should you need to, for example when troubleshooting.


Having a small datastore set aside for use only as the SRM placeholder virtual machine datastore also means you are not placing placeholders in datastores at the recovery site that contain virtual machines residing at that site permanently. vCenter users who are not authorized to use SRM, or not familiar with it, may find it confusing should they stumble across a placeholder virtual machines folder in a datastore normally used for other virtual machines. Other factors to consider:

o    Datastore needs to reside at the recovery site.

o    Datastore does not need to be replicated.

o    Sizing: the datastore will only contain VM config files (*.vmx, *.vmxf, *.vmsd), typically 3 files of < 1 KB each.
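
As a rough worked example of the sizing point above, the Python sketch below multiplies out the figures from the list (3 config files per placeholder, each under 1 KB). The function name is hypothetical, and the result is only an upper bound on the raw file data: it ignores filesystem allocation granularity and metadata, so real usage on a datastore will be higher.

```python
FILES_PER_PLACEHOLDER = 3  # .vmx, .vmxf and .vmsd, as listed above
MAX_FILE_KB = 1            # each file is typically under 1 KB


def placeholder_space_kb(vm_count: int) -> int:
    """Upper bound on raw config-file data for vm_count placeholder VMs, in KB."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    return vm_count * FILES_PER_PLACEHOLDER * MAX_FILE_KB


print(placeholder_space_kb(500))  # 1500
```

In other words, even several hundred placeholders amount to very little raw data, which is why a small dedicated datastore is enough.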

Q.     Which vCenter object is SRM enabled on: Host, Cluster or Resource Pool?

In SRM the basic unit of replication is the datastore. Recovered VMs can be placed on arbitrary hosts/clusters, as long as the hosts can access the replicated datastores.


Q.     Do all the VMs we are protecting need to be in a cluster?

No. SRM only requires separate vCenter instances: one managing the protected site and the other managing the recovery site.


Q.     For failover how does SRM guarantee resources at the recovery site?

SRM can suspend local VMs at the recovery site as part of a recovery plan. Best practice is to also use resource pools at the protected site and map these to resource pools at the recovery site using SRM “Inventory Mapping”.


Q.     We see the “Recompute Datastore Group” task run periodically within vCenter since we installed SRM. What triggers these tasks?


Datastore Group computation is triggered by the following events:

o    Existing VM is deleted or unregistered

o    VM is migrated with Storage VMotion to a different datastore

o    New disk is attached to VM on a datastore previously not used by the VM

o    New datastore is created

o    Existing datastore is expanded
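
If you are trying to work out why a recompute task has just appeared, the list above can be written down as a simple lookup. This Python sketch is only a way of organising the list: the event names are hypothetical labels, not vCenter or SRM event identifiers, and VM_POWERED_ON is included purely as an example of an event that is not on the list above.

```python
from enum import Enum, auto


class InventoryEvent(Enum):
    VM_DELETED_OR_UNREGISTERED = auto()
    VM_STORAGE_VMOTIONED = auto()
    DISK_ATTACHED_ON_NEW_DATASTORE = auto()
    DATASTORE_CREATED = auto()
    DATASTORE_EXPANDED = auto()
    VM_POWERED_ON = auto()  # example of an event not in the list above


# The five triggers listed in this post.
RECOMPUTE_TRIGGERS = frozenset({
    InventoryEvent.VM_DELETED_OR_UNREGISTERED,
    InventoryEvent.VM_STORAGE_VMOTIONED,
    InventoryEvent.DISK_ATTACHED_ON_NEW_DATASTORE,
    InventoryEvent.DATASTORE_CREATED,
    InventoryEvent.DATASTORE_EXPANDED,
})


def triggers_recompute(event: InventoryEvent) -> bool:
    """Return True if the event is one of the listed recompute triggers."""
    return event in RECOMPUTE_TRIGGERS
```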


Q.     Occasionally when we log in to the SRM screens we see the site pairing status displayed as “Low Resources on Paired Site”. What causes this?


The “Low Resources…” message can be generated if any of the following conditions are true on the server (VM) where SRM is installed:

 

o    Remote site free disk space drops below 100 MB (default)

o    Remote site CPU usage goes above 70% (default)

o    Remote site available memory drops below 32 MB (default)

 

These are default values which can be configured by modifying the vmware-dr.xml file located in C:\Program Files\VMware\VMware Site Recovery Manager\config. The fields to modify are minDiskSpace, maxCpuUsage, and minMemory.
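
To make the three conditions concrete, here is a small Python sketch of the check as described above, using the default values from this post. It is not SRM's implementation: the class and function names are hypothetical, the attribute names simply mirror the minDiskSpace, maxCpuUsage and minMemory fields, and the sketch does not read vmware-dr.xml because the file's layout is not covered here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Thresholds:
    min_disk_space_mb: float = 100.0  # minDiskSpace default
    max_cpu_usage_pct: float = 70.0   # maxCpuUsage default
    min_memory_mb: float = 32.0       # minMemory default


def low_resource_reasons(free_disk_mb: float, cpu_usage_pct: float,
                         free_memory_mb: float,
                         limits: Thresholds = Thresholds()) -> List[str]:
    """Return which of the three conditions would raise the warning, if any."""
    reasons = []
    if free_disk_mb < limits.min_disk_space_mb:
        reasons.append("disk")
    if cpu_usage_pct > limits.max_cpu_usage_pct:
        reasons.append("cpu")
    if free_memory_mb < limits.min_memory_mb:
        reasons.append("memory")
    return reasons


print(low_resource_reasons(500, 85, 64))  # ['cpu']
```

Any one condition is enough to produce the message, so a busy CPU on an otherwise healthy SRM server will still show the warning.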


Q.     What are the SRM failback options? We see no button for failback, which is confusing us.

 

SRM absolutely supports failback, and each storage vendor documents the failback process for their specific replicated storage configuration. What you have to consider is that without SRM in your virtual environment you are back to manual and/or home-grown scripts for DR: you will no longer have automated Recovery Plans, offline DR testing capabilities or a DR audit trail. You can still fail back manually without using SRM; the high-level steps would be:

o    Delete the protection groups in the Protected Site vCenter

o    Unregister the protected virtual machines in the Protected Site vCenter

o    Work with your storage team to reverse data replication

o    Re-inventory the VMs in the Protected Site vCenter, then restart and re-IP them (manually or scripted)

 

With SRM in place you will have Recovery Plan(s), the ability to test failover before recovery, and a built-in audit trail. SRM can also be used to help you fail back once your primary site has been restored. The high-level steps would be:

o    Delete the protection groups in the Protected Site vCenter

o    Unregister the protected virtual machines in the Protected Site vCenter

o    Work with your storage team to reverse data replication

o    Use SRM to complete the SRM workflows in the reverse direction, from the Recovery Site back to the Protected Site

 

Repeat the above steps from the Protected Site back to the Recovery Site to complete the re-protection of the virtual machines in the Protected Site.

 

I hope that has answered a few FAQs. I am sure there will be more to come, but for now, thanks for stopping by!

Lee Dilworth

   




Comments


Great post! I ran into exactly this: I had set up replication but could not see any protected LUNs. I found out myself that there needs to be a VM in that LUN. Cost me a couple of hours ;-)

Tomas
www.tendam.info

