This is a question which popped up time and time again at VMworld 2011, in SRM sessions, Storage DRS sessions and general storage sessions. I made some "educated guesses" about what would need to be done if a VM was Storage vMotion'ed from one replicated datastore to another replicated datastore on the protected site. I guessed that one would have to edit the protection group for the VM that has been migrated and re-protect it. However, the proof is in the pudding, as they say, so I tried a Storage vMotion of a protected VM to a non-replicated datastore in the lab. My lab setup had two VMs residing on one of EMC's Uber-Celerra appliances, which was replicated to another Uber-Celerra appliance at my recovery site:

[Screenshot: Prot-group-ready]

I then initiated the Storage vMotion operation to move one of the VMs to another datastore which was not replicated. First off, you do get a warning if you try to migrate a protected VM, which is good to know:

[Screenshot: Warning]

I chose to ignore this and continued with my Storage vMotion operation. The Storage vMotion operation succeeded, but when it completed, a Recompute Datastore Groups task kicked off. I then went back to my Protection Group, and now I saw that only 1 out of 2 VMs was in an OK state; the other VM was in a Not Configured state and would require editing in order to protect it.

[Screenshot: Prot-group-not-config]

So as you can see, you will need to re-protect VMs which have been Storage vMotion'ed (or moved as part of Storage DRS balancing). This is an additional workflow step for SRM administrators.
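To make that extra step concrete, here is a minimal sketch of the kind of check an administrator might run after storage migrations. It does not call SRM or vCenter; the `ProtectedVM` record, the `vms_needing_reprotect` function and the datastore names are all hypothetical, and the inventory is assumed to have been gathered by some other means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectedVM:
    """Hypothetical inventory record; not an SRM or vCenter object."""
    name: str
    protected_datastore: str  # datastore recorded when the VM was protected
    current_datastore: str    # datastore the VM resides on now


def vms_needing_reprotect(vms):
    """Return the names of VMs that have moved since they were protected."""
    return [vm.name for vm in vms
            if vm.current_datastore != vm.protected_datastore]


# Illustrative data only: vm-02 was migrated to a different datastore.
inventory = [
    ProtectedVM("vm-01", "replicated-ds-01", "replicated-ds-01"),
    ProtectedVM("vm-02", "replicated-ds-01", "local-ds-02"),
]
needs_reprotect = vms_needing_reprotect(inventory)
print(needs_reprotect)
```

Any VM reported by a check like this would be a candidate for the manual re-protect step described above.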

But what about supportability?

I put the question about supporting SRM with Storage vMotion & Storage DRS directly to the Site Recovery Manager (SRM) Product Managers. This is what they had to say:

We know of specific cases where DR plans can be compromised during storage movement. As such, for SRM 5, we will be officially declaring no support for either Storage vMotion or Storage DRS. This will be clearly stated in the release notes for SRM 5. This is absolutely not ideal, but given the current state of the products, we have to be clear that this functionality is not supported in order to a) protect our customers, and b) ensure that GSS is not required to support situations where (worst case scenario) recovery is not possible in a disaster situation.

How could DR plans become compromised?

Well, suppose a customer enables Storage DRS in fully automated mode on the protected site. At 3am, Storage DRS decides that it needs to balance the datastores (either for space or for I/O load) and Storage vMotions a VM to a different datastore. That VM is no longer protected by SRM. Let's say that at 4am, there is a disaster at the protected site. SRM does its thing and fails over to the recovery site. Unfortunately, not all the VMs are recovered, because some of them were migrated to different datastores at the protected site and were left in an unprotected state. This is not a nice situation to be in during a disaster.
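The timeline in this scenario can be sketched in a few lines. This is only an illustration of the reasoning, under the assumption that a migrated VM stays unprotected until someone re-protects it; the `unprotected_at` function, the VM names and the timestamps are all made up for the example.

```python
from datetime import datetime


def unprotected_at(disaster_time, migrations, reprotections):
    """Return VMs that were migrated before the disaster and not re-protected in time.

    migrations and reprotections map a VM name to a datetime (hypothetical data).
    """
    exposed = []
    for vm, moved_at in migrations.items():
        if moved_at > disaster_time:
            continue  # the move happened after the disaster, so it is irrelevant
        fixed_at = reprotections.get(vm)
        # Exposed if never re-protected, or not re-protected between move and disaster.
        if fixed_at is None or not (moved_at <= fixed_at <= disaster_time):
            exposed.append(vm)
    return sorted(exposed)


# Illustrative timeline: vm-02 is moved at 3am, the disaster strikes at 4am.
migrations = {
    "vm-02": datetime(2011, 9, 1, 3, 0),
    "vm-03": datetime(2011, 9, 1, 1, 0),
}
reprotections = {"vm-03": datetime(2011, 9, 1, 1, 30)}
exposed = unprotected_at(datetime(2011, 9, 1, 4, 0), migrations, reprotections)
print(exposed)
```

In this made-up timeline, vm-03 was re-protected shortly after its move, while vm-02 was not, so only vm-02 would be missing from the recovery.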

Could I not just run Storage DRS in manual mode?

Yes, you could, but even if you are running Storage DRS in manual mode, after the VM has moved to another datastore, you still have to manually re-protect it. So there is a period of time there as well where the VM is unprotected. Also, you need to ensure that all datastores in the datastore cluster are replicated. But since Storage DRS has no idea whether the datastores are replicated or not, if you make a mistake when populating the datastore cluster, Storage DRS will happily place a VM onto a non-replicated datastore. Finally, there will be customers who will put Storage DRS into automatic mode, and when something goes wrong with the SRM failover, VMware's GSS will still be expected to pick up the pieces, even though we explicitly called out that this was unsupported. It's for these reasons that we have decided not to support interoperability between Storage DRS & SRM, even if Storage DRS is in manual mode.
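As a rough illustration of the datastore cluster concern, the check below compares the members of a datastore cluster against the list of replicated datastores. It is a sketch under stated assumptions: both lists are supplied by hand (for example from your array documentation), and `non_replicated_members` and the datastore names are hypothetical rather than part of any VMware API.

```python
def non_replicated_members(cluster_datastores, replicated_datastores):
    """Return cluster members that do not appear in the replicated set."""
    return sorted(set(cluster_datastores) - set(replicated_datastores))


# Illustrative data only: ds-c was added to the cluster by mistake.
cluster = ["ds-a", "ds-b", "ds-c"]
replicated = {"ds-a", "ds-b"}
offenders = non_replicated_members(cluster, replicated)
print(offenders)
```

A non-empty result would mean Storage DRS could place a VM on a datastore that is not replicated. Even an empty result does not remove the need to re-protect a VM after it moves.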

What about vSphere Replication?

vSphere Replication is a new feature of SRM 5.0, and allows Virtual Machines to be protected across different sites without the need for storage array replication. However, with vSphere Replication, you still need to specify a datastore at the protected site and the recovery site when configuring Virtual Machines for protection. Because of this, all the same concerns that I raised about array based replication previously are also applicable to vSphere Replication. Here is a screenshot of where the datastores must be specified when configuring vSphere Replication:

[Screenshot: Vr-setup-1]

My understanding therefore is that neither Storage vMotion nor Storage DRS can be used with Array Based Replication or vSphere Replication in Site Recovery Manager.

What are VMware doing to address this going forward?

We are tracking full interoperability of Storage vMotion & Storage DRS with SRM as a high priority item. We cannot state when we will have this interoperability however.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

About the Author

Cormac Hogan

Cormac Hogan is a Senior Staff Engineer in the Office of the CTO in the Storage and Availability Business Unit (SABU) at VMware. He has been with VMware since April 2005 and has previously held roles in VMware’s Technical Marketing and Technical Support organizations. He has written a number of storage related white papers and has given numerous presentations on storage best practices and vSphere storage features. He is also the co-author of the “Essential Virtual SAN” book published by VMware Press.