Are you one of those who cannot live with the manual nature of the failback operation in SRM 4?  Do you need that Failback button you have heard about in SRM 5?  And you didn’t find it, did you?  I will help you with that in this post, and hopefully it all makes sense.  You will soon be doing a simplified and automated failover back to the original site, so no worries!

BTW, again I have issues with the size of my screen shots.  If you don't see the edge of the pictures below you can click on them for the full image.

When you are in the main Recovery Plans part of the SRM UI you see something like the following.

  Rpstatus copy

A close up on the buttons in the top right shows there is no failback button – or is there?


It is important to understand that failback is a marketing term, and not a feature.

Failback requires you to first recover on the recovery side.  Then you reprotect, which means you are protected again, but in the reverse direction.  Replication now runs in the opposite direction, and you are ready to do a recovery or test, this time towards the original side.

Here is what that looks like:


It is important to note that the Reprotect button above is only clickable once the recovery has completed.  If for some reason your recovery has not completed, you will not be able to reprotect, so try running the recovery again.  It will not recover anything that it has already successfully recovered.
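To make that behavior concrete, here is a minimal Python sketch of it. This is a toy model only, not SRM's API: the `RecoveryPlan` class, its methods, and the VM names are all made up for illustration. It shows a second recovery run picking up only what the first run missed, and Reprotect staying unavailable until everything has been recovered.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryPlan:
    """Toy model of a recovery plan (hypothetical, not SRM's object model)."""
    vms: list
    recovered: set = field(default_factory=set)

    def run_recovery(self, failing=()):
        """Recover every VM not yet recovered; return the VMs handled this run."""
        handled = []
        for vm in self.vms:
            if vm in self.recovered:
                continue  # already recovered on an earlier run, so skip it
            if vm in failing:
                continue  # simulate a step that fails during this run
            self.recovered.add(vm)
            handled.append(vm)
        return handled

    @property
    def reprotect_enabled(self):
        # Reprotect is only available once every VM has been recovered.
        return self.recovered == set(self.vms)


plan = RecoveryPlan(vms=["web01", "db01", "app01"])  # made-up VM names
first_run = plan.run_recovery(failing={"db01"})      # db01 fails the first time
can_reprotect_after_first = plan.reprotect_enabled
second_run = plan.run_recovery()                     # rerun handles only db01
can_reprotect_after_second = plan.reprotect_enabled
```

The second run touches only `db01`, which is why rerunning a partly failed recovery is safe.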

Recovery complete

Now that our virtual machines have failed over to the recovery side, and are in fact running there, we need to reprotect them.

We use the Reprotect button for that.

After we start the Reprotect action, we get to watch the reprotect as it actually happens.  Notice how it reverses the storage replication, and then it protects the VMs again?


Once this is done we are protected once again.  If our original protection was Site A with VMs running on it, we now have our VMs running on Site B.  But if anything happens to Site B we can recover safely on Site A.

Now we can (and should) test things so that when we do a recovery back to the original site it will work properly.

Once our test recoveries are all good, we do a planned migration back to where we came from.

Notice how your data will be synchronized back to the original site as part of this planned migration?  Very handy, as in the past you would have had to work in several different interfaces to manage this!

So in summary, failback means: failover, reprotect, test, and failover.  This will get you to the recovery site and back again!
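If it helps to see that sequence as a tiny state machine, here is a rough Python sketch. It is purely illustrative and does not call SRM in any way: the `Workload` class, its method names, and the site labels are assumptions made up for this example. It only tracks where the VMs run and whether replication is protecting them.

```python
class Workload:
    """Toy model of protected VMs (hypothetical, not an SRM API)."""

    def __init__(self, running_on="Site A", recovery_site="Site B"):
        self.running_on = running_on
        self.recovery_site = recovery_site
        self.protected = True   # replication runs towards recovery_site
        self.tested = False

    def failover(self):
        if not self.protected:
            raise RuntimeError("not protected: reprotect before recovering")
        # The VMs now run on the old recovery site, so the roles swap.
        self.running_on, self.recovery_site = self.recovery_site, self.running_on
        self.protected = False
        self.tested = False

    def reprotect(self):
        if self.protected:
            raise RuntimeError("already protected")
        # Replication is reversed: it now runs towards the new recovery site.
        self.protected = True

    def test(self):
        if not self.protected:
            raise RuntimeError("nothing to test until reprotect has run")
        self.tested = True


def failback(workload):
    """Run the four steps from the summary, recording where the VMs run."""
    trail = [workload.running_on]
    workload.failover()    # recover on the recovery side
    trail.append(workload.running_on)
    workload.reprotect()   # protect in the reverse direction
    workload.test()        # confirm the trip back will work
    workload.failover()    # planned migration back to the original site
    trail.append(workload.running_on)
    return trail


trail = failback(Workload())
```

Calling `failover()` a second time without `reprotect()` in between raises an error in this model, which mirrors why the order of the steps matters.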

Thanks for reading, and make sure to leave comments if you have any questions!

Update 7/21/11 – Just added a short note above mentioning that the Reprotect button is only available if the recovery has successfully completed.