Application and Infrastructure Management

SRM value prop & SRM failback from Chad Sakac

EMC’s Chad Sakac digs into SRM failback (essentially running SRM in reverse, from the Recovery Site back to your original site). Along the way he tackles the value proposition of SRM: after all, many of the required steps can just be scripted, right?

Link: Virtual Geek: A Few Technical Threads – Part 3: SRM Failback.

  1. SRM automates exactly those steps. Automation in a DR situation is everything. Buildings will be burning or sprinklers running, and cellphones will be ringing. It’s not the time for complex manual operations.
  2. Could it be manually scripted? Sure. Who will maintain that script? Traditionally, DR was generally reserved for mainframes and other things deemed "mission critical" enough for expensive Disaster Recovery. In those cases, the environments are VERY static, so the idea of creating a DR plan, refreshing it and testing it once a year at a multi-million dollar cost was reasonable. VMware is different, and SRM brings DR to a whole new use case. This same week, I talked to a customer who is adding 100 VMs a week on their infrastructure. Heck, even if you’re doing 1 a week, will you update that script constantly?
  3. They tested a single VM booting. Yeah! They have 400 VMs today. First of all, who’s going to manually register all those VMs? More importantly, what are the DEPENDENCIES between the VMs? There is a specific start sequence needed, or your entire DR plan will not work. I’m always interested in how IT projects needing cross-domain expertise are hard, because everyone trivializes everyone else’s work or complexity. AD and DNS, then Exchange/SQL Server, then SharePoint, and somewhere in there, your hundreds of other VMs, all in a specific start sequence. Who will figure out the specific start dependencies the first time, and how will that be maintained in this uber-script? SRM helps, and come to AD3500 at VMworld to find out what EMC is doing to make this easy.
  4. They tested booting the VM on an isolated vSwitch. The IP addressing scheme at the remote site was totally different. What will update all the IPs? Update DNS? Do any hosts hardcode IPs rather than use DNS names… anywhere?
  5. The test (including the one they did) is a useless test unless it is an END-TO-END test. Otherwise, you have told your management that you are ready, and when the unthinkable happens you have failed them, you’ve failed your business, and you’ve failed yourself. In other words, a successful "pseudo test" that leads to "we have it figured out" GUARANTEES FAILURE unless you REALLY test.
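
To make the start-sequence point in item 3 concrete, here is a minimal Python sketch of what the "uber-script" would have to get right before it powers on a single VM. It is only an illustration: the VM names and the `DEPENDS_ON` map are made up for this example, and it says nothing about how SRM itself orders a recovery plan. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to turn "X needs Y up first" into boot waves.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (not from any real environment):
# each VM lists the VMs that must already be running before it starts.
DEPENDS_ON = {
    "ad-dns": [],
    "sql": ["ad-dns"],
    "exchange": ["ad-dns"],
    "sharepoint": ["sql", "ad-dns"],
}


def boot_waves(depends_on):
    """Group VMs into waves; a wave may start once all earlier waves are up."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Everything returned here has all of its prerequisites marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(boot_waves(DEPENDS_ON), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Even in this toy form, the hard part is not the sort. It is keeping a map like `DEPENDS_ON` accurate for 400 VMs while new ones arrive every week, which is exactly the maintenance burden Chad is pointing at.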