During a test failover it is important to test your virtual machines without them being visible to the production systems. IP and name conflicts can happen if you are not careful, and they will ruin your day. So isolating those VMs is important, but we need to do it in a way that still lets us test properly.
If you have only one ESX server, you can use the automatically generated test bubble network that connects the VMs together. This private virtual switch lets the VMs talk to each other without the traffic leaving that switch, so the important isolation is preserved. This is the default for a recovery plan, and it is shown as Auto in the Test Network column when you edit the Network part of a recovery plan. You can see this in the screenshot below. You can learn more about this on page 62 of the Admin Guide; the URL for it is below.
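To make the isolation concrete, here is a toy model of the bubble network in Python. It is not SRM code: the `VM` class and both function names are made up for illustration, and the only rule it encodes is the one described here, that traffic on the private virtual switch never leaves the host the switch lives on.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class VM:
    name: str
    host: str  # hypothetical field: the ESX host the VM runs on during the test


def can_talk_on_bubble(a: VM, b: VM) -> bool:
    """The private switch never leaves its host, so only same-host VMs connect."""
    return a.host == b.host


def unreachable_pairs(vms):
    """List the VM pairs that could NOT reach each other on the bubble network."""
    return [
        (a.name, b.name)
        for a, b in combinations(vms, 2)
        if not can_talk_on_bubble(a, b)
    ]


if __name__ == "__main__":
    plan = [VM("web", "esx1"), VM("db", "esx1"), VM("app", "esx2")]
    print(unreachable_pairs(plan))
```

With everything on one host the list comes back empty, which is the single-server case where Auto is all you need. Any pair that shows up in the list is a pair of VMs that would need something other than the private switch to talk during a test.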
When you have multiple hosts and need VMs on each of the hosts to communicate with each other, we cannot use the private virtual switch method because it is too isolated (it does not span hosts). The solution to that is to use an
Category Archives: vCenter Site Recovery Manager
I've been seeing lots of chatter about Site Recovery Manager lately. Books, videos, all sorts of good stuff. Tomas Ten Dam is doing a great job of covering a lot of these as well as doing yeoman's work on his SRM in a Box project; check out his blog for more.
Here are some of the recent docs there: VI:OPS: Community: Availability.
- Steps to setup FalconStor NSS Virtual Appliances for VMware Site Recovery Manager
- Steps To Create a 2-Site SRM Demo Environment on a Laptop
- Steps to setup LeftHand Networks VSA for VMware Site Recovery Manager
- Steps to setup EMC Celerra (iSCSI) for VMware Site Recovery Manager
- Steps to setup EMC Clariions for VMware Site Recovery Manager
- Steps to setup NetApp arrays for VMware Site Recovery Manager
As VMware's Greg Lato says, SRM "is the Easy Button for data center level disaster recovery."
Our new blog Uptime continues to deliver the goods. Two new posts from VMware's own Lee Dilworth cover vCenter Site Recovery Manager from the overview (if you're still trying to get a handle on what SRM does and how it replaces that paper DR plan and set of scripts you've never really fully tested) to the first FAQ and set of tips if you've been trying it out.
Next, Lee's FAQ & tips on SRM:
Link: VMware Site Recovery Manager – "From general release to Update1, what have we learnt and what's new?". Lee goes into a bit more detail, but here's a sample question:
Q: What are the SRM failback options we see no button for failback which is
SRM absolutely supports failback and each
storage vendor documents the failback process for their specific replicated
storage configuration. What you have to consider is that without SRM in your
virtual environment you are back to manual and/or home-grown scripts for DR; you
will no longer have automated Recovery Plans, no offline DR testing
capabilities, and no DR audit trail.
The newest blog from the VMware stables has left the starting gate. The blog, entitled Uptime, will cover business continuity, high availability, and disaster recovery. Welcome!
We continue to see a lot of interest and questions about how to protect VMware environments as well as a lot of excitement about the new and future technologies that VMware has developed and talked about, so we wanted to create a place where we can give you some additional insight into what we’re seeing and working on here at VMware. This blog will focus on products and solutions for business continuity in virtualized environments. We’ll talk about data protection, high availability, and disaster recovery solutions that include VMware Infrastructure and products like VMware Consolidated Backup, High Availability, and Site Recovery Manager.
The first update to Site Recovery Manager has been released.
Here’s the What’s new section from the release notes:
- New Permission Required to Run a Recovery Plan
SRM now distinguishes between permission to test a recovery plan and
permission to run a recovery plan. After an SRM server is updated to
this release, existing users of that server who had permission to run a
recovery plan no longer have that permission. You must grant Run
permission to these users after the update is complete. Until you do,
no user can run a recovery plan. (Permission to test a recovery plan is
unaffected by the update.)
- Full Support for RDM devices
SRM now provides full support for virtual machines that use raw disk
mapping (RDM) devices. This enables support of several new
configurations, including Microsoft Cluster Server. (Virtual machine
templates cannot use RDM devices.)
- Batch IP Property Customization
This release of SRM includes a tool that allows you to specify IP
properties (network settings) for any or all of the virtual machines in
a recovery plan by editing a comma-separated-value (csv) file that the
- Limits Checking and Enforcement
A single SRM server can support up to 500 protected virtual machines
and 150 protection groups. This release of SRM prevents you from
exceeding those limits when you create a new protection group. If a
configuration created in an earlier release of SRM exceeds these
limits, SRM displays a warning, but allows the configuration to operate.
- Improved Support for Virtual Machines that Span Multiple Datastores
This release provides improved support for virtual machines whose disks reside on multiple datastores.
- Single Action to Reconfigure Protection for Multiple Virtual Machines
This release introduces a Configure All button that applies existing inventory mappings to all virtual machines that have a status of Not Configured.
- Simplified Log Collection
This release introduces new utilities that retrieve log and
configuration files from the server and collect them in a compressed
(zipped) folder on your desktop.
- Improved Acceptance of Non-ASCII Characters
Non-ASCII characters are now allowed in many fields during installation and operation.
You can download the update here.
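The limits item in the release notes above is easy to sanity-check against your own numbers before you upgrade. The sketch below is only a rough planning aid in Python. The two constants come from the release notes, but the function names, the return strings, and the way VMs are counted when a group is created are assumptions for illustration, not how SRM implements the check.

```python
MAX_PROTECTED_VMS = 500      # per SRM server, from the release notes
MAX_PROTECTION_GROUPS = 150  # per SRM server, from the release notes


def check_new_group(current_vms, current_groups, vms_in_new_group):
    """Would creating one more protection group stay within the limits?

    Assumption: every VM in the new group counts toward the VM limit.
    """
    if current_groups + 1 > MAX_PROTECTION_GROUPS:
        return "reject"
    if current_vms + vms_in_new_group > MAX_PROTECTED_VMS:
        return "reject"
    return "allow"


def check_existing(current_vms, current_groups):
    """A configuration carried over from an earlier release is warned about, not blocked."""
    if current_vms > MAX_PROTECTED_VMS or current_groups > MAX_PROTECTION_GROUPS:
        return "warn"
    return "ok"


if __name__ == "__main__":
    print(check_new_group(490, 10, 10))  # lands exactly on 500 VMs
    print(check_new_group(490, 10, 11))  # one VM over the limit
    print(check_existing(600, 10))       # pre-existing oversize config
```

The split into two functions mirrors the two behaviors in the release notes: new protection groups are blocked at the limit, while an older configuration that already exceeds it only gets a warning.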
Duncan over at Yellow Bricks has some words of wisdom for your BCDR project.
There are a whole bunch of SRM projects going on globally where VMware PSO,
the department I work for, is assisting. These projects typically have
a duration of 3 to 9 months, while it seems that with the ease of
VMware Site Recovery Manager this should be a matter of days.
People tend to forget that the most important thing about Disaster Recovery / Business Continuity is the business. You need to know the organisation and IT environment very well before you can even start …
The fact that SRM is so
easy to set up makes it really hard to actually explain to a customer
why a BCDR project will take much longer than he expected.
There are a few sets of instructions floating around the Internet on how to run ESX or ESXi inside Workstation 6.5. (Let me Google that for you, or just go to xtravirt.) There are lots of reasons you’d want to do this: for training, testing, lab work, demos, POCs, or even just as a parlor trick to impress your friends. You’ll need recent hardware. Now David Davis has published a nice 14-minute video tutorial on the topic at Petri IT Knowledgebase. Link: Running VMware ESX 3.5 and ESXi in Workstation on your desktop PC.
Site Recovery Manager can be hard to evaluate: you need some shared storage that is going to be replicated, and then you have to set up SRM to do all the tricky failover workflow bits. Tomas Ten Dam has laid out a process to set that up in Workstation as well, using the NetApp ONTAP simulator: SRM in a Box final release (the complete setup) « Ten Dam. (Looks like you need to be a current NetApp customer to get your hands on it. You should also be able to do this with the EMC Celerra simulator; the same conditions apply. Looks like you can do SRM with the LeftHand VSA as well, and you can at least do that with a 30-day trial. Has anybody set this up with a free or open source, albeit unsupported, tool? How about a set of virtual appliances?)
Completely new to SRM? Check out this new video (parts 2 and 3 coming soon).
[Update: from Chad Sakac in the comments, the Celerra simulator is available to everybody.]
via the new blog, VMGuy, from VMware SE Dave.
I have more testing to do but can report that I’m starting 4 VMs
from a single replicated LUN in 8 minutes. And I’m not talking about
from the time of just powering on, I’m talking about pressing the "big
red (test) button" – powering-up the VMs – starting the Windows
services – and the recovery plan completion. Try that using physical
servers! Sorry, but even restoring servers from a B2D solution that’s
replicated to your DR site won’t be as fast.
I demonstrated SRM for the DR team and initially got a "that’s all?"
kind of reaction. I quickly realized that SRM, with the combination of
array-based replication, worked too well! Meaning, it did such a good
job of hiding the complexity and number of steps required to get from A
to Z that my non-technical DR teammates didn’t understand what SRM was
really bringing to the table. If there’s only one thing you take away
from this article, make sure it’s that you’re better off explaining in
simple terms the steps SRM is executing in the background before
running a demonstration.
Talking about the virtues of SRM is one thing (the recovery run book,
the steps it automates, the testing capabilities (which are awesome
by-the-way), etc.), demonstrating these product features for your DR
team is another. If your experience is like mine, you’ll find it
dramatically influences the discussions on