I have been asked before which of the SRM alarms users should configure and watch. There are a lot of different alarms, and I suspect no one needs all of them, but I also suspect everyone will need a few. I will help you get started with the ones I think are important and mostly standard. I will also give you an idea of what triggers them where there is any doubt.
Let's dive in!
The first thing we need to do is to make sure that vCenter can send email. This is done in the Administration \ vCenter Server Settings area. Once there, select the Mail option. See the screen below for what it should look like, and where to add in your SMTP server and the sender account.
Note that if you are working in Linked Mode, like I am, you can switch vCenter servers from the drop-down list at the top of the screen, so you can easily make the change on both sides. This is important since you will need to configure SRM alarms on both sides.
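Before relying on vCenter to deliver alarm mail, it can help to confirm that the SMTP server actually accepts mail from the sender account you entered. Below is a minimal Python sketch for that check, run from any machine that can reach the mail server. The host name and addresses are placeholders, and it assumes a plain relay on port 25 with no TLS or login, which may not match your environment.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str, site_label: str) -> EmailMessage:
    """Build a plain-text test message that names the site being checked."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"SMTP relay test for {site_label}"
    msg.set_content(
        f"Test message for the {site_label} vCenter mail settings.\n"
        "If this arrives, the SMTP server accepts mail from the sender account."
    )
    return msg


def send_test_message(smtp_host: str, msg: EmailMessage, port: int = 25) -> None:
    """Send the message through the SMTP server (assumption: no TLS, no login)."""
    with smtplib.SMTP(smtp_host, port, timeout=10) as server:
        server.send_message(msg)


# Example call (placeholder values, replace with your own):
# send_test_message(
#     "smtp.example.com",
#     build_test_message("vcenter@example.com", "admin@example.com", "protected site"),
# )
```

Running it once per site, with the sender account each vCenter uses, gives you a quick sanity check on both sides.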
Once your vCenter can send email we are ready to check out the alarms.
You can find the alarms, in SRM, on the Alarms tab, which is next to the Summary tab. See below for what it looks like.
Remember that these alarms exist on both the protected and recovery sites, so you need to think about where to configure each one. For example, if you configure the Remote Site Down alarm on the protected side to email you when it triggers, will it alert you when the protected side is down? No, it will only alert you when the recovery side is down!
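If the direction is easy to mix up, here is a tiny Python sketch of the rule: a Remote Site Down alarm reports on the peer of the site where it is configured. The function and names are only for illustration and are not part of SRM.

```python
# Each site's "remote" site is simply its peer.
PEER = {"protected": "recovery", "recovery": "protected"}


def site_watched_by(configured_on: str) -> str:
    """Return the site whose outage a Remote Site Down alarm reports."""
    return PEER[configured_on]


# To hear about the protected site going down, configure the alarm on the recovery side.
print(site_watched_by("recovery"))
```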
What alarms are important?
Remote Site Down - this watches the remote site, which should be the protected site, and fires when the SRM service there stops running. On its own this is NOT a safe way to know whether you need to trigger a DR event, but it is still good info.
Recovery Plan Destroyed - you lose a plan's history when you delete it, and most users I know don't like losing their recovery plans, so this one is worth having.
Recovery Plan Started / Recovery Plan Execute Test begin are good in the beginning but not really necessary later.
VM Added is an important one. It means that a VM has just been copied to a LUN that is protected by SRM. So if someone who should know better migrates a VM to a protected LUN, you will be alerted!
VM Not Protected is also an important one. It will fire if you add a VMDK to a VM and the VMDK is held on a non-replicated LUN, or if you attach a CD / ISO to the VM. In other words, it fires when something is done to a protected VM that makes it NOT protected. Again, very good to know about.
Recovery Profile Prompt Display means that the recovery plan has stopped and is waiting for you to confirm something. So good to know!
Recovery Plan Prompt Response - this one may not be as important, but it will fire when a waiting prompt is acknowledged.
Protection VM Limit Exceeded - this will alert you when you are protecting more VMs than you have licensed. Good to know!
Each alarm has a description field if you need a little reminder of what it does.
The alarms above in bold are the particularly useful ones.
Below is an example of what configured alarms look like.
How do you configure alarms?
Double-click the alarm you are interested in, then change to the Action tab. You will have a choice of sending an email, sending a trap, or running a script. See below for an example.
That's it. You now have alarms configured to send email when they fire.
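If you pick the script action instead of email, the script itself can be very small. The sketch below appends one line per alarm to a log file. It assumes the alarm details arrive in the `VMWARE_ALARM_*` environment variables that vCenter documents for alarm scripts. Whether SRM alarms populate all of them is an assumption you should verify, and the log path is a placeholder.

```python
import os
from datetime import datetime, timezone


def format_alarm_line(env, now=None):
    """Format one log line from the alarm environment variables."""
    now = now or datetime.now(timezone.utc)
    # Assumption: these variables are set when the alarm runs the script.
    name = env.get("VMWARE_ALARM_NAME", "unknown alarm")
    target = env.get("VMWARE_ALARM_TARGET_NAME", "unknown target")
    status = env.get("VMWARE_ALARM_NEWSTATUS", "unknown status")
    return f"{now.isoformat()} alarm={name!r} target={target!r} status={status!r}"


def log_alarm(path, env=None):
    """Append the formatted alarm line to a log file and return it."""
    line = format_alarm_line(os.environ if env is None else env)
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line


# Example call (placeholder path):
# log_alarm(r"C:\logs\srm_alarms.log")
```

The missing-value defaults mean the script still writes a line if a variable is not set, which makes it easier to see what an alarm actually passes along when you test it.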
Important Note: make sure to test any of these alarms that you configure for email before you use them in production.
If you have any questions or comments, just let me know!