In our lab environment we have multiple vCenters and provide numerous services to our developers for testing purposes. Being a lab environment means something is always down, and when that happens to a vital component such as a vCenter we get bombarded with the same question over and over: “Is it working?” We do monitor the entire environment using vRealize Operations Manager (vROps); however, not all team members have the access or the knowledge to check and filter through all the alerts that we have (and we have a lot of those).
I’ve heard that there is a new trend in other organizations: they inform their users about outages and other impactful events using public health dashboards. One such dashboard, or status page, is the self-hosted and open source Cachet. Usually in those organizations the system administrator or DevOps person opens the status page and marks a service as degraded or down whenever there is an outage. That sounds great and all, but I have neither the time nor the desire to keep a dashboard updated manually, and we already have all this info in vROps. Can’t we just automate the filtering, creation and update of statuses and incidents on our dashboard?
Well, since both vROps and Cachet have good public APIs, yes we can, and it’s not that hard. This post will help you do just that: automatically generate a public health dashboard for VMs, applications, vCenters and anything else that can be monitored via vROps, just like the one in the screenshot below. We will do that with the help of Cachet and a little script that I’ve put together for this. You are of course free to take the logic and implement it in your favorite scripting language or even in a vRealize Orchestrator workflow.
Figure 1: End result
First you will need to install and configure Cachet. I am not going to go into detail on how to do that, since the Cachet developers have written an excellent how-to guide on the subject.
After we have installed and configured Cachet we will need to configure a vROps custom group in which we will put all the objects that will appear on our health dashboard.
You can create one from Environment > Custom groups > +
For our integration neither the name nor the type of the custom group matters. I am naming mine Cachet and using the group type “Environment”.
Next you will need to decide how the group members will be populated. I have opted for a static population, which means I must add a member manually every time I want something new to appear in Cachet. You can, however, use vSphere tags or other tools to automate the population. More info about vROps custom groups can be found in this excellent blog post from Blue Medora or in the official documentation.
Figure 2: Custom group configuration
Now that we have both Cachet and vROps ready, we can download the script, configure it according to the installation guide and schedule it to run at regular intervals so that it keeps the dashboard updated. Every time you add or delete a member of the vROps custom group, the script will sync that change to Cachet. It will also set each object’s availability status according to the availability metric in vROps and include any alerts raised for our objects, filtered by the type and criticality that we have configured.
Note: The vROpsPHD integration script doesn’t make any modifications to your vROps instance. It just collects data from vROps and manages Cachet.
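If you would rather implement the logic yourself, the sync step described above can be sketched roughly like this. This is not the actual vROpsPHD code, just a minimal Python illustration under a few assumptions of mine: the vROps group members and the Cachet components have already been fetched into plain dictionaries, an availability value of 1 means "up" and anything else means "down", and each alert is a dictionary with hypothetical `level` and `type` keys. The Cachet status codes (1 for Operational, 4 for Major Outage) come from the Cachet v1 API; the function names are made up for this example.

```python
# Cachet component status codes (Cachet v1 API).
OPERATIONAL = 1
MAJOR_OUTAGE = 4

# Assumed criticality labels, ordered from lowest to highest.
LEVELS = ["INFORMATION", "WARNING", "IMMEDIATE", "CRITICAL"]


def availability_to_status(availability):
    """Map an availability value to a Cachet status (assumption: 1 means up)."""
    return OPERATIONAL if availability == 1 else MAJOR_OUTAGE


def plan_sync(vrops_members, cachet_components):
    """Work out which Cachet changes bring the dashboard in line with vROps.

    vrops_members:     {object name: availability value}
    cachet_components: {component name: current Cachet status code}
    Returns a dict with sorted lists under 'create', 'update' and 'delete'.
    """
    create, update, delete = [], [], []
    for name, availability in vrops_members.items():
        wanted = availability_to_status(availability)
        if name not in cachet_components:
            create.append((name, wanted))   # new group member
        elif cachet_components[name] != wanted:
            update.append((name, wanted))   # status has changed
    for name in cachet_components:
        if name not in vrops_members:
            delete.append(name)             # removed from the group
    return {"create": sorted(create), "update": sorted(update),
            "delete": sorted(delete)}


def filter_alerts(alerts, min_level="IMMEDIATE", types=None):
    """Keep alerts at or above min_level, optionally limited to given types."""
    threshold = LEVELS.index(min_level)
    return [
        alert for alert in alerts
        if LEVELS.index(alert["level"]) >= threshold
        and (types is None or alert["type"] in types)
    ]


if __name__ == "__main__":
    plan = plan_sync(
        {"vcenter-01": 1, "vcenter-02": 0, "jenkins": 1},
        {"vcenter-01": 1, "vcenter-02": 1, "old-vm": 1},
    )
    print(plan)
```

For the sample input, the plan creates a component for jenkins, switches vcenter-02 to Major Outage and deletes old-vm. Keeping the planning step separate from the API calls makes it easy to dry-run the sync before letting it touch Cachet.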