
Self-Driving Compliance Remediation with vRealize Operations 7.5

I have spoken to many customers who use the vRealize Operations Management Pack for vRealize Orchestrator to add their own workflows as remediation actions in vRealize Operations. This provides practically limitless capability to automate routine, first-level runbook tasks and configuration management for SDDC components. With vRealize Operations 7.5, we jointly released version 3.0 of the management pack. It includes a couple of new features, but I am particularly excited about a new set of workflows that provide ESXi host compliance remediation.

Workflow Alert Awareness

With the previous version of the management pack, you could add your own custom workflows to the action framework in vRealize Operations. If you wanted to fully automate an action, so that the workflow runs automatically when an alert is triggered, your custom workflow had to have exactly one input, of some vSphere object type (virtual machine, host, cluster, etc.).

With 3.0 you can now add a second, optional input for fully automated workflow actions. The alert ID can be provided to the workflow so that your custom workflows can make more intelligent decisions. For example, you may want to get more details about a high CPU usage alert, such as the symptoms triggered, to determine if adding more CPU is warranted. Or, in the case of a workflow that sends notifications via email, text, or service desk, you may want to pass along the alert start time and reference the impacted object properties to determine whom, exactly, to inform.
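To make the CPU example a bit more concrete, here is a minimal sketch of that kind of decision, written in plain Python rather than as an actual Orchestrator workflow. The Alert class, the symptom names, and the should_add_cpu function are all hypothetical and only illustrate the idea of inspecting the triggered symptoms before acting.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    # Hypothetical stand-in for the details a workflow could look up by alert ID.
    alert_id: str
    name: str
    symptoms: List[str] = field(default_factory=list)


# Symptom names are invented for this example.
GUEST_DEMAND = "Guest CPU demand is critically high"
HOST_CONTENTION = "Host CPU contention is high"


def should_add_cpu(alert: Alert) -> bool:
    # More vCPUs help when the guest itself needs them, not when the host is
    # already contended, so require the first symptom and exclude the second.
    return GUEST_DEMAND in alert.symptoms and HOST_CONTENTION not in alert.symptoms


if __name__ == "__main__":
    alert = Alert("hypothetical-alert-id", "High CPU usage", [GUEST_DEMAND])
    print(should_add_cpu(alert))
```

The point is simply that the decision hangs off the alert's symptoms, which a workflow with only a vSphere object as input could not see.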

In fact, as you’ll see later in this blog post, the alert ID is used by the included ESXi host compliance remediation workflow.

Host Compliance Remediation

vRealize Operations has provided compliance templates for the vSphere Security Configuration Guide for some time now. If you enable this, you get compliance alerts for any vSphere objects which have violations of the guide. The alert contains a list of symptoms, which are the specific settings causing the object (a host in this case) to be out of compliance.

That’s great information to have, but look at all those symptoms! And that is just from one ESXi host. How long would it take to fix them all manually? What if you make a mistake while changing configurations on a host and break something?

Why not just fix it for me?

Well, in 3.0 we have included a remediation workflow for ESXi 6.5 and 6.7 hosts to do just that!

The workflow “Apply Host Security Configuration Rules” gives you the ability to correct most of the alert symptoms for a Security Guide alert on an ESXi host.

Running the action passes the ESXi host and the alert ID (as described above) to the workflow. The workflow evaluates the alert symptoms to determine which remediations should be run.
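Conceptually, evaluating the symptoms comes down to a lookup from symptom to remediation step. The sketch below shows that pattern in plain Python. It is not the code inside the shipped workflow, and the symptom names, the fix_* functions, and plan are placeholders made up for illustration.

```python
from typing import Callable, Dict, List, Tuple


# Placeholder remediation steps. Each returns a description of what it would do.
def fix_password_policy(host: str) -> str:
    return f"{host}: apply password policy"


def fix_ntp(host: str) -> str:
    return f"{host}: configure NTP"


# Symptom names are illustrative, not the ones shipped in the alert definition.
REMEDIATIONS: Dict[str, Callable[[str], str]] = {
    "Password policy not configured": fix_password_policy,
    "NTP not configured": fix_ntp,
}


def plan(host: str, active_symptoms: List[str]) -> Tuple[List[str], List[str]]:
    # Split the alert's symptoms into steps to run and symptoms with no handler.
    actions: List[str] = []
    unhandled: List[str] = []
    for symptom in active_symptoms:
        step = REMEDIATIONS.get(symptom)
        if step is None:
            unhandled.append(symptom)
        else:
            actions.append(step(host))
    return actions, unhandled


if __name__ == "__main__":
    print(plan("esx01", ["NTP not configured", "Configure SNMP"]))
```

Keeping the mapping in one table means a symptom without a handler is reported rather than silently ignored.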

If you wish, you can also fully automate this action so that it runs when the alert is triggered. Let’s dive a little deeper into this new workflow.

Getting Started with ESXi Host Configuration Remediation

When you install the management pack, it pushes a workflow package to your vRealize Orchestrator instance which includes a bunch of helpful workflows that are automatically associated with vRealize Operations actions.

In the “vSphere Security Configuration Guide” folder you will find a “Configuration” folder containing the workflow “Configure Host Security Config Data”. This workflow lets you control which symptoms get enforced, how you will be notified, and the connection information for your vRealize Operations cluster.

For instance, maybe you are perfectly fine with automated remediation of password policy settings, but you would rather not make any networking changes without consulting with the security or networking team.

You can always adjust the configuration later if you like. For most of the settings, the recommended values from the Security Configuration Guide are used. For some of the remediation settings, additional information is required. For example, NTP remediation requires the addresses of the servers you would like to use.
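One way to picture this configuration is as a set of per-remediation switches, a few of which need extra values before they can be enforced. The following Python sketch assumes an invented structure (the config dictionary, its keys, and REQUIRED_EXTRAS are not the format used by the workflow) and checks that every enforced remediation has what it needs.

```python
from typing import Dict, List

# Hypothetical settings. Keys and structure are invented for illustration.
config: Dict[str, dict] = {
    "password_policy": {"enforce": True},
    "networking": {"enforce": False},         # left for the networking team
    "ntp": {"enforce": True, "servers": []},  # needs server addresses
}

# Extra values a remediation needs beyond the guide's recommended defaults.
REQUIRED_EXTRAS: Dict[str, List[str]] = {"ntp": ["servers"]}


def enforced(settings: Dict[str, dict]) -> List[str]:
    # Names of the remediations that are switched on.
    return [name for name, s in settings.items() if s.get("enforce")]


def missing_extras(settings: Dict[str, dict]) -> List[str]:
    # Report enforced remediations whose required extra values are empty.
    problems: List[str] = []
    for name in enforced(settings):
        for key in REQUIRED_EXTRAS.get(name, []):
            if not settings[name].get(key):
                problems.append(f"{name}: missing {key}")
    return problems


if __name__ == "__main__":
    print(enforced(config))
    print(missing_extras(config))
```

Supplying at least one server address for the ntp entry clears the reported problem, while networking stays untouched because it is not enforced.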

If you have elected to receive email notifications, recipients get an email for each ESXi host remediated when the workflow runs. The email provides details on the active symptoms, which remediations were attempted (and whether they succeeded), and lists the remediations that were not automated (or could not be automated).

Just a note: a couple of the symptoms are not enforced, either due to complexity (Firewall Restrict Access) or because there is no API available to vRealize Orchestrator to handle the remediation (Configure SNMP). Those are indicated by “Not enforceable” and will require manual remediation.
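To give a feel for how such a report hangs together, here is a small Python sketch that groups per-symptom results into the three outcomes described above. The summarize function, the status strings, and the layout of the text are assumptions for illustration and do not reproduce the actual email sent by the workflow.

```python
from typing import Dict, List, Tuple


def summarize(host: str, results: List[Tuple[str, str]]) -> str:
    # Each result is (symptom, status). Status must be one of the three keys
    # below; any other value raises KeyError in this simple sketch.
    groups: Dict[str, List[str]] = {
        "remediated": [],
        "failed": [],
        "not enforceable": [],
    }
    for symptom, status in results:
        groups[status].append(symptom)

    lines = [f"Host: {host}"]
    for status, symptoms in groups.items():
        lines.append(f"{status.capitalize()} ({len(symptoms)}):")
        lines.extend(f"  - {s}" for s in symptoms)
    return "\n".join(lines)


if __name__ == "__main__":
    print(summarize("esx01", [
        ("NTP not configured", "remediated"),
        ("Configure SNMP", "not enforceable"),
    ]))
```

Listing the not enforceable items alongside the successes makes it obvious which settings still need manual attention.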

Extending Remediation

By now, you’re likely thinking that it would be great to have this kind of remediation for other vSphere objects. For this reason, we have started a Sample Exchange for vRealize Orchestrator workflows that can be used with the management pack. We already have contributions from the VMware field for networking compliance remediation that work with the 3.0 version of the management pack! Certainly, there will be more to come, and hopefully some contributions from you!

To see all the vRealize Operations content samples, including dashboards and super metrics, visit vrealize.vmware.com and explore!