Introduction

When it comes to system administration and operations, we all know it can be tricky to get visibility into your environments at the granularity and verbosity level that you want. No one wants a call at 3am to say a NIC flapped, unless, for example, it’s happening across 10 hosts within a short period of time.

It is also true that no two companies monitor and operate their environments the same way, so finding solutions that work with your particular setup can be difficult. vRealize Log Insight (which comes with 25 free licenses with all vCenter editions) allows you to set up webhooks that send arbitrary messages to any web service based on alerting conditions you define within the product. This is extremely powerful and essentially allows you to route alerts to any service you can think of.

In this post, we’ll take a look at how you can set up vRLI to send alerts to Slack (or any other service of choice) based on particular vSAN states and log messages. This can, of course, be extended to other services like PagerDuty, ServiceNow, or indeed any of the plugins listed here.

What you need

This is pretty simple to set up, but we’re going to outline the requirements here – we assume you have the following already set up and operating within your environment:

vSAN
vRealize Log Insight
Slack
A Linux box with Docker installed

In addition to the above, we are going to deploy a very simple container that acts as the aggregation layer for all these services. It takes the message fired by vRLI and translates it for whatever service you want the message to go to, without you having to do any coding.

The how

The container

You’ll need a Linux box to run the container on. It can equally run on vSphere Integrated Containers, a Kubernetes cluster, PKS, etc. In this particular example, I am simply running the container on an Ubuntu VM for simplicity.

SSH into the Linux box and pull down the latest version of the vmware/webhook-shims container:

docker pull vmware/webhook-shims

With the container image pulled down locally, let’s deploy it and set it to always restart if it fails:

docker run -d --restart=always -p 5001:5001 vmware/webhook-shims

Broken down, the above command does the following: it runs the container detached (in the background), maps port 5001 of the container to port 5001 of the host machine, and restarts the container on all failures or host restarts. Note that --restart=always must come before the image name, since anything placed after the image name is passed to the container as arguments rather than to Docker.

Ensure the container is running by issuing the below command:

myles@docker01:~$ docker ps
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS                NAMES
5f5dab820dc6        vmware/webhook-shims   "/root/webhook-shims…"   4 months ago        Up 14 minutes       0.0.0.0:5001->5001/tcp   practical_mcnulty

At this point, if we visit the host’s IP on port 5001 we will see a webpage that contains the setup instructions for each plugin. This confirms the container is working as expected.
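
If you would rather check this from the command line than from a browser, something like the following could work. This is only a sketch: shim_base_url and shim_is_up are hypothetical helper names of my own, and it assumes the landing page answers with HTTP 200, which I have not verified against the shim's source.

```shell
# Hypothetical helper: build the shim's base URL from a host and an
# optional port (defaults to 5001, the port mapped in "docker run").
shim_base_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-5001}"
}

# Hypothetical helper: succeeds if the landing page answers with HTTP 200
# within 5 seconds (the 200 status is an assumption, see above).
shim_is_up() {
  code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$(shim_base_url "$@")")" || return 1
  [ "$code" = "200" ]
}
```

For example, shim_is_up 192.168.0.100 && echo "shim reachable" would print the message only when the page loads.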

Slack

With the container set up, you will need to create an “Incoming Webhook” app for your Slack team. Instructions for doing so can be found here (follow steps 2 and 3). At the end of step 3 you should have a URL in the below format:

https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX

Take note of the three sections in the URL (the one beginning with T, the one beginning with B, and the random last string), as we will need them later on.
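
Before wiring anything into vRLI, it can be worth confirming that the Slack URL itself works. Slack incoming webhooks accept a JSON body with a text field, so a minimal check might look like the below. The helper names slack_payload and slack_send_test are my own, and the payload builder does no JSON escaping, so treat it as a sketch rather than a robust client.

```shell
# Hypothetical helper: build the minimal JSON body for a Slack incoming webhook.
# No JSON escaping is done, so keep the message free of quotes and backslashes.
slack_payload() {
  printf '{"text": "%s"}\n' "$1"
}

# Hypothetical helper: post a test message.
# Arguments: the Slack incoming-webhook URL, then the message text.
slack_send_test() {
  curl -s -X POST -H 'Content-type: application/json' \
    --data "$(slack_payload "$2")" "$1"
}
```

Calling slack_send_test with your own webhook URL and a short message should make that message appear in the channel you chose when creating the app.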

vRealize Log Insight

Within vRLI’s “Interactive Analytics” tab, click the Alerts icon and then “Create new alert”. Give your alert a name – I called this one “Slack Webhook” as we can route these alerts to multiple services and have discrete alerts set up for each.

Check the “webhook” checkbox and fill in the URL with the following format:

http://[your container IP]:5001/endpoint/slack/Txxxxxxx/Bxxxxxxxx/xxxxxxxxxxx

Substitute in the IP address of your container, as well as the sections of the Slack URL from above.

As an example, let’s say my container host’s IP is 192.168.0.100 and my Slack URL is https://hooks.slack.com/services/T123456789/B87654321/ABCKSFSHDFKNGSDIGDFG then my webhook URL in vRLI would be the below:

http://192.168.0.100:5001/endpoint/slack/T123456789/B87654321/ABCKSFSHDFKNGSDIGDFG
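
The substitution above is mechanical, so it can be scripted if you have several Slack URLs to convert. The function below is a sketch under the URL formats shown in this post; slack_to_shim_url is a hypothetical name of my own, not something shipped with the webhook-shims container.

```shell
# Hypothetical helper: convert a Slack incoming-webhook URL into the
# matching webhook-shims URL.
# Arguments: shim host or IP, Slack URL, optional shim port (default 5001).
slack_to_shim_url() {
  shim_host="$1"
  slack_url="$2"
  shim_port="${3:-5001}"
  prefix='https://hooks.slack.com/services/'

  # Reject anything that does not look like <prefix>T.../B.../<token>.
  case "$slack_url" in
    "$prefix"T*/B*/?*) ;;
    *) echo "unexpected Slack webhook URL: $slack_url" >&2; return 1 ;;
  esac

  # Strip the Slack prefix, leaving "T.../B.../<token>".
  tokens="${slack_url#"$prefix"}"
  printf 'http://%s:%s/endpoint/slack/%s\n' "$shim_host" "$shim_port" "$tokens"
}
```

Running slack_to_shim_url 192.168.0.100 followed by the example Slack URL from this section reproduces the webhook URL shown above.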

Finally, set whatever alerting frequency you would like.

From here, click the Alerts icon again and then “Manage Alerts”, find the “Slack Webhook” alert we just created and click the “edit” button. At the bottom of the dialog box that pops up you will see an “Edit Query” button. This is where you define the conditions under which the alert will be raised.

The query shown below is based on one of the built-in vRLI queries and looks for component state changes where a component changes from active to any other state (like degraded, stale, absent, etc):

With your alert defined, click “Save” on the bottom right-hand side of the window. Now that the alert is defined and has a query associated with it, all that needs to happen is for the query to match some of the syslog messages vRLI ingests. If you used the above example, put a host into maintenance mode and you will receive your alerts straight into the Slack instance you defined. Here’s a clip from our test lab:

And that’s it: an end-to-end solution to get your vRLI alerts into Slack. As mentioned at the start, there are many services supported by the webhook-shims container, and we encourage you to try these out for other things like automatic ticketing in ServiceNow, paging with PagerDuty, kicking off Jenkins jobs or even vRO workflows.
