Analyze DevOps Team Performance with a Metrics Pipeline Using Wavefront and Slackbot

Modern metrics monitoring platforms such as Wavefront can monitor thousands of containers or servers and collect metrics such as application latency, CPU utilization, memory consumption, and network errors to gauge the health of an overall system. Monitoring and analytics can also be applied to much more than how machines and applications are performing. According to the DevOps CAMS (Culture, Automation, Measurement, Sharing) principle, how well a group of people interacts is one of the main determinants of successful DevOps-driven software delivery. In this blog, I will show you how I gathered metrics about DevOps team performance using a Slackbot and then analyzed them in Wavefront.

Here Comes the Slack
Slack is a cloud-based collaboration platform that has become extremely popular among software development communities thanks to simple yet powerful features that let people interact efficiently. Slack consists of users who exchange messages, channels that separate streams of communication by subject, and applications or bots that ingest messages and automate certain tasks to make collaboration even easier. What I found is that, for a DevOps team that uses Slack as its main collaboration tool, one can estimate build pipeline and team performance from the messages the team exchanges.

For example, if a critical bug is found in the latest software release, a customer will likely report it in a ticketing tool such as Zendesk, triggering a webhook that posts to the Zendesk channel in Slack. Ops and customer success teams will then start talking about it, with dev and sales engineering getting involved as necessary. Zendesk tickets may also trigger the creation of JIRA tickets. If we listen to all of this activity across multiple Slack channels, we can draw a comprehensive picture of how our DevOps people-processes are performing.

Turning your Slack Activities into Wavefront Metrics
So, just how do we convert Slack conversations into metrics? A simple way is to create a Slackbot (software that acts as a user in Slack) that joins each channel of interest and listens to the messages posted by users or other bots. Because Slack delivers each message as JSON, our Slackbot needs to parse it and retrieve the user ID and channel ID the message came from, as well as the message’s content and type.
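
The parsing step can be sketched as follows. This is a minimal illustration in Python, not the author's Spring Boot code; the `SlackMessage` type and `parse_slack_event` function are hypothetical names, and the field names (`type`, `subtype`, `channel`, `user`, `bot_id`, `text`) assume the standard shape of a Slack message event.

```python
import json
from typing import NamedTuple, Optional


class SlackMessage(NamedTuple):
    """Fields the bot cares about (hypothetical container, not from the original post)."""
    user_id: str
    channel_id: str
    msg_type: str
    text: str


def parse_slack_event(raw: str) -> Optional[SlackMessage]:
    """Extract sender, channel, type, and text from a raw JSON event.

    Returns None when the payload is not valid JSON, is not a message
    event, or lacks a sender or channel.
    """
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(event, dict) or event.get("type") != "message":
        return None
    # Assumption: messages posted by bots carry "bot_id" instead of "user".
    sender = event.get("user") or event.get("bot_id")
    channel = event.get("channel")
    if not sender or not channel:
        return None
    # Use the subtype (for example "bot_message") when present, else the plain type.
    msg_type = event.get("subtype") or event["type"]
    return SlackMessage(sender, channel, msg_type, event.get("text", ""))
```

Returning `None` for anything that is not a usable message keeps the counting logic downstream free of special cases.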

If we had very sophisticated services to analyze the messages, we obviously could draw out additional and more advanced metrics. But for now, I’m simply going to extract the following metrics:

  • Message counts for each user
  • Message counts for each channel
  • Counts of certain predefined keywords

Once created, the Wavefront Slackbot is logged into the Wavefront Slack workspace and monitors each user’s messages and activities. The Slackbot assesses each message and, based on its content and type, uses the Wavefront web agent to submit metrics to the Wavefront server. The web agent also uses the Wavefront REST API to create events for new version releases, JIRA tickets, and Zendesk activity. In addition, each message is scanned for simple keywords, which are counted to indicate, for example, whether a user is dealing with a customer or discussing a particular ticket number.
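
The three counts listed above can be sketched with a few counters. This is an illustrative Python sketch under stated assumptions, not the bot's actual code: `MessageStats` is a hypothetical name, the keyword set reuses the example remarks mentioned later in this post, and matching is done on whole words, case-insensitively.

```python
import re
from collections import Counter

# Example keywords taken from the remarks mentioned later in this post.
DEFAULT_KEYWORDS = {"SOLVED", "GOOD", "THANKS", "PROBLEM", "BUG", "CRASH"}


class MessageStats:
    """Running per-user, per-channel, and per-keyword message counts."""

    def __init__(self, keywords=DEFAULT_KEYWORDS):
        self.keywords = {k.upper() for k in keywords}
        self.by_user = Counter()
        self.by_channel = Counter()
        self.by_keyword = Counter()

    def record(self, user_id, channel_id, text):
        """Count one message and every predefined keyword it contains."""
        self.by_user[user_id] += 1
        self.by_channel[channel_id] += 1
        # Split into words so that "bug," and "BUG!" both match "BUG".
        for word in re.findall(r"[A-Za-z0-9']+", text):
            upper = word.upper()
            if upper in self.keywords:
                self.by_keyword[upper] += 1
```

For example, after `stats = MessageStats()` and `stats.record("U1", "ops", "Found a bug")`, `stats.by_keyword["BUG"]` is 1 and `stats.by_channel["ops"]` is 1.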

I chose to build the Wavefront Slackbot and the web (or REST) agent on the Spring framework, developing both as Spring Boot applications. I used jbot code to connect to and listen on the Slack channels. I then developed a Spring Boot application that can connect to the Wavefront proxy and keep the metrics it receives in its cache storage. Finally, I connected everything together, running on our Pivotal Cloud Foundry (PCF) cluster environment.

…And Wavefront Platform will Visualize It!
So, basically, when the Slackbot starts to receive messages, the counts are tracked using the following metrics:

  • wavefront.devops.slack.bot.keyword.count
  • wavefront.devops.slack.bot.message.count

Each metric has a set of point tags such as user ID, channel ID, type of keyword, etc. The raw data looked like this.
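
To make the idea of a metric with point tags concrete, here is a small Python sketch that formats one data point in the Wavefront data format, `<metricName> <metricValue> [<timestamp>] source=<source> [pointTags]`. It is not the author's data or code: the helper name `to_wavefront_line`, the source name `slackbot`, and the tag keys `user` and `channel` are all assumptions made for illustration.

```python
import time


def to_wavefront_line(metric, value, source, tags, timestamp=None):
    """Format one point as a Wavefront data format line.

    Tag values are double-quoted, with embedded double quotes escaped.
    Tags are sorted by key so the output is deterministic.
    """
    ts = int(time.time()) if timestamp is None else int(timestamp)
    line = f"{metric} {value} {ts} source={source}"
    tag_parts = []
    for key, val in sorted(tags.items()):
        escaped = str(val).replace('"', '\\"')
        tag_parts.append(f'{key}="{escaped}"')
    return f"{line} {' '.join(tag_parts)}" if tag_parts else line


# Hypothetical point: one message by user U1 in the ops channel.
print(to_wavefront_line(
    "wavefront.devops.slack.bot.message.count", 1, "slackbot",
    {"user": "U1", "channel": "ops"}, timestamp=1533529977,
))
```

Lines in this format are what a Wavefront proxy accepts over its TCP listener, so an agent that caches counts only needs to render and send them periodically.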

To make this information more useful, I needed to create an easy-to-use dashboard that would group each of the message counts by channel and username. I used Wavefront’s Query Language to sort the counts based on channel, user, and keyword. Then, I used aggregation functions to sum all the messages that appear in a given timeframe in the chart window. I created both stacked charts and tabular charts so that the trend and size of the accumulated messages over time are easy to read. The resulting dashboard is below:
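
The exact queries behind the dashboard are not shown in this post. As a rough sketch of what a grouped aggregation could look like, the Python helper below builds a Wavefront Query Language string of the form `sum(ts("<metric>"), <pointTagKey>)`, which sums series that share the same value for a point tag. The helper name `grouped_sum_query` and the tag keys `channel` and `user` are assumptions.

```python
def grouped_sum_query(metric, group_by):
    """Build a query that sums a metric, grouped by one point tag key."""
    return f'sum(ts("{metric}"), {group_by})'


MESSAGE_COUNT = "wavefront.devops.slack.bot.message.count"

# One series per channel, and one series per user (hypothetical tag keys).
print(grouped_sum_query(MESSAGE_COUNT, "channel"))
print(grouped_sum_query(MESSAGE_COUNT, "user"))
```

Each generated string would back one chart: the first feeds a per-channel view, the second a per-user view.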

The first chart, ‘Activities Overview’, records the current rate at which users are producing messages in each channel. The blue dots represent specific DevOps build pipeline events such as a new version release, a bug filed or resolved, new JIRA tickets, or new trial accounts being created. We can see in this dashboard that a significant number of messages came from the ops channel, followed by integrations-dev. The users with the most messages and their message growth trends are depicted in the next chart. The third chart shows the trend for popular keywords. Keywords include some customer names, positive remarks such as SOLVED, GOOD, and THANKS, and negative remarks such as PROBLEM, BUG, and CRASH. The top keywords roughly represent which words are most frequently exchanged throughout the channels.

Using Variables to Filter Metrics
Having a holistic picture greatly helps us understand important build pipeline and performance trends for our DevOps team. However, I also wanted to see the message trends for each specific user or channel. The same Wavefront dashboard can easily be reused for this. Using variables, we can define the query based on certain point tag values such as channel ID or username. For example, if I want to view the Slack trend of a specific user, I can use variables to display only that user’s activities. This makes it easier for me to view how many messages a DevOps engineer posted on each channel, their rate of messaging over a period of interest, and what their favorite keywords were during that time.
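
Wavefront dashboard variables are referenced inside a query as `${name}` and replaced with the selected value when the chart is rendered. The Python sketch below only simulates that substitution with `string.Template`, which happens to use the same placeholder syntax; the query text, the variable name `user`, and the tag key `user` are assumptions, not the queries from this dashboard.

```python
from string import Template

# A query filtered by a dashboard variable (hypothetical tag key and variable name).
USER_TREND = Template(
    'sum(ts("wavefront.devops.slack.bot.message.count", user="${user}"), channel)'
)


def render(query, **variables):
    """Substitute variable values into a query template, as a dashboard would."""
    return query.substitute(**variables)


# Selecting a different user reuses the same chart definition.
print(render(USER_TREND, user="U1"))
```

Because only the variable value changes, one dashboard definition can serve every user or channel.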

A similar filter can be applied using Wavefront’s variables to select a single channel and see its Slack trends. In the next example, the query limits the metrics to the ‘ops’ channel. We can easily see which user posted the most messages at which time and which keywords were most popular in that channel, giving us a good picture of which users were most active and whether some users were collaborating with each other.

If you have Slack and want to try out the examples outlined in this blog, sign up for a free Wavefront trial. I am looking forward to your feedback and can be reached @YooHoward on Twitter.