
How to Measure the Impact of Your IT Transformation

By Matt Denton

Generally when a company makes a decision to move in a new direction, a lot of analysis and rigor take place to ensure the decision is the right one. Business cases are created and vetted until everyone is in agreement and the project is approved. This is all great and necessary to kick off a new initiative. However, once the project is in motion, how often do we measure the results against the original business case to see if we are delivering on what the company expected?

Think about it. A project gets kicked off and everyone is heads down implementing the new changes and making sure they meet their deadlines. Going back to review a business case is usually not a priority and, quite frankly, who has the time? But at some point, senior leadership will ask for an analysis, and one will be created to meet that one-time request. Then, it is back to business as usual—until the next request comes along.

What if you could measure the impact IT transformation has on the business proactively and in real time? Projects become more meaningful. Employees can see how their work is impacting the business. Transformation begins to make sense and can be justified. This can be done if you take the time to generate key performance indicators and metrics ahead of time.

Start by asking the team these questions at the beginning of a project:

  1. Why are we doing this?
  2. What are we trying to improve?
  3. How will we measure it?
  4. What is our current state benchmark?
  5. What is our target?
  6. How will we impact the business if we reach our target state?
  7. Do we have data to measure progress?
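
Questions 3 through 5 can be made concrete with a small calculation. The sketch below assumes a metric with a recorded baseline, a target, and a current reading. The function name and the sample figures are hypothetical, chosen only for illustration.

```python
def progress_toward_target(baseline: float, target: float, current: float) -> float:
    """Return the fraction of the baseline-to-target gap closed so far.

    Works whether the goal is to lower the metric (target < baseline)
    or to raise it (target > baseline). 0.0 means no progress and 1.0
    means the target has been reached; values can fall outside that range.
    """
    gap = target - baseline
    if gap == 0:
        raise ValueError("target must differ from baseline")
    return (current - baseline) / gap


# Hypothetical example: mean time to resolve an incident, in hours.
baseline_hours = 10.0  # current-state benchmark (question 4)
target_hours = 6.0     # target (question 5)
current_hours = 8.0    # latest measurement

print(f"{progress_toward_target(baseline_hours, target_hours, current_hours):.0%}")  # prints 50%
```

Reporting progress as a fraction of the gap keeps metrics with different units comparable on a single dashboard.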

What Metrics Matter Most?
Usually I see companies measure progress based on financial metrics. For example, did we save the company money? However, there are hundreds of metrics that relate to agility, cost, and quality. The key is to pick those that are most impactful to the processes you expect to improve as part of the transformation. These may not all be financially driven, but will still have a measurable impact on the business.

Below are some other areas where you can measure business impact:

  • IT financial management
  • Service level management
  • Demand management
  • Service desk management
  • Incident management
  • Problem management
  • Change management
  • Configuration management
  • Availability management
  • Continuity management
  • Release management
  • Capacity management
  • Security management

Some of the metrics that fall into these categories are what I refer to as the “hard to quantify” or “soft” benefits. These are generally thrown out or overlooked during the business case development. I believe that once you can quantify these, you can translate them into real benefits and measure their impact on the business.

Provided the data exists, I’ve been able to help many clients both track the metrics they decide to measure and demonstrate how they can show the impact IT transformation has on their company. By quantifying these metrics and showing the impact your improvements are making on the business, you will know at any given time if the changes you are undertaking are making a difference or if you are falling short of your expectations. And, you will also be able to identify if additional changes are required to meet the project’s objective.

Too often, I see clients lose focus on the reason they started a project. This is easy to do on long projects. People change roles, leadership changes, or other projects take priority. Putting metrics in place and understanding their impact on the business will help you maintain that focus. The qualitative data gathered during and after implementation is important for measuring the impact IT transformation has made on your business. The data you collect and analyze will begin to tell a story and allow you to make precise decisions on where additional improvements are needed to make the biggest impact.

======
Matt Denton is a VMware transformation architect and is based in Wisconsin.

Tips for Using KPIs to Filter Noise with vCenter Operations Manager

By: Michael Steinberg and Pierre Moncassin

Deploying monitoring tools effectively is both a science and an art. Monitoring provides vast amounts of data, but we also want to filter the truly useful information out of these data streams – and that can be a challenge. We know how important it is to set trigger points to get the most out of metrics. But deciding where exactly to set those points is a balancing act.

We all know this from daily experience. Think car alarms: if limits are set too tight, you can trigger an alarm without a serious cause. People get used to the alarms, and they become noise. On the other hand, if limits are too loose, the important events (like an actual break-in) are missed, which reduces the value of the service that the alarm is supposed to deliver.

Based on my conversations with customers, vCOps’ out-of-the-box default settings tend to be on the tight side, sometimes resulting in more alerts than are useful.

So how do you make sure that you get the useful alerts but not the noise? I’ve found that assigning Key Performance Indicators (KPIs) to each VM is the best way to filter the noise out. So this post offers some tips on how to optimally use KPIs.

First, Though, a Quick Refresher on KPIs

By default, vCOps collects data for all metrics every five minutes. As part of its normal operations, vCOps applies statistical algorithms to that data to detect anomalies in performance – KPIs are outputs from those algorithmic measurements.

Within vCOps, a metric is identified as a KPI when its level has a clear impact on infrastructure or application health. When a KPI metric is breached, the object it is assigned to will see its health score impacted.

A KPI breach can be triggered in the following ways:

  • The underlying metric exceeds a given value (Classic Threshold).
  • The underlying metric is less than a given value (Classic Threshold).
  • The underlying metric goes anomalous. This is a unique capability of vCOps where a ‘normal’ range is automatically calculated so that abnormal values can be detected.

Typically, you would use one of these three options when setting a threshold, but combinations are also allowed. For example, you may want to set a classic threshold for disk utilization that exceeds a certain percentage. This can be combined with a dynamic threshold – where an alert is triggered if CPU utilization goes above its monthly average by more than x%.
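
To make the three breach types concrete, here is a minimal Python sketch. It is not how vCOps is implemented: the product computes its ‘normal’ range with its own algorithms, so the mean-plus-or-minus-k-standard-deviations rule below is only an assumed stand-in. All function names and sample values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def breaches_upper(value: float, limit: float) -> bool:
    """Classic threshold: the metric exceeds a fixed value."""
    return value > limit


def breaches_lower(value: float, limit: float) -> bool:
    """Classic threshold: the metric falls below a fixed value."""
    return value < limit


def is_anomalous(value: float, history: Sequence[float], k: float = 2.0) -> bool:
    """Assumed stand-in for a dynamic threshold.

    Flags a value that lies more than k standard deviations from the
    mean of past samples. The real product uses its own algorithms.
    """
    centre = mean(history)
    spread = stdev(history)  # needs at least two samples
    return abs(value - centre) > k * spread


# Hypothetical readings, in percent.
cpu_history = [40.0, 42.0, 38.0, 41.0, 39.0]
disk_used = 93.0
cpu_now = 75.0

# Combination: classic threshold on disk OR dynamic threshold on CPU.
kpi_breached = breaches_upper(disk_used, 90.0) or is_anomalous(cpu_now, cpu_history)
print(kpi_breached)  # True
```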

Tips for Optimizing KPIs

KPIs provide the granular information that makes up the overall health score of a component in the infrastructure, such as an application. The overall health score is a combination of statistics for Workload, Anomalies, and Faults.

Overly-sensitive KPI metrics, however, can cause health scores to decrease when there isn’t an underlying issue. In such instances, we need to optimize the configuration of vCOps so that the impact of anomalous metrics on health scores is mitigated.

Here are some ideas for how to do that:

Tip 1 – Focus on Metrics that Truly Impact Infrastructure Health

First, it’s good to limit the number of metrics you put in place.

With too many metrics, you’re likely to have too many alerts – and then you’re still in a situation analogous to having car alarms going off too often to be noticed.

Remember, overall health scores are impacted by any metric that moves outside its ‘normal’ range. vCOps calculates the ‘normal’ range based on historical data and its own algorithms.

Tip 2 – Define KPI Metrics that will Trigger Important Alerts

Next, you want the alerts that you do define to be significant. These are the alerts that impact objects important to business users.

For example, you could have a business application with a key dependency on a database tier. An issue with a database or its performance would thus impact the user community immediately. To highlight such issues, you’d want to define KPIs on the set of metrics that most closely track the health of that database’s infrastructure.

Tip 3 – Use KPIs Across All Infrastructure Levels

To see the maximum benefit of KPI metrics, each metric should be assigned to the individual virtual infrastructure object (e.g., a Virtual Machine), as well as to any Tiers or Applications that the Virtual Machine relates to.

This is an important step as it makes the connection between the VM metrics and the application it relates to. For example, it may not be significant in itself that a VM is over-utilized (CPU usage over threshold), but it becomes important if the application it supports is impacted.

Example

Let’s assume a customer has a series of database VM servers that are used for various applications. The VM, Tier and Application assignments are illustrated below in the table.

VM        Tier   Application
orasrv1   DB     WebApp1
orasrv2   DB     CRMApp1
orasrv3   DB     SvcDesk1

The application team has specified that the CPU Utilization for these VMs should not exceed 90% over three collection intervals (15 minutes). Therefore, our KPI metric is CPU Utilization %.
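
One way to read this requirement is "breach when three consecutive five-minute samples are above 90%". The Python sketch below implements that reading only. The function name, parameters, and sample readings are hypothetical, and this is not a vCOps API.

```python
from typing import Sequence


def sustained_breach(samples: Sequence[float], limit: float = 90.0, intervals: int = 3) -> bool:
    """Return True if `samples` holds `intervals` consecutive readings above `limit`."""
    run = 0  # length of the current streak of readings above the limit
    for value in samples:
        run = run + 1 if value > limit else 0
        if run >= intervals:
            return True
    return False


# Hypothetical CPU readings (%), one per five-minute collection interval.
print(sustained_breach([85.0, 92.0, 95.0, 91.0]))  # True: three readings in a row above 90
print(sustained_breach([92.0, 95.0, 88.0, 93.0]))  # False: the streak is broken by 88
```

Requiring a streak rather than a single sample is what keeps a brief CPU spike from counting as a breach.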

The KPI metric is assigned to all of the resources identified in the table above. Each VM has the KPI assigned to it. The DB Tier within each Application also has the KPI assigned to it. For example, the DB tier within the WebApp1 application is assigned a KPI for the orasrv1 VM. Finally, each Application also has the KPI assigned to it. For example, the WebApp1 application is assigned a KPI for the orasrv1 VM.

With these assignments, health scores for the VMs, Tiers and Applications will all be impacted when the CPU Utilization for the respective VM is over 90% for 15 minutes. Virtualization administrators can then accurately tell application administrators when their Application health is being impacted by a KPI metric.
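
The assignment logic can be sketched as a simple lookup. This Python example only illustrates the idea of one VM-level breach surfacing at the Tier and Application levels. The data structure and function name are hypothetical and do not reflect how vCOps stores its assignments.

```python
# VM -> (tier, application), copied from the table in the example.
ASSIGNMENTS = {
    "orasrv1": ("DB", "WebApp1"),
    "orasrv2": ("DB", "CRMApp1"),
    "orasrv3": ("DB", "SvcDesk1"),
}


def impacted_objects(breaching_vms):
    """List every object whose health is affected when the given VMs breach the KPI."""
    impacted = []
    for vm in breaching_vms:
        tier, app = ASSIGNMENTS[vm]
        impacted.extend([
            ("VM", vm),
            ("Tier", f"{app}/{tier}"),  # the DB tier within that application
            ("Application", app),
        ])
    return impacted


for level, name in impacted_objects(["orasrv1"]):
    print(level, name)
# VM orasrv1
# Tier WebApp1/DB
# Application WebApp1
```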

Take-Away

When it comes to KPI alerts, there are three steps you can take to help “filter the noise” in vCOps.

1)   Focus on a small number of metrics that truly impact infrastructure health.

2)   Define KPI metrics that will trigger the important alerts.

3)   Set up these KPI metrics consistently across infrastructure levels (e.g., VM, Application, DB tier), so that issues are not missed at any particular level.

For future updates, follow @VMwareCloudOps on Twitter, and join the conversation by using the #CloudOps and #SDDC hashtags.