KPIs and Metrics, Part III: Examples and Samples

This blog, the last of a three-part series on KPIs and Metrics, covers:

Tangible and intangible KPIs, metrics and sources
Relevant examples of KPIs used for tracking success
Product/Service-specific examples
Sample dashboard designs

Intro

The first blog of this series discussed the definitions and value of KPIs and metrics: they help us make decisions and take actions. It also included some best practices for aligning them with expectations, ensuring they are actionable, presenting them as trends versus thresholds, and communicating them. The second blog presented a plan for establishing best-practice KPIs, described some of the challenges, and explained the importance of executive sponsorship. In this post, we will cover some examples at the executive management level as well as some product views.

Tangible and Intangible KPIs

Simply put, tangible benefits result in hard, direct currency savings. For example, consolidating multiple disparate servers into a software-defined data center (SDDC) converged system reduces hardware and facilities costs by conserving power, cooling, and support. Those reductions become apparent immediately as support, maintenance, and utility bills fall. The effects will be seen in accounts payable (AP) and in the FTE support hours recorded in time accounting.

Intangible benefits can be likened to “softer” currency. These indirect savings can be generated by maturing the organization, improving processes, and building stronger services governance. Will those changes have a material effect? Absolutely, but not as directly as the tangible ones. For example, standardizing a service catalog enables resourcing through economies of scale and reduces the effort needed to automate those services. The result is lower supplier costs as well as fewer FTE hours. These benefits are material and are enabled by a service strategy change, but they require more time and follow-up activities to realize.

Success KPIs

In the first blog, we talked about reporting tiers of KPIs of which the top layer is executive reporting. Here we’ll start by addressing some of these executive KPIs. Later, we will get to some Management and Operational/Practitioner ones.

*Figure 1. Sample IT Executive Dashboard*

The KPIs that executives are concerned about, and that reflect the success of IT, are:

% OpEx – The reduction in operating expenses, reflecting the impact of initiatives on the budget over time. As a best practice, it should be possible to drill down into specific categories. For example, the year-over-year Infrastructure % OpEx change should drill down to leased compute, storage, network, external services, facilities, FTEs, etc. Most of the sources for these KPIs are enterprise financial systems.
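As a rough illustration of the drill-down idea, the year-over-year change can be computed per category and for the total. This is a minimal sketch only: the function names, the category names, and the spend figures are all made up for the example, and real numbers would come from the enterprise financial system.

```python
def pct_change(previous: float, current: float) -> float:
    """Year-over-year change as a percentage of the previous year's spend."""
    if previous == 0:
        raise ValueError("previous-year spend must be non-zero")
    return (current - previous) / previous * 100.0


def opex_change_by_category(previous: dict, current: dict) -> dict:
    """Per-category % change for categories present in both years, plus a 'Total' row."""
    shared = sorted(previous.keys() & current.keys())
    result = {name: pct_change(previous[name], current[name]) for name in shared}
    result["Total"] = pct_change(
        sum(previous[name] for name in shared),
        sum(current[name] for name in shared),
    )
    return result


# Hypothetical infrastructure OpEx figures, for illustration only.
last_year = {"compute": 400_000.0, "storage": 250_000.0, "network": 100_000.0}
this_year = {"compute": 340_000.0, "storage": 250_000.0, "network": 110_000.0}

for category, change in opex_change_by_category(last_year, this_year).items():
    print(f"{category}: {change:+.1f}%")
```

A negative value is a reduction. Reporting the total alongside the categories lets an executive see the headline number while a manager can see which category drove it.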

% SLA Compliance – Addresses adherence to service level agreements (SLAs) as well as savings through increased optimization and stability. It is also an indirect indication of customer satisfaction. These KPIs often relate to availability, transaction times, dedicated resources, and FTE-to-services ratios. The data sources are mostly resource utilization, performance / monitoring, and time accounting data.

% Compliance – Could be considered more on the soft dollar side, but we have consistently found that vendors are regularly auditing and fining their customers for licensing issues. Similarly, governments are penalizing companies for mismanaging customer data. These KPIs relate to the change, year over year, in passed audits and reduced penalties and fines. Input data sources are financial and audit systems.

% CSAT – Customer satisfaction is highly industry-related and overall is difficult to measure as most data is incident related. Periodic surveys addressing customer needs are a good approach to fill in the gaps. Other indicators are around customer service such as quick resolution and provisioning. In some industries, customers vote with their pocketbook. Satisfaction can be determined by the % of orders over % views, consistent % participation and % utilization.

Besides customer surveys, incident systems provide good empirical data in terms of the number of incidents, their reduction, improvement in how long it takes to resolve them, etc. Improvements in these areas lead to increased customer satisfaction.

% TTM – Speeding up business applications’ time to market is enabled through faster provisioning of resources and more agile development and release capabilities. Here, once again, data from the incident systems can indicate decreased time to market through the reduced time it takes to fulfill provisioning requests. Other sources may be PMO reports or Scrum reports where the speed of delivering applications and changes can be measured.

% Shadow IT – A key indicator of customer satisfaction and of IT’s effectiveness in addressing business needs is the reduction of non-IT-brokered costs, that is, IT resources that the lines of business (LOBs) employ directly, outside of IT’s knowledge, standards, and involvement. This is often difficult to measure given its stealthy nature, as it is often paid for through other LOB general ledger accounts. However, we have found that reviewing accounts payable data for the key shadow IT provisioning vendors, such as Amazon, Azure, Softlayer, Alibaba, TCS, and Cognizant, leads to insights into the level of shadow IT. Increased usage of internal IT resources together with a reduction in shadow IT would be a good indicator of successful benefit realization.
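The accounts payable review described above could be sketched roughly as follows. The record layout (vendor, cost center, amount), the cost center codes, and the amounts are hypothetical. The exact-name vendor match is a simplification, since real AP vendor names usually need normalizing first.

```python
from collections import defaultdict

# Vendors named in the text, lower-cased; adjust the list for your own environment.
SHADOW_IT_VENDORS = {"amazon", "azure", "softlayer", "alibaba", "tcs", "cognizant"}


def shadow_it_spend(ap_records, it_cost_centers):
    """Sum spend per watched vendor for payments NOT charged to an IT cost center.

    ap_records: iterable of (vendor, cost_center, amount) tuples (assumed layout).
    it_cost_centers: set of cost center codes owned by IT (assumed to be known).
    """
    totals = defaultdict(float)
    for vendor, cost_center, amount in ap_records:
        key = vendor.strip().lower()
        if key in SHADOW_IT_VENDORS and cost_center not in it_cost_centers:
            totals[key] += amount
    return dict(totals)


# Hypothetical AP extract, for illustration only.
records = [
    ("Amazon", "MKT-100", 12_000.0),
    ("Azure", "IT-001", 8_000.0),        # brokered by IT, so not shadow IT
    ("Cognizant", "FIN-200", 30_000.0),
    ("Amazon", "MKT-100", 3_000.0),
    ("Office Supplies Co", "MKT-100", 500.0),
]

print(shadow_it_spend(records, it_cost_centers={"IT-001"}))
```

Running the same query for each reporting period gives the trend: falling totals alongside rising internal IT usage would support the benefit realization story.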

Product Specific Examples

The executive-level success KPIs require data from management and operational sources. Let’s take the executive KPI of % SLA Compliance and determine how it can be composed of management level and operational level ones. SLAs usually include commitments around infrastructure, application workload availability, end-to-end transaction time, system health, capacity, and controlled costs. In examining sources for these metrics and operational KPIs we find that VMware vRealize Operations (vROPS) has a good deal of them, especially when combined with VMware ITSM integrations and Professional Services Transformation Consulting. The following tables have just a few selected KPIs and metrics from vROPS and ITSM services which can be used to report on SLAs.

Selected vROPS KPIs	Relevance
% Utilization	Activity of the environment over time shows usage. Gaps indicate availability issues
% Application Healthy	Takes identified application’s environments and measures overall performance and throughput. One-stop shopping for reporting on application-tagged resources.
% Datastore Healthy	Takes identified datastores and measures overall performance
% vSAN Healthy	Evaluates the performance and throughput of the storage environment
% Host Healthy	Evaluates the performance (memory, CPU, utilization, network) of the VM hosts
% VMs Healthy	Evaluates overall performance of VMs
% VMs undersized	Identifies VMs that may need more resources (CPU, memory, bandwidth) to perform better
% VMs oversized	Identifies VMs that are not using their resources and are therefore costing the consumer more
Cost per VM	The cost per average VM. This number can serve as an index into the efficiency of IT when compared to industry or commercial comparative rates.

Other, non-VMware sources for reporting on % SLA compliance are the service management system’s incident and problem records:

ITSM KPIs & Metrics	Relevance
% Performance Issues	Performance incidents reported by environment, application, etc.
% Application Issues, # Issues by Application	Application incidents which may have resulted in lack of availability and performance
% MTTR, TTR	Mean time to recover (MTTR) and time to recover (TTR) account for the gaps in application/workload availability.
% Available	Reported outages by environment, application
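To make the MTTR and % Available rows concrete, here is a minimal sketch that derives both from a list of outage windows. The function names and the outage data are illustrative assumptions. It also assumes the outages fall within the reporting period and do not overlap, which real incident data may not guarantee.

```python
from datetime import datetime, timedelta


def mean_time_to_recover(outages):
    """Average outage duration; outages is a list of (start, end) datetime pairs."""
    if not outages:
        raise ValueError("at least one outage is required")
    total = sum((end - start for start, end in outages), timedelta())
    return total / len(outages)


def percent_available(outages, period_start, period_end):
    """Share of the reporting period with no reported outage, as a percentage."""
    period = period_end - period_start
    downtime = sum((end - start for start, end in outages), timedelta())
    return (1 - downtime / period) * 100.0


# Hypothetical outages in a 30-day reporting period, for illustration only.
period_start = datetime(2024, 6, 1)
period_end = datetime(2024, 7, 1)
outages = [
    (datetime(2024, 6, 3, 9, 0), datetime(2024, 6, 3, 11, 0)),    # 2 hours
    (datetime(2024, 6, 17, 22, 0), datetime(2024, 6, 18, 2, 0)),  # 4 hours
]

print("MTTR:", mean_time_to_recover(outages))
print(f"Available: {percent_available(outages, period_start, period_end):.2f}%")
```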

The resulting formula for % SLA for infrastructure is straightforward, as it comes from the service management system’s % Available. However, the formula for an application’s SLA adherence is more complex. In the following approach one could, per the executive-level example, substitute BOSS, eComm, or apps in the brackets:

[Application] SLA compliance = % [Application] Healthy

[Application] SLA compliance = (Committed SLA time – down time due to % [Application] Issues – unscheduled down time due to % Environment Available) / total time.
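The second formula can be worked through with a small sketch. The function and parameter names are illustrative, all times are assumed to be in hours, and the result is multiplied by 100 to express it as a percentage, which the formula above leaves implicit.

```python
def application_sla_compliance(
    committed_hours: float,
    app_issue_downtime_hours: float,
    env_unscheduled_downtime_hours: float,
    total_hours: float,
) -> float:
    """(committed time - application downtime - environment downtime) / total time, in %."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    delivered = committed_hours - app_issue_downtime_hours - env_unscheduled_downtime_hours
    return delivered / total_hours * 100.0


# Hypothetical month for the BOSS application: 720 committed hours,
# 5 hours lost to application issues, 2 hours to unscheduled environment outages.
boss = application_sla_compliance(720.0, 5.0, 2.0, 720.0)
print(f"BOSS SLA compliance: {boss:.2f}%")
```

Keeping application downtime and environment downtime as separate inputs preserves the distinction the management dashboard relies on: whether a miss came from the application or from the infrastructure beneath it.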

Dashboard Designs

Executive

Executive dashboards are quick summaries that indicate status and answer some questions up front, as displayed in Figure 1. There, the Audits icon showed a warning status and a note indicating a $20k fine. Here is another example of an IT executive dashboard, focusing on storage for a large pharmaceutical company, in which summary images and current status are displayed. (Compliments to Michael Little.)

Management

As mentioned in the first blog’s best practices section, management dashboards should indicate directional progress of KPIs and metrics so that management can make decisions about what actions to take.

In this example, we have taken the executive KPIs mentioned earlier and drilled down into them in a 7-month, year-to-date dashboard. Note, for example, that in the SLA breakdown the back office systems (BOSS) and eCommerce applications are consistently below the SLA availability standards even though the infrastructure has been stable. A review of the incident tickets will probably reveal application error issues.

Operational/Practitioner

Operational-level dashboards can be very detailed, showing events as they take place minute by minute. Most products offer far more metrics than one would want to display or consume. Line managers of these systems need to select the ones that are meaningful and indicative of their services. Here are some relevant examples using vROPS and ITSM.

*Figure 4. Sample of vROPS Operations Level Cluster Dashboard*

*Figure 5. Sample Operations Level ITSM Dashboard*

Note the granularity of time. Any actions taken based on these dashboards are immediate and relate more to performance than to the longer-range decisions that the management dashboards inform.

Summary

In this last blog of our KPI series, we presented some real-world examples of what IT executive management tracks and how they can drill down to management and operational reporting. Not all of these items are directly tangible, but they may have indirect yet material impacts on cost and customer satisfaction. We selected some KPIs and metrics from VMware vRealize Operations as well as from VMware Professional Services Transformation Consulting practices and presented them on sample executive, management, and operational dashboards as examples of our best practices of understanding your audience and having actionable messaging.

VMware Professional Services can help you identify and understand how to incorporate KPIs and metrics in your existing processes. Contact your VMware representative to learn more.