My Top 15 vRealize Operations Super Metrics

The real power of vRealize Operations is unlocked when custom content (i.e. dashboards, views and reports) is created to address specific use cases. The most common question from customers is, “When do you need to create a custom dashboard, view or report, and what are the best practices for achieving the best results?” I suggest following a prescriptive approach for creating custom content: start by identifying the right audience, then the mode of presentation, then the object or resource that is the focus of the content, and finally the information to be published. Please refer to my TAM Customer Deep-Dive session at VMworld Europe 2019 in Barcelona (TAM3517E, TAM Customer Central) for detailed information on this. One of the most critical steps in this process is to identify the appropriate metric and, if it is not available, to look at the possibility of creating a super metric.

In this article, I will focus on how to create super metrics, from the simplest to some complex examples, with the relevant use case for each. Each of these was built from customer requirements and has therefore been tested and deployed in production.

What is a Super Metric?

A super metric is a mathematical formula that contains a combination of one or more metrics for one or more objects. Once you create a super metric, it persists over time and can be used for different interactions just like any other metric. It can be used in views, reports, dashboards and symptom definitions.

Super metrics are built from a list of operators and functions; the up-to-date list is available in the super metric functions and operators documentation. These can be further combined using conditional expressions like ‘where’ or ‘if-else’ (refer to Enhancing Super Metrics).

Over the last few releases, several enhancements have made it easier to create and work with super metrics. With autocomplete in the new editor, available from vRealize Operations 7.5, you simply start typing an object name or object type, and the Super Metric Editor pops up a list of matching options. You can keep typing to refine the list and then press Enter to select an option and finalize the super metric formula.

Checklist Before Creating a Super Metric

Before starting to create a super metric, please make sure the following points are identified:

  1. Objects or object types that are involved
  2. Metrics that need to be used
  3. How to combine the metrics (which operator, function or expression to use)
  4. Object type to which the super metric will be assigned
  5. Policy in which the super metric needs to be enabled

I have picked 15 sample super metrics that I have used and listed them below, starting with the simplest and covering the most commonly used operators, functions and expressions.

Let us go through each of these with use cases.

Super Metric # 1: A simple static value as output

Use case: VMware license usage dashboard. The customer needs a single dashboard to show the licenses currently available and in use for ESXi hosts, vCenter instances, VSAN, NSX, Horizon and vRealize Suite.

License usage for each of these is available in different portals, but we can find alternative ways to identify it as metrics in vRealize Operations. Examples are below:

  • vSphere licenses usage = Count of CPU sockets from vSphere World
  • vCenter licenses usage = Count of vCenter instances in vSphere world
  • VSAN licenses usage = Count of CPU Sockets on ESXi hosts filtered for VSAN cluster(s)
  • NSX licenses usage = Count of CPU Sockets on ESXi hosts filtered for VSAN cluster(s)
  • Horizon licenses usage = Count of sessions or users (based on Named User or CCU licenses)

If these need to be compared with the available licenses in the same dashboard, the available counts must also exist as metrics so that they can be published in a scoreboard widget. Since these are mostly static values, changing only infrequently as hardware is added or removed, we can create a super metric that provides a fixed value as output. This is the most straightforward super metric, as it simply outputs a numerical value.

Follow the steps below to create the super metric:

Step 1: Go to Administration > Configuration > Super Metrics. Select Add to create a new super metric.

Step 2: Start with a meaningful name and description for the super metric. You may also assign a unit from the drop-down list, if required (available only from v8.1).

Step 3: Create Formula. This is where you add the expressions that constitute the super metric. In this example, as we need just a constant value, simply type in the number.

Other options available on the page are:

  • Preview: Shows the values of a super metric against any object without needing to save and apply it.
  • Legacy: Switches to the template for creating a super metric formula without the suggestive text, as it was in v7.0 or older.

Step 4: Add the object(s) to which the super metric needs to be applied. In this case, we will use “vSphere World” as we want to view licenses for the entire environment.

Step 5: The last step in creating the super metric is to enable it in the relevant policy or policies.

Once the super metric is enabled, it takes one or two collection cycles (5-10 minutes) before it starts appearing for the object. For the assigned object (vSphere World in this case), you will see the super metrics under All Metrics > Super Metric.

By combining the above metrics and super metrics, a license usage dashboard can be created. A sample dashboard with the above super metrics is available at https://code.vmware.com/samples?id=7487

Super Metric # 2: Using the ‘Maximum’ or ‘max’ function

Use case: The customer needs to proactively monitor the performance of business-critical workloads and report to application owners. Here are the specifics of the SLA:

  1. The maximum CPU Ready % among the VMs within a custom group of VMs or a specific cluster needs to be reported.
  2. The maximum Memory Balloon % among the VMs within a custom group of VMs or a particular cluster needs to be reported.

A quick video, Creating Super Metric for Max CPU Ready%, walks through creating this super metric. The formulas are below:

Maximum CPU Ready% among Virtual Machines:

max(${adaptertype=VMWARE, objecttype=VirtualMachine, metric=cpu|readyPct, depth=3})

Maximum Memory Balloon% among Virtual Machines:

max(${adaptertype=VMWARE, objecttype=VirtualMachine, metric=mem|balloonPct, depth=3})
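To make the aggregation concrete, here is a minimal Python sketch of what a `max` super metric does conceptually: it looks at the metric value of every descendant VM and keeps the largest. The VM names and sample values are hypothetical, and skipping VMs that have no data is an assumption of this sketch, not a statement about how vRealize Operations treats missing samples.

```python
from typing import Iterable, Optional


def super_metric_max(values: Iterable[Optional[float]]) -> Optional[float]:
    """Return the largest value, skipping objects that reported no data."""
    collected = [v for v in values if v is not None]
    return max(collected) if collected else None


# Hypothetical CPU Ready % samples for the VMs in one custom group.
cpu_ready_pct = {
    "vm-app-01": 0.4,
    "vm-app-02": 2.7,
    "vm-db-01": 1.1,
    "vm-db-02": None,  # assumption: no data collected in this cycle
}

group_max = super_metric_max(cpu_ready_pct.values())
print(f"Max CPU Ready % in group: {group_max}")
```

The same function applies unchanged to Memory Balloon % values, which mirrors how only the metric name changes between the two formulas above.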

Super Metric # 3: SUM and Average functions

Use case: Proactive monitoring (capacity and performance) dashboards need to be published for specific application groups. The metrics to be published include the sum of vCPUs and memory provisioned on all virtual machines within a custom group.

Because virtual machine metrics are not available at the custom group level, super metrics need to be created to sum the vCPUs of all virtual machines within the custom group, and then be applied to the relevant “container” object.

Sum of vCPUs provisioned on all virtual machines in a custom group:

sum(${adaptertype=VMWARE, objecttype=VirtualMachine, metric=cpu|corecount_provisioned, depth=3})

Sum of memory provisioned (in GB) on all virtual machines in a custom group:

sum(${adaptertype=VMWARE, objecttype=VirtualMachine, metric=mem|guest_provisioned, depth=3})/1048576

Average CPU Usage% across all virtual machines in a custom group:

avg(${adaptertype=VMWARE, objecttype=VirtualMachine, metric=cpu|usage_average, depth=3})
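The arithmetic behind these three formulas can be sketched in Python as follows. This is only an illustration with hypothetical values; it assumes the provisioned memory metric is reported in KB, which is what the 1048576 divisor in the formula above implies.

```python
KB_PER_GB = 1024 * 1024  # 1048576, the divisor used in the memory formula


def total_vcpus(vcpus_per_vm):
    """Sum of provisioned vCPUs across all VMs in the group."""
    return sum(vcpus_per_vm)


def total_memory_gb(memory_kb_per_vm):
    """Sum of provisioned memory, converted from KB to GB."""
    return sum(memory_kb_per_vm) / KB_PER_GB


def average_cpu_usage(usage_pct_per_vm):
    """Arithmetic mean of CPU Usage % across the VMs (list must be non-empty)."""
    return sum(usage_pct_per_vm) / len(usage_pct_per_vm)


# Hypothetical custom group with three VMs (4 GB, 8 GB and 16 GB of memory).
print(total_vcpus([2, 4, 8]))
print(total_memory_gb([4194304, 8388608, 16777216]))
print(average_cpu_usage([10.0, 20.0, 60.0]))
```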

Super Metric # 4: Using the ‘where’ condition with a numeric filter

Use case: As part of infrastructure capacity planning and reporting, the team needs to know the count of virtual machines that have CPU Usage % greater than 60%.

count(${adaptertype=VMWARE, objecttype=VirtualMachine, metric=cpu|usage_average, depth=8, where=($value > 60)})

Note: above formula works with v8.1 and later only
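Conceptually, `count` with a `where` clause filters the collected values first and then counts what is left. A small Python sketch of that idea, using hypothetical sample values and a helper name (`count_where`) invented for this illustration:

```python
from typing import Callable, Iterable, Optional


def count_where(values: Iterable[Optional[float]],
                condition: Callable[[float], bool]) -> int:
    """Count the objects whose metric value satisfies the condition."""
    return sum(1 for v in values if v is not None and condition(v))


# Hypothetical CPU Usage % samples; a value of exactly 60 does not pass "> 60".
cpu_usage_pct = [12.0, 61.5, 60.0, 95.2]
busy_vms = count_where(cpu_usage_pct, lambda value: value > 60)
print(busy_vms)
```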

Super Metric # 5: Using the ‘where’ condition with a string filter

Use case: As part of infrastructure capacity planning and reporting, the team needs to know the count of virtual machines with a Windows-based operating system.

More examples of similar use cases are available at https://code.vmware.com/samples/6185

Super Metric # 6: Using the ‘where’ condition combined with another operator

Use case: The infrastructure team wants to proactively monitor performance and identify the relevant metrics as part of SLA reporting. One of the requirements is to identify the percentage of virtual machines with CPU Ready% higher than 1%.

Using the ‘where’ condition, we can create a super metric to count the number of virtual machines which have CPU Ready% greater than 1% and then find the percentage of this count against the total number of virtual machines.
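The calculation described here, a filtered count divided by the total count and scaled to a percentage, can be sketched in Python. This is not the super metric formula itself, only the arithmetic behind it; the sample values are hypothetical, and returning 0 for an empty group is an assumption of the sketch.

```python
def percent_of_vms_above(cpu_ready_pct, threshold=1.0):
    """Percentage of VMs whose CPU Ready % exceeds the threshold."""
    total = len(cpu_ready_pct)
    if total == 0:
        return 0.0  # assumption: report 0 when the group has no VMs
    matching = sum(1 for value in cpu_ready_pct if value > threshold)
    return matching * 100.0 / total


# Hypothetical CPU Ready % samples for four VMs; two of them exceed 1%.
print(percent_of_vms_above([0.2, 1.5, 3.0, 0.9]))
```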

Super Metric # 7: Using the ‘where’ condition with a numeric value combined with additional conditions

Use case: Find the count of virtual machines that are heavily utilized for either CPU or memory. The super metric below counts virtual machines with CPU usage% greater than 70% OR memory usage% greater than 60%:

count(${adaptertype=VMWARE, objecttype=VirtualMachine, metric=cpu|usage_average, depth=8, where=($value > 70 || ${metric=mem|usage_average } > 60)})
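The point worth noting in this formula is that each VM is evaluated once against both conditions, so a VM that breaches both limits is still counted only once. A short Python sketch of that behavior, with hypothetical VM names and values:

```python
from dataclasses import dataclass


@dataclass
class VmSample:
    name: str
    cpu_usage_pct: float
    mem_usage_pct: float


def count_heavily_utilized(vms, cpu_limit=70.0, mem_limit=60.0):
    """Count VMs that exceed the CPU limit OR the memory limit."""
    return sum(
        1 for vm in vms
        if vm.cpu_usage_pct > cpu_limit or vm.mem_usage_pct > mem_limit
    )


vms = [
    VmSample("vm-01", 85.0, 40.0),  # CPU only
    VmSample("vm-02", 30.0, 75.0),  # memory only
    VmSample("vm-03", 90.0, 90.0),  # both, still counted once
    VmSample("vm-04", 20.0, 20.0),  # neither
]
print(count_heavily_utilized(vms))
```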

Super Metric # 8: Using the ‘where’ condition with a string value combined with additional conditions

Use case: As part of infrastructure capacity planning and reporting, the customer wants to count the number of Windows-based virtual machines that are powered on.

Note: above formula works with v8.1 and later only

Super Metric # 9: Using the ‘where’ condition with multiple conditions for the same metric

Use case: The customer wants to count the number of virtual machines that are neither Windows-based nor Red Hat-based.
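The logic of this use case is that both exclusions must hold at the same time: a VM is counted only if its guest OS is not Windows AND not Red Hat. The Python sketch below illustrates only that logic; the case-insensitive substring matching and the sample guest OS strings are assumptions made for the example, not the super metric syntax.

```python
def count_other_guest_os(guest_os_names, excluded=("windows", "red hat")):
    """Count VMs whose guest OS name contains none of the excluded terms."""
    return sum(
        1 for name in guest_os_names
        if not any(term in name.lower() for term in excluded)
    )


# Hypothetical guest OS strings for four VMs.
guest_os = [
    "Microsoft Windows Server 2019 (64-bit)",
    "Red Hat Enterprise Linux 8 (64-bit)",
    "Ubuntu Linux (64-bit)",
    "CentOS 7 (64-bit)",
]
print(count_other_guest_os(guest_os))
```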

Super Metric # 10: Using the conditional expressions “if” and “else”

Use case: The customer needs to be notified when the vCPU count changes on specific virtual machines. They wanted a super metric that checks the count of provisioned vCPUs and returns “1” if it is equal to 4 and “0” if it is not. An alert definition was created to trigger when this super metric returns “0”, which means that the vCPU count has changed.

count(${this, metric=cpu|corecount_provisioned, depth=1, where= ($value == 4)}) ? 1 : 0

Note: above formula works with v8.1 and later only
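Read step by step, the formula first counts whether the VM itself matches the expected vCPU count and then maps that result to 1 or 0 with the `? :` expression. A minimal Python sketch of the same two steps follows; it assumes the inner count evaluates to 0 when the condition is not met, and the function name is hypothetical.

```python
def vcpu_unchanged_flag(provisioned_vcpus, expected=4):
    """Return 1 while the vCPU count matches the expected value, else 0."""
    # Step 1: count(... where $value == 4) on the VM itself.
    # Assumption: this yields 1 on a match and 0 otherwise.
    matches = 1 if provisioned_vcpus == expected else 0
    # Step 2: "condition ? 1 : 0" maps a non-zero count to 1.
    return 1 if matches else 0


print(vcpu_unchanged_flag(4))  # vCPU count as expected
print(vcpu_unchanged_flag(8))  # vCPU count changed, alert should trigger
```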

Super Metric # 11: Using the conditional expressions “if” and “else” combined with other operators

Use case: With the Rightsizing feature, vRealize Operations provides the vCPUs or memory to be removed or added, depending on whether the virtual machine is oversized or undersized. The customer then needs to add this value to, or subtract it from, the provisioned vCPU or memory to get the “Actual Recommended vCPU or Memory”. For management reports and dashboards, it is better to show this actual recommended value rather than the resources to be removed.

The logic below was used to create a super metric for the “Actual Recommended Value” (this is not the actual super metric formula):

If the value of Recommended vCPUs to add is equal to 0,

then Actual Recommended vCPU = Provisioned vCPUs - Recommended vCPUs to remove (as an oversized VM),

or else Actual Recommended vCPU = Provisioned vCPUs + Recommended vCPUs to add (as an undersized VM)

Refer to https://code.vmware.com/samples/6645 and https://code.vmware.com/samples/6709 for more details on this.
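The if/else logic described above translates directly into a few lines of Python. This is a sketch of the logic only, not the super metric formula; the function and parameter names are hypothetical.

```python
def actual_recommended_vcpus(provisioned, to_add, to_remove):
    """Apply the rightsizing recommendation to the provisioned vCPU count."""
    if to_add == 0:
        # Nothing to add: treat the VM as oversized and subtract.
        return provisioned - to_remove
    # Otherwise the VM is undersized and vCPUs are added.
    return provisioned + to_add


print(actual_recommended_vcpus(provisioned=8, to_add=0, to_remove=4))  # oversized VM
print(actual_recommended_vcpus(provisioned=2, to_add=2, to_remove=0))  # undersized VM
```

A VM that is already right-sized has both recommendations at 0, so it falls into the first branch and keeps its provisioned value.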

Super Metric # 12: Complex super metric combining multiple operators and conditions

Use case: Finding VM Availability %. As Iwan Rahabok mentions in his blog, calculating the availability of a VM is very complicated. The formula below uses functions like ‘ceil’ (the smallest integer that is greater than or equal to a given value), ‘min’ (the minimum of the given values) and ‘max’ (the maximum of the given values):

${this,  metric=sys|poweredOn} * ceil ( min ([ max ([ ${this, metric=sys|osUptime_latest} ,${this, metric=mem|nonzero_active} ,${this, metric=virtualDisk:Aggregate of all instances|commandsAveraged_average, depth=1} ,${this, metric=net|usage_average} ]) , 1 ]) ) * min ([ 300 ,   ${this, metric=sys|osUptime_latest} ]) / 3

The detailed clarification on calculating VM availability is available at Iwan Rahabok’s blog:

http://virtual-red-dot.info/vm-availability-monitoring
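To see how the pieces of the VM availability formula interact, it can help to evaluate it outside the product. The Python sketch below follows the formula term by term with hypothetical input values: the max/min/ceil part collapses any sign of activity to 1 (or 0 if every metric is zero), and the last part caps uptime at 300 seconds and divides by 3 to give a percentage of a 5-minute interval.

```python
import math


def vm_availability_pct(powered_on, os_uptime_s, mem_active, disk_commands, net_usage):
    """Evaluate the VM Availability % formula for one collection cycle."""
    # max([...]) picks the strongest activity signal among the four metrics.
    activity = max(os_uptime_s, mem_active, disk_commands, net_usage)
    # ceil(min([activity, 1])) becomes 1 for any activity above 0, else 0.
    is_up = math.ceil(min(activity, 1))
    # min([300, uptime]) / 3 turns up to 300 seconds of uptime into 0-100 %.
    return powered_on * is_up * min(300, os_uptime_s) / 3


print(vm_availability_pct(1, 86400, 512.0, 10.0, 3.0))  # running all cycle
print(vm_availability_pct(1, 150, 512.0, 10.0, 3.0))    # rebooted mid-cycle
print(vm_availability_pct(0, 0, 0.0, 0.0, 0.0))         # powered off
```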

Based on the same considerations, I have also created a super metric for Host Availability%:

${this,metric=sys|poweredOn} * ceil ( min ([ max ([ ${this, metric=sys|uptime_latest} , ${this, metric=mem|workload} , ${this, metric=disk:Aggregate of all instances|commandsAveraged_average} , ${this, metric=net|usage_average} ]) , 1 ]) ) * min ([ 300 , ${this, metric=sys|uptime_latest} ]) / 3

(Note: The above formula follows the same logic as used for Virtual Machine Availability%. There could be different views on how to look at Host Availability%, such as adding checks for whether any VM is running on the host or whether the host is in maintenance mode. Here it checks for CPU, memory, disk and network usage on the host and, if the host is identified as up, uses the uptime metric to calculate Availability% for a 5-minute interval. I am using this formula primarily to demonstrate how super metric formulas can be adapted to apply to different objects.)

Super Metric # 13: More complex super metric adding an if-else condition to the previous metric

Use case: The customer uses the super metric from the previous example to find Host Availability % and relies on it for SLA reporting. When hosts are shut down for regular maintenance, it reports zero availability even though the outage was planned, which impacts their overall SLA values. They therefore do not want the zero value to count when a host is shut down for maintenance.

For this use case, a custom group was created for hosts that are placed in maintenance mode, and a custom property with the value “InMaintenance” is assigned on this group, so any host moved into this custom group has this custom property set. The formula checks for this property: if the host has the custom property value, it returns Availability % as 100; if not, it uses the actual formula to calculate Availability %.
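The override can be pictured as a thin wrapper around the availability calculation. The Python sketch below shows only that decision; the function and parameter names are hypothetical, and the computed availability is passed in as a plain number rather than derived from host metrics.

```python
from typing import Optional


def host_availability_for_sla(maintenance_property: Optional[str],
                              computed_availability_pct: float) -> float:
    """Report 100 % for hosts flagged as in planned maintenance."""
    if maintenance_property == "InMaintenance":
        return 100.0  # planned outage: do not penalize the SLA
    return computed_availability_pct


# Hypothetical host that is shut down for planned maintenance.
print(host_availability_for_sla("InMaintenance", 0.0))
# Hypothetical host that went down without being flagged.
print(host_availability_for_sla(None, 0.0))
```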

Super Metric # 14: Super metric combining the ‘where’ condition and ‘if-else’

Use case: This was a unique case where a customer had outsourced their IT operations, and the infrastructure admins wanted to monitor VM reboots done by their managed services team. They wanted to be alerted when a VM has VM Tools running but its uptime is less than 5 minutes.

To achieve this, we created a super metric that checks the VM Tools running status: if it is running, it returns the value of OS uptime; if not, it returns zero.

Note: above formula works with v8.1 and later only

This super metric is then used to create a symptom definition and an alert that triggers if the value is less than 300 seconds (5 minutes).

Super Metric # 15: Super metrics using operators on other super metrics

Use case: The customer wants to create a sustainability dashboard to show power consumption and carbon emissions before and after virtualization. They also want to showcase how many trees need to be planted to offset the environmental damage done through carbon emissions from idle VMs.

vRealize Operations has power metrics, which were used to create the super metrics required to build these dashboards. Detailed information is available in my blog post, “Sustainability dashboards in vRealize Operations – Find out how much did you contribute to a Greener Planet ?”

I created my first super metric in 2012 when one of my customers came up with a specific use case. Thanks to E1 (Iwan Rahabok) for his blog articles and for being my mentor. Over the last 8 years, I have created hundreds of super metrics, many of them with guidance from the experts in the product team: Brandon Gordon, John Dias and Artavazd Amirkhanyan. Thank you for your guidance and support in helping me use the capabilities of super metrics extensively.

All of the above super metrics were presented in a recent TAM Lab session. (VMware TAM Labs provide in-depth technology workshop sessions led by VMware Technical Account Managers (TAMs) to enable a culture of learning and partnership across the VMware organization and our customers.) You can find the recording of this session on the TAM Lab YouTube channel: TAM Lab 073 – Learn Super Metrics and be a vRealize Operations Hero

In Summary

Super metrics are one of the most powerful features of vRealize Operations and help expand the use cases that can be addressed. Super metrics are built from mathematical expressions; how simple or complex they are depends on the use case. So please do not stop at the out-of-the-box metrics; explore the options with super metrics as you require.

You may find the repository of super metrics in https://code.vmware.com/samples

Learn super metrics and be a vRealize Operations Hero!!!

Varghese Philipose is currently working as a Staff Technical Account Manager in the METNA region, based in Dubai, and has been part of the team since 2012. He has been a member of the CTO Ambassador Program under the Office of the CTO within VMware since 2018. He is also a member of the vROps Ambassador Program 2020 and a vExpert Cloud Management 2020. He was a speaker at VMworld Europe 2018 and 2019. His certifications include VCAP-DCV 2021, VCP on Datacenter Virtualization, Network Virtualization, Cloud Management & Desktop Virtualization, vExpert (2019-2021), vRealize Operations Specialist 2019 & Cloud Services Provider Specialist 2019. He is very passionate about working on vRealize Operations and leads an internal program to drive adoption and value realization with vRealize Operations among TAM customers in EMEA. Recognized as a specialist on vRealize Operations, he has delivered several workshops and sessions for customers in APJ & EMEA along with the Product Management team.

