vCenter Operations Unplugged – Understanding Workload & Health

Virtualization Admins have a tough job.  Every day they face tough questions like: Is there a performance problem in my virtual infrastructure (VI)? Is it an ESX population issue? Is it a single VM gone haywire? What VMs are being affected? If you’re a VI Admin then you know what I’m talking about when I say these are the questions that start flying when you’re pulled into a large conference room during a service outage and all the finger pointing begins.

vCenter Operations has a lot of great value props going for it, and now that it has won the Best of Interop (Overall) and Cloud Computing Category Winner awards, I am seeing even more interest in it. So I am starting this ‘vCenter Operations Unplugged’ series of posts to explain the cool features in the product, and hopefully to answer the many questions I’m fielding every day at the same time. In this first post, let’s start with the Workload and Health scores and how you can use them together to understand the health and wellbeing of your VI.

No one will argue with the fact that vCenter Server is great for gathering virtual infrastructure data/metrics.  The question is, what do they mean?  A pile of metrics alone does not make identifying and troubleshooting a performance problem very easy.  vCenter Operations takes thousands of metrics from vCenter Server and bubbles them up into 3 actionable, higher-level super-metrics, or badges: Workload, Health and Capacity.  These are critical pieces of info that would help any VI admin.

Capacity is pretty simple.  How much time do I have left before I run out of a given resource based on my usage trends?  Valuable, but not what I want to focus on today.

Let’s drill down and talk about Workload.  Workload, simply put, is a measurement showing the ratio of the resource demand of a virtual object (VM, ESX, Cluster, etc.) versus the amount of resources it can obtain.  The resulting ratio is a score that helps you understand how hard a virtual object is working.  If the Workload score is low (e.g., 20), the object has plenty of resources at its disposal.  The higher the Workload score, the closer the virtual object gets to running out of the resources it needs.  A Workload score above 100 (yes, it can be >100) means the virtual object is starving for resources (CPU, Memory, Network I/O or Storage I/O).
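
To make the ratio concrete, here is a minimal Python sketch of the idea. It is not the product's actual formula. The function names are hypothetical, it assumes the score is demand expressed as a percentage of what the object can obtain, and it assumes (purely for illustration) that an object's overall score follows its most constrained resource.

```python
def workload_score(demand: float, obtainable: float) -> float:
    """Demand as a percentage of the resources the object can obtain.

    Assumption: a plain demand/obtainable ratio scaled to 100.
    """
    if obtainable <= 0:
        raise ValueError("obtainable must be positive")
    return 100.0 * demand / obtainable


def object_workload(resources: dict) -> float:
    """Overall score for one object.

    Assumption (illustrative only): the most constrained resource
    drives the object's score, so we take the maximum.
    `resources` maps a resource name to a (demand, obtainable) pair.
    """
    return max(workload_score(d, o) for d, o in resources.values())


# Hypothetical VM: CPU in MHz, memory in MB.
vm = {
    "cpu": (500, 2500),      # lightly loaded
    "memory": (3000, 2500),  # wants more than it can get
}

print(workload_score(*vm["cpu"]))     # 20.0
print(workload_score(*vm["memory"]))  # 120.0
print(object_workload(vm))            # 120.0
```

In this made-up example the CPU score of 20 says the VM has plenty of CPU, while the memory score of 120 says it is asking for more memory than it can obtain, which is the "starving" case described above.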

Health is a bit unique.  It measures how normal a virtual object is acting based on previous behaviors and grades it on a 0-100 scale.  A Health score of 100 means everything is behaving normally.  A lower Health score indicates the virtual object is acting abnormally (not necessarily unhealthily) and might need your attention.  I’ll go into the details of how Health is calculated in an upcoming blog entry because it is really very interesting, but for now let’s leave it at that.

You can use these two super-metrics together to drive best practices in your VI.

  • High Workload/Low Health – A virtual object is starving for resources and is acting abnormally (e.g., it usually has plenty of resources at its disposal at this time).  This situation requires your immediate attention because services that are normally running smoothly are being affected.
  • Normal Workload/Low Health – A virtual object has plenty of resources, but something about it is acting differently than usual (e.g., someone may have changed the VM configuration).  This may not be a problem, but it definitely requires further investigation within vCenter Operations to determine what actions to take.
  • High Workload/High Health – A virtual object is running hot and may be starving for resources, but the Health is high, so this is normal behavior for this object during this part of the day (e.g., a scheduled batch job is running, or it’s morning and all of the users are starting to log in for the day).  Users may have been complaining about system response times, or they may have just gotten used to the speeds provided.  Regardless, you should start to think about providing more resources.
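
The three combinations above can be sketched as a small triage function in Python. This is only an illustration of the reasoning, not something vCenter Operations exposes. The function name and the cutoff values (a Workload of 80 counting as "high", a Health below 50 counting as "low") are assumptions I picked for the example.

```python
def triage(workload: float, health: float,
           high_workload: float = 80.0,  # hypothetical cutoff
           low_health: float = 50.0) -> str:  # hypothetical cutoff
    """Map a Workload/Health pair to the action suggested in the list above."""
    busy = workload >= high_workload
    abnormal = health < low_health

    if busy and abnormal:
        # High Workload / Low Health
        return "immediate attention"
    if abnormal:
        # Normal Workload / Low Health
        return "investigate further"
    if busy:
        # High Workload / High Health
        return "plan for more resources"
    # Normal Workload / High Health: nothing unusual going on
    return "no action needed"


print(triage(workload=120, health=30))  # immediate attention
print(triage(workload=20, health=30))   # investigate further
print(triage(workload=95, health=90))   # plan for more resources
print(triage(workload=20, health=95))   # no action needed
```

The order of the checks matters: the combined high Workload and low Health case is tested first, so it is not mistaken for one of the single-condition cases.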

Hope that’s a helpful first post to kick off this ‘vCenter Operations Unplugged’ series. If you want to download a free trial version of vCenter Operations, just click here: https://www.vmware.com/tryvmware/?p=vcenter-ops&lp=1. As always, your feedback is most welcome.