Memory metrics for virtual machines have changed in recent releases of vRealize Operations to make it easier to manage your SDDC. In this post, I will explain what changed and how the changes help you. To start things off, let’s define a couple of important metrics.
By definition, Active Memory is the “amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages.” Since a virtual machine isn’t constantly touching every memory page, this metric essentially represents how aggressively the virtual machine is using the memory allocated to it. What this means is that memory utilization, as seen from within the guest OS via Task Manager or top, will almost always be greater than Active Memory. Refer to Understanding vSphere Active Memory if you want to read a more detailed explanation of Active Memory. In vRealize Operations, this metric is called Memory|Non Zero Active (KB).
By definition, Consumed Memory is the “amount of host physical memory consumed by a virtual machine, host, or cluster.” The same source also states that “VM consumed memory = memory granted – memory saved due to memory sharing.” What this means is that Consumed Memory can include memory that the guest OS considers free. If you compare Task Manager or top to Consumed Memory, you will see that Consumed Memory is almost always larger. In vRealize Operations, this metric is called Memory|Consumed (KB).
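As a quick illustration of the Consumed Memory definition quoted above, here is a minimal sketch in Python. The function name and the sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from a real VM.

```python
def consumed_memory_kb(granted_kb: int, saved_by_sharing_kb: int) -> int:
    """VM consumed memory = memory granted - memory saved due to memory sharing."""
    return granted_kb - saved_by_sharing_kb


# Illustrative values only (not measured from a real VM).
granted_kb = 8_000_000
saved_by_sharing_kb = 500_000

print(consumed_memory_kb(granted_kb, saved_by_sharing_kb))  # 7500000
```

Nothing in this calculation asks the guest OS which pages it considers free, which is why Consumed Memory tends to read higher than Task Manager or top.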
Here is a screenshot comparing Task Manager with Memory|Non Zero Active (KB) and Memory|Consumed (KB).
Now that we know what Active and Consumed memory are and how they relate to what the guest OS shows, it’s time for a short history lesson of vRealize Operations (and you thought you were done with history after high school).
vRealize Operations 6.6.1 and Older
vRealize Operations 6.6.1 and earlier relied on Active Memory when calculating utilization and demand. This meant that memory utilization always appeared lower than what you see in the guest OS. Active Memory was also used by the capacity engine, so sizing recommendations were based on Active Memory as well. As a result, the recommendations were usually quite aggressive.
vRealize Operations 6.3
The release of vRealize Operations 6.3 brought support for collecting in-guest metrics via VMware Tools. These metrics weren’t used by any vRealize Operations content (yet), but they were available for you to use. This was an awesome addition because it gave additional visibility into the guest’s perspective without needing another agent. As you can see from the list of metrics below, this meant memory utilization was now available. Note that not all of these metrics are collected by default, but you can enable the ones you need using policies.
Guest metrics added in vRealize Operations 6.3:
- Guest|Active File Cache Memory (KB)
- Guest|Context Swap Rate per second
- Guest|Free Memory (KB)
- Guest|Huge Page Size (KB)
- Guest|Needed Memory (KB)
- Guest|Page In Rate per second
- Guest|Page Out Rate per second
- Guest|Page Size (KB)
- Guest|Physically Usable Memory (KB)
- Guest|Remaining Swap Space (KB)
- Guest|Total Huge Pages
vRealize Operations 6.7
The release of vRealize Operations 6.7 was a milestone because it improved usability and simplified how you use the product. There are a few critical changes related to memory monitoring, such as a brand-new capacity engine and the elimination of redundant and unnecessary metrics. The most important change related to memory metrics is that the product now uses the Guest|Needed Memory (KB) metric, one of the guest metrics collected via VMware Tools that were added in vRealize Operations 6.3. This change was made to greatly improve the quality of projections from the capacity engine as well as rightsizing.
There are some situations where the guest memory metrics can’t make it to vRealize Operations, such as VMware Tools not being installed or an unsupported version of VMware Tools running in the guest. Because the data may not always be available, Consumed Memory is used as the fallback. Consumed Memory was selected as the fallback metric because, as shown above, it’s more conservative than Active Memory. The primary metrics affected by these changes are Memory|Usage (%) and Memory|Utilization (KB).
Typically, you would see that Guest|Needed Memory (KB) and Memory|Utilization (KB) are nearly identical (unless there is an issue collecting the metric from VMware Tools). If there is an issue collecting Guest|Needed Memory (KB), you will see that it correlates with Memory|Consumed (KB) instead.
Memory|Utilization (KB) is the metric used by the capacity engine and therefore rightsizing recommendations. As you can see, it’s advantageous to ensure that Guest|Needed Memory (KB) is collecting from VMware Tools to get the best quality recommendations.
By now, I’m sure you’re wondering about the actual formula used.
If guest metrics from VMware Tools are collecting:
Memory|Utilization (KB) = Guest|Needed Memory (KB) + ( Guest|Page In Rate per second * Guest|Page Size (KB) ) + Memory|Total Capacity (KB) – Guest|Physically Usable Memory (KB)
If guest metrics from VMware Tools are not collecting:
Memory|Utilization (KB) = Memory|Consumed (KB)
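To make the two cases concrete, here is a minimal Python sketch of the formula above. The function and parameter names are my own, and treating any missing guest value as “guest metrics are not collecting” is a simplifying assumption, not a description of how vRealize Operations decides internally.

```python
from typing import Optional


def memory_utilization_kb(
    consumed_kb: float,
    total_capacity_kb: float,
    needed_kb: Optional[float] = None,
    page_in_rate_per_sec: Optional[float] = None,
    page_size_kb: Optional[float] = None,
    physically_usable_kb: Optional[float] = None,
) -> float:
    """Sketch of Memory|Utilization (KB) as described in the text."""
    guest_values = (needed_kb, page_in_rate_per_sec, page_size_kb, physically_usable_kb)

    # Assumption: if any guest metric is missing, fall back to Memory|Consumed (KB).
    if any(value is None for value in guest_values):
        return consumed_kb

    return (
        needed_kb
        + page_in_rate_per_sec * page_size_kb
        + total_capacity_kb
        - physically_usable_kb
    )


# Illustrative values only (not taken from a real VM).
with_guest = memory_utilization_kb(
    consumed_kb=7_500_000,
    total_capacity_kb=8_388_608,
    needed_kb=3_000_000,
    page_in_rate_per_sec=10,
    page_size_kb=4,
    physically_usable_kb=8_200_000,
)
without_guest = memory_utilization_kb(consumed_kb=7_500_000, total_capacity_kb=8_388_608)

print(with_guest)     # 3188648
print(without_guest)  # 7500000
```

With these sample numbers, the guest-based value is less than half of the Consumed Memory fallback, which shows how much a missing VMware Tools metric can inflate what the capacity engine sees.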
vRealize Operations 7.0, 7.5, and 8.0
With the release of vRealize Operations 7.0, there was a tweak made to Memory|Usage (%) based on customer feedback. Memory|Usage (%) still prefers Guest|Needed Memory (KB) from VMware Tools, but it now falls back to Memory|Non Zero Active (KB) if that metric is not available. This change allows you to use Memory|Usage (%) to show an aggressive percentage and Memory|Workload (%) to show a conservative percentage in dashboards and reports.
Memory|Utilization (KB) remains unchanged from vRealize Operations 6.7. Memory|Utilization (KB) is still the metric used by the capacity engine and rightsizing recommendations. Again, it’s important to ensure that Guest|Needed Memory (KB) is collecting from VMware Tools to get the best quality recommendations from vRealize Operations.
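The version history so far can be summarized as a small lookup. The Python sketch below only restates which metric each release bases Memory|Usage (%) and Memory|Utilization (KB) on, as described in this post. The function names are hypothetical, and nothing here calls a vRealize Operations API.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a version string such as '6.6.1' into a comparable tuple (6, 6, 1)."""
    return tuple(int(part) for part in version.split("."))


def basis_metric(metric: str, vrops_version: str, guest_metrics_available: bool) -> str:
    """Return the metric that `metric` is based on, per the history in this post."""
    version = parse_version(vrops_version)

    # 6.6.1 and earlier: utilization and demand relied on Active Memory.
    if version < (6, 7):
        return "Memory|Non Zero Active (KB)"

    # 6.7 and later: prefer the guest metric collected via VMware Tools.
    if guest_metrics_available:
        return "Guest|Needed Memory (KB)"

    # 7.0 and later: only Memory|Usage (%) falls back to Active Memory.
    if metric == "Memory|Usage (%)" and version >= (7, 0):
        return "Memory|Non Zero Active (KB)"

    # Otherwise the fallback is Consumed Memory.
    return "Memory|Consumed (KB)"


print(basis_metric("Memory|Usage (%)", "6.7", False))        # Memory|Consumed (KB)
print(basis_metric("Memory|Usage (%)", "7.5", False))        # Memory|Non Zero Active (KB)
print(basis_metric("Memory|Utilization (KB)", "8.0", False)) # Memory|Consumed (KB)
```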
Now that you know the history, I’m sure you’re wondering how to ensure it’s working optimally. As you can see, there are many components needed for the feature to work. It’s important to ensure each of these requirements is met.
- vCenter Server 6.0 U1, vCenter Server 6.5 U3, vCenter Server 6.7 U3, or newer
- ESXi 6.0 U1 or newer
- Ensure the vRealize Operations Manager VMware vSphere adapter credentials have the Performance > Modify intervals privilege enabled in the target vCenter(s). See Minimum Collection User Permissions in vRealize Operations Manager 6.x and later for more information.
- VMware Tools 10.3.2 Build 10338 or newer for Windows
- VMware Tools 9.10.5 Build 9541 or newer for 64-bit Linux
- VMware Tools 10.3.5 Build 10341 or newer for 32-bit Linux
- vCenter Server 6.0 U1, vCenter Server 6.5 older than U3, and vCenter Server 6.7 older than U3 may require disconnecting and reconnecting the host from vCenter as mentioned in KB 55675
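The VMware Tools minimums in the list above are easy to check in a script. The Python sketch below is one way to do it. The guest family keys (`windows`, `linux64`, `linux32`) and function names are hypothetical. The numbers listed after “Build” match the integer encoding major * 1024 + minor * 32 + patch; that the list uses this encoding is my inference from the arithmetic, not something the list itself states.

```python
from typing import Dict, Tuple

# Minimum VMware Tools versions from the requirements list above.
# The dictionary keys are hypothetical labels chosen for this sketch.
MIN_TOOLS_VERSION: Dict[str, Tuple[int, int, int]] = {
    "windows": (10, 3, 2),
    "linux64": (9, 10, 5),
    "linux32": (10, 3, 5),
}


def tools_version_number(major: int, minor: int, patch: int) -> int:
    """Encode a Tools version as an integer (assumed encoding, see lead-in)."""
    return major * 1024 + minor * 32 + patch


def tools_meets_minimum(guest_family: str, tools_version: str) -> bool:
    """Check a dotted version string such as '10.3.10' against the minimum."""
    installed = tuple(int(part) for part in tools_version.split("."))
    return installed >= MIN_TOOLS_VERSION[guest_family]


# The encoded minimums line up with the numbers in the requirements list.
for family, version in MIN_TOOLS_VERSION.items():
    print(family, tools_version_number(*version))

print(tools_meets_minimum("windows", "10.3.10"))  # True
print(tools_meets_minimum("linux32", "10.3.2"))   # False
```

Comparing tuples of integers rather than strings avoids the classic mistake of ranking 10.3.10 below 10.3.2.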
I realize the list of requirements is long and can be challenging to track in large environments. To help, I’ve created a dashboard that identifies VMs that aren’t collecting memory metrics from the guest OS. You can find the dashboard along with install instructions on the vRealize Operations Dashboard Sample Exchange site.
Here’s to better managing your memory and your capacity!
Edit (February 6, 2019): Added Performance > Modify intervals privilege requirement for VMware vSphere adapter credentials.
Edit (July 11, 2019): Added vCenter Server 6.5 U3 to Validation section as not needing the workaround mentioned in KB 55675.
Edit (September 12, 2019): Added vCenter 6.7 U3 to Validation section as not needing the workaround mentioned in KB 55675.
Edit (December 11, 2019): Added vRealize Operations 8.0.