In my last article in this series, Understanding and Configuring Alerts, I covered the various types of alerts and how you can respond to them in vCenter Operations Manager. One of the many types of alerts you could receive in vC Ops is a capacity alert. Capacity alerts tell you that you are running low on some form of virtual infrastructure resource such as CPU, memory, or storage I/O. But what if, instead of waiting for an alert, you just wanted to proactively review the virtual infrastructure capacity currently available and how long vCenter Operations Manager predicts it will last before it is depleted? To gain insight into capacity planning with vCenter Operations Manager, read on!
Capacity Planning Worst-Case Scenario
Before we delve into how vC Ops can help you perform capacity planning, it’s important to first consider – what would happen if we DIDN’T do capacity planning? Of course, the obvious answer is that you would run out of capacity but what happens then? Many of us don’t think about it too much unless we’ve had the unfortunate experience of trying to figure out what happened to a virtual infrastructure that “just stopped working one day”.
Figure 1 – Capacity Planning and Analysis Tabs
With virtualization, we have smartly abstracted and pooled our resources, and we are on our way to automation (at most companies). We gain far greater utilization and efficiency from our pooled compute, storage, and network resources. However, if we run out of any of those resources (or sub-resources like CPU, memory, storage capacity, or storage throughput), we risk degraded performance and, potentially, a complete datacenter-wide outage with downtime for ALL applications. Of course, that is the worst-case scenario where, say, your infrastructure is just the “right size” for you to store all virtual machines on a single SAN LUN, and that LUN runs out of storage capacity, affecting all VMs. Or, you run all VMs across a cluster of 3 hosts and memory utilization reaches the point where applications degrade until they are unusable (more likely, a single critical shared resource like DNS, a shared file system, or a shared database server degrades to the point of making other applications unusable). These scenarios remind us not to “keep all our eggs in one basket”.
However, vSphere offers numerous ways to prevent system-wide performance degradation and outages. Examples include built-in alarms that fire when storage reaches critical thresholds (although they don’t send an email by default), the distributed resource scheduler, resource pools, and memory optimization techniques (transparent page sharing, ballooning, swapping, and compression). Plus, in most virtual infrastructures of a moderate size, there are multiple SAN LUNs or arrays in use, multiple hosts, multiple clusters, and perhaps multiple resource pools (of course, best practice is to create a more isolated management cluster to ensure management always stays up).
In reality, in larger environments, you’ll rarely experience the “worst-case scenario” but, more likely, you’ll experience performance degradation of specific applications and, potentially, even downtime for critical tier-1 applications.
We don’t want any of those things, do we? And it’s our job to prevent them by using an intelligent capacity planning tool like vCenter Operations Manager.
Introduction to Capacity Planning With vCenter Operations Manager From the World View
Thankfully, vCenter Operations Manager can make vSphere capacity planning so easy you don’t have to think about the multiple aspects of capacity and how long before you’ll run out of each.
To clarify, capacity management can be broken down into:
- Analyzing – focused on identifying immediate or longer-term capacity issues
- Optimizing – “right-sizing” virtual machines to maximize resource utilization and minimize waste
- Forecasting – traditional capacity planning that answers questions like “how much time before I run out of X resource?”
In this blog, we are focused on analyzing immediate and long-term capacity planning issues, which is done with forecasting, but we won’t be using the vC Ops “what-if scenario” feature (which I’ll cover in a future blog post).
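To make the forecasting question concrete, here is a minimal Python sketch of one common way to estimate “time remaining”: fit a straight line to recent daily usage and see when it crosses total capacity. This is an illustration under stated assumptions, not the algorithm vC Ops uses internally (which the article does not describe); the function names and sample figures are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(usage: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of usage against day index; returns (slope, intercept)."""
    n = len(usage)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(usage) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_remaining(usage: Sequence[float], capacity: float) -> Optional[float]:
    """Days after the last sample until the trend line reaches capacity.

    Returns None when usage is flat or falling (no projected exhaustion).
    """
    slope, intercept = fit_linear_trend(usage)
    if slope <= 0:
        return None
    day_full = (capacity - intercept) / slope
    # Clamp at zero: the trend may already be at or past capacity.
    return max(0.0, day_full - (len(usage) - 1))


# Hypothetical daily memory usage in GB for a host with 64 GB of capacity.
samples = [40.0, 44.0, 48.0, 52.0, 56.0]
print(days_remaining(samples, capacity=64.0))  # 2.0
```

Note how sensitive this kind of estimate is to the sample window: with only a few days of history, a short burst of growth dominates the slope and shortens the projected time remaining.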
Most vCenter Operations Manager capacity planning is done on either the Analysis or Planning tabs. You’ll find these tabs at just about every point of the vC Ops inventory tree.
As with any vC Ops data that you view, it’s all context-sensitive based on where you are located in the inventory tree. For example, you’ll receive different capacity analysis reports if you are at the World view vs an ESXi host or virtual machine.
From the vC Ops world view planning tab (shown in Figure 1, above), you can plan capacity for the vC Ops “world”, which can include multiple virtual infrastructures. In the case of my lab (Figure 1), we can see the trending and forecasting of the number of remaining clusters, hosts, and virtual machines that can be added to our world. In fact, in the case of this world, with all the recent activity, vC Ops is telling me that, at the current rate that I have been adding VMs, I only have 2 days left before I’m totally out of capacity for new VMs. Of course, these statistics are slightly misleading because they are based on vC Ops having a short-term view (I ended up having to reinstall a few days ago due to hardware failure) and because I’ve added many more VMs than normal over the past few days. Thus, while my world view is temporarily out of whack, this is an important starting point for capacity planning in your virtual infrastructure because it’s here that you can find information like:
- Average VM & host CPU effective demand
- Average VM & host CPU allocation
- Average VM & host memory effective demand
- Average VM & host memory allocation
- Average VM & datastore disk space total used
- Average VM & datastore disk space allocation
As you can see in Figure 2 below, all of these statistics are available for the last 2 days, last day, current day, next day, next week, next month, and next quarter.
In the case of my lab, vC Ops is predicting that I will run out of capacity for VMs in 2 days based on my increasing trend of host and VM memory utilization, as shown in Figures 2 and 3.
Figure 3 – World Level Resource Trending for Host Physical Resources
We can modify these reports by tweaking how they are aggregated (average by cluster, host, or VM) and modify the perspective (used, remaining, and capacity).
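To illustrate what changing the aggregation and perspective means in practice, here is a small Python sketch that averages the same per-host figures by different keys and from different perspectives. The host names, cluster names, and numbers are made up for illustration and are not exported from vC Ops.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-host memory figures in GB; not real vC Ops data.
HOSTS = [
    {"cluster": "prod", "host": "esx01", "capacity": 64.0, "used": 48.0},
    {"cluster": "prod", "host": "esx02", "capacity": 64.0, "used": 56.0},
    {"cluster": "lab", "host": "esx03", "capacity": 32.0, "used": 8.0},
]


def perspective_value(row, perspective):
    """Return the figure for one row under the chosen perspective."""
    if perspective == "used":
        return row["used"]
    if perspective == "capacity":
        return row["capacity"]
    if perspective == "remaining":
        return row["capacity"] - row["used"]
    raise ValueError(f"unknown perspective: {perspective}")


def average_by(rows, key, perspective):
    """Average the chosen perspective across rows that share the same key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(perspective_value(row, perspective))
    return {name: mean(values) for name, values in groups.items()}


print(average_by(HOSTS, "cluster", "remaining"))  # {'prod': 12.0, 'lab': 24.0}
print(average_by(HOSTS, "host", "used"))
```

The same underlying data tells a different story depending on the grouping: the “prod” cluster averages 12 GB remaining per host, even though one of its hosts has only 8 GB left.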
There is a ton of useful information here for single or multi-infrastructure capacity planning, and we are still only on the summary section of the planning tab.
If we move to the Views section of the planning tab, we get a long list of additional capacity planning information (shown in Figure 4), which can be filtered by vC Ops badge types – time remaining, capacity, stress, waste, and density.
Figure 4 – World Level Resource Planning With Report Views
With over 20 reports at the world view, you’ll be able to report on things like CPU, memory, disk space, disk IO read, disk IO write, network, and more. Each of these can be viewed in terms of total capacity, usable capacity, capacity remaining, and time remaining.
As you can see from Figure 4, I have just 8% memory capacity remaining, which leaves room for just 0.54 virtual machines (not even 1).
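A fractional VM count like this makes sense if you think of it as remaining capacity divided by the demand of an average VM. The Python sketch below shows that arithmetic. It assumes this simple ratio is what the report reflects (the article does not say how vC Ops derives the figure), and the capacity and demand numbers are hypothetical values chosen to land near the 8% and 0.54 seen in my lab.

```python
def remaining_vm_capacity(usable_gb, used_gb, avg_vm_demand_gb):
    """Estimate how many more average-sized VMs fit in the remaining memory."""
    if avg_vm_demand_gb <= 0:
        raise ValueError("average VM demand must be positive")
    remaining_gb = max(0.0, usable_gb - used_gb)
    return remaining_gb / avg_vm_demand_gb


def percent_remaining(usable_gb, used_gb):
    """Remaining capacity as a percentage of usable capacity."""
    return max(0.0, usable_gb - used_gb) / usable_gb * 100


# Hypothetical figures: 32 GB usable, 29.44 GB in use, 4.74 GB per average VM.
usable, used, avg_vm = 32.0, 29.44, 4.74
print(f"{percent_remaining(usable, used):.0f}% memory remaining")
print(f"{remaining_vm_capacity(usable, used, avg_vm):.2f} VMs remaining")
```

Anything below 1.0 means the next average-sized VM would not fit without reclaiming memory or adding capacity.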
Some capacity planning reports provide tabular data, as you see in Figure 4, while other reports provide trend graphs, as you see in Figure 5, below:
Figure 5 – Host Level Resource Planning With Trending Report View Graphs
Understanding capacity planning at the vCenter, virtual datacenter, cluster, host, and VM level is easy once you are familiar with the summary, views, and events sections of the planning tab, because the same tabs and sections are available at each of those levels with similar capacity planning information relevant to that level.
The datastore view likely offers more types of capacity planning data than any other view, as you can see in Figure 6, below.
Figure 6 – Datastore Level Resource Planning Views
Another unique section of vC Ops is found when you go to the events section, down at the host level. As you can see in Figure 7, the host level capacity planning view offers graphing of events in the virtual infrastructure as well as vC Ops risk, time, capacity, stress, efficiency, waste, and density events.
Figure 7 – Host Level Resource Planning Events
vCenter Operations Manager is a powerful capacity planning tool. Even if all you do is look at the planning tab, you can learn so much about current and trending capacity utilization across multiple datacenters, vCenters, clusters, hosts, VMs, and datastores. In future blog posts, we’ll look at the capacity analysis tab and the power of vC Ops what-if scenario capacity planning.
How has vC Ops helped you with capacity planning? Share your story in the comments below!