By: Greg Hohertz, Blue Medora
vRealize Operations Manager v6.x architecture differs significantly from vCenter Operations 5.x. The biggest change comes in vRealize Operations v6’s new scale-out architecture. A vRealize Operations Manager implementation can now be scaled up to eight cluster nodes and 50 remote collectors in 6.0.x and 16 cluster nodes and 50 remote collectors in 6.1.x. This significantly improves on vCenter Operations, allowing an implementation to scale to support 120,000 objects and 30,000,000 metrics. The vRealize architecture also brings improved redundancy.
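As a quick illustration of these limits, here is a minimal Python sketch that checks a planned topology against the node counts quoted above. The names `ClusterLimits`, `LIMITS`, and `fits_cluster_limits` are hypothetical helpers invented for this example; they are not part of any VMware tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterLimits:
    max_cluster_nodes: int
    max_remote_collectors: int


# Limits as quoted in this article, keyed by release family (hypothetical table).
LIMITS = {
    "6.0": ClusterLimits(max_cluster_nodes=8, max_remote_collectors=50),
    "6.1": ClusterLimits(max_cluster_nodes=16, max_remote_collectors=50),
}


def fits_cluster_limits(version: str, cluster_nodes: int, remote_collectors: int) -> bool:
    """Return True if the planned topology is within the quoted limits."""
    family = ".".join(version.split(".")[:2])  # "6.1.0" -> "6.1"
    limits = LIMITS[family]  # raises KeyError for releases not listed above
    return (
        1 <= cluster_nodes <= limits.max_cluster_nodes
        and 0 <= remote_collectors <= limits.max_remote_collectors
    )


print(fits_cluster_limits("6.0.1", 12, 10))  # False: 6.0.x stops at eight nodes
print(fits_cluster_limits("6.1.0", 12, 10))  # True
```

A check like this only confirms that a design stays inside the documented maximums; it says nothing about whether the nodes are sized correctly for the load.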
First, let’s take a look at what a vRealize Operations node now looks like. In Figure 1 below, we can see that a vRealize Operations node contains all of the functions needed to operate a vRealize environment: UI, Collectors, Controller, Analytics, and Persistence. This means that you can deploy a single node and have a fully functional vRealize environment. Preferably, you would opt to deploy more than one node for the purpose of redundancy and performance.
Figure 1 – A vRealize Operations node
The Persistence layer, a key component of the new architecture, allows for greater capacity and performance and also creates the option to scale out. Inside this layer, VMware has implemented an in-memory multi-node database called Pivotal GemFire. In vRealize Operations 6.1, this function is now served by Apache Cassandra, which enables the expansion of the cluster size from the previous eight nodes in vROps 6.0 to 16 nodes. Because this database is held in memory, understanding the capacity requirements of your vRealize environment is critical: you must have sufficient memory to host the in-memory database. In a multi-node cluster, the architecture would look like Figure 2 below.
Figure 2 – The architecture of a multi-node vRealize Operations cluster
VMware’s Knowledge Base article, vRealize Operations Manager 6.1 Sizing Guidelines, contains a sizing worksheet that is useful for planning your vRealize environment. If you have not yet upgraded to v6.1, we recommend doing so as soon as possible. This release contains a number of critical fixes as well as improved performance and capacity. VMware’s sizing guide for v6.1 can be found here.
Blue Medora’s updated version of the sizing worksheet includes Cisco UCS, Citrix XenDesktop, and NetApp management packs, and can be found here.
This worksheet is a great start for sizing a vRealize environment. The key table to understanding capacity is documented in the Knowledge Base article as well as the first tab of the sizing worksheet, as seen in Figure 3.
Figure 3 – The VMware vRealize environment sizing worksheet
vRealize Operations Manager requires that all nodes of the cluster be the same size. With Apache Cassandra providing access to in-memory data across multiple nodes, we can gain performance and redundancy improvements by taking advantage of a multiple-node deployment.
To begin, open the sizing spreadsheet and jump to the Advanced tab. Here we will enter the various resources we want to manage with vRealize Operations. Start by selecting your high availability (HA) and data retention settings. The defaults are HA DISABLED and six months of data retention. Next, enter your estimates for your vCenter objects – virtual centers, datacenters, clusters, virtual machines, hosts, and datastores. Do your best to project at least six months of growth in your estimates.
Now enter the resources and metrics from any other management packs you plan to install. Use an estimate when you do not know a value, or attempt to ascertain the value from a native tool (console, management interface, etc.).
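The arithmetic behind these entries can be sketched in a few lines of Python. This is only an illustration of summing per-pack estimates and applying a growth projection, not a reproduction of the worksheet's formulas. The names `PackEstimate`, `project`, and `worksheet_totals` are hypothetical, the compound monthly growth model is an assumption, and the object and metric counts are placeholder values rather than measurements.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PackEstimate:
    name: str
    objects: int   # estimated number of monitored objects
    metrics: int   # estimated number of collected metrics


def project(value: int, monthly_growth: float, months: int = 6) -> int:
    """Compound a current estimate forward, rounding up to a whole count."""
    return math.ceil(value * (1 + monthly_growth) ** months)


def worksheet_totals(packs, monthly_growth: float = 0.0, months: int = 6):
    """Return (total objects, total metrics) after projecting each pack."""
    total_objects = sum(project(p.objects, monthly_growth, months) for p in packs)
    total_metrics = sum(project(p.metrics, monthly_growth, months) for p in packs)
    return total_objects, total_metrics


# Placeholder estimates for illustration only.
packs = [
    PackEstimate("vCenter", objects=2_000, metrics=400_000),
    PackEstimate("NetApp Storage", objects=500, metrics=60_000),
]

print(worksheet_totals(packs))                       # today's totals
print(worksheet_totals(packs, monthly_growth=0.02))  # six months out at 2% per month
```

Projecting each pack separately and rounding up errs slightly on the high side, which is the safer direction for a capacity estimate.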
Figure 5 – Estimated sizing for our environment’s Management Pack for NetApp Storage
Once you have completed the entries for all management packs you intend to deploy, the sizing worksheet will provide an analysis at the bottom of the Advanced tab.
Figure 6 – Analysis provided in the vROps sizing guide
In our example worksheet, the indication is that a single medium or large node will handle our load. Keep in mind the Total # of Objects and Total # of Selected Metrics values and compare these against the first tab. Note that the capacity numbers on the Overall Scaling tab are architectural maximums and not goals. If our worksheet brought us closer to 6,700 objects, we would want to seriously consider using a large node or – even better – two medium nodes.
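The idea of treating the published maximums as ceilings rather than targets can be expressed as a simple headroom check. The sketch below is a hypothetical helper, not logic taken from the worksheet: the function name `headroom_status`, the 80% warning threshold, and the per-node maximums in the example calls are all assumptions. Substitute the real per-node figures from the Overall Scaling tab.

```python
def headroom_status(
    total_objects: int,
    total_metrics: int,
    node_max_objects: int,
    node_max_metrics: int,
    warn_at: float = 0.8,  # assumed threshold; pick one that suits your environment
) -> str:
    """Classify a load against one node's maximums using the tighter of the two ratios."""
    object_use = total_objects / node_max_objects
    metric_use = total_metrics / node_max_metrics
    worst = max(object_use, metric_use)
    if worst > 1.0:
        return "exceeds"      # the node cannot hold this load
    if worst >= warn_at:
        return "near-limit"   # consider a larger node or an additional node
    return "ok"


# Placeholder maximums (10,000 objects, 4,000,000 metrics) for illustration only.
print(headroom_status(5_000, 1_000_000, 10_000, 4_000_000))   # ok
print(headroom_status(9_000, 1_000_000, 10_000, 4_000_000))   # near-limit
print(headroom_status(11_000, 1_000_000, 10_000, 4_000_000))  # exceeds
```

Using the worse of the object and metric ratios keeps the check conservative, since either limit can be the one that constrains the node.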
Following these guidelines, you can rest assured that your vRealize Operations Manager cluster will have sufficient resources to store all of the metrics you intend to collect, as well as provide the real-time analytics and visualizations that make vRealize Operations so valuable. Revisit this spreadsheet periodically to perform a ‘reality check,’ especially if you notice sluggish performance in vROps.