By Carl Olafson
vRealize Operations Manager v6.x is a completely redesigned operations management tool. From an architectural standpoint, it is vastly superior to vCenter Operations Manager, which was a two-VM vApp that could only scale up. vRealize Operations Manager v6.x is built on GemFire cluster technology, so it can also scale out for additional capacity. In addition, the Advanced and Enterprise editions allow vRealize Operations Manager High Availability (not to be confused with vSphere HA) to be enabled for fault tolerance. The remainder of this article covers some key concepts and architecture terminology.
Cluster Technology and Scale-Up/Scale-Out Capacity
As mentioned, GemFire is a cluster technology. The cluster node limit is 8 in vRealize Operations Manager v6.0.x and 16 in v6.1.x, which gives a scale-out capacity of 8 or 16 nodes depending on version. In addition, each node/VM has scale-up capacity ranging from 4 vCPUs/16 GB vRAM (small) to 16 vCPUs/48 GB vRAM (large). From a best practices standpoint, this brings up a few items that must be adhered to:
- For a multi-node cluster, all nodes must be the same scale-up size (small, medium or large). GemFire assumes all nodes are equal and distributes load across the cluster equally, so performance problems will occur if the cluster contains nodes of different sizes. You can adjust node size after the initial implementation as your environment grows.
- For a multi-node cluster, all nodes must have Layer 2 (L2) adjacency, because GemFire cluster technology is latency-sensitive. From a VMware supportability standpoint, placing the nodes of a cluster across a WAN or Metro Cluster is not supported.
- Proper sizing of the cluster and utilization of Remote Collectors is key to a successful implementation. The next article will cover this in detail.
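The node limits and the same-size rule above can be expressed as a simple design check. The Python sketch below is illustrative only: the function name `validate_cluster`, the `MAX_CLUSTER_NODES` table and the size labels are assumptions made for this example and are not part of any vRealize Operations Manager API. It does not check L2 adjacency, which has to be verified on the network itself.

```python
# Hypothetical design check; names are illustrative, not a VMware API.
MAX_CLUSTER_NODES = {"6.0": 8, "6.1": 16}  # per-version limits quoted in this article
VALID_SIZES = {"small", "medium", "large"}


def validate_cluster(version, node_sizes):
    """Return a list of problems with a proposed cluster design (empty if none)."""
    problems = []

    # Rule 1: the node count must not exceed the limit for the version.
    limit = MAX_CLUSTER_NODES.get(version)
    if limit is None:
        problems.append(f"unknown version: {version}")
    elif len(node_sizes) > limit:
        problems.append(
            f"{len(node_sizes)} nodes exceeds the limit of {limit} for v{version}"
        )

    # Rule 2: every node must use a recognized size label.
    unknown = sorted(set(node_sizes) - VALID_SIZES)
    if unknown:
        problems.append(f"unknown node sizes: {unknown}")

    # Rule 3: all nodes in the cluster must be the same size.
    if len(set(node_sizes)) > 1:
        problems.append("all nodes must be the same size")

    return problems
```

For example, `validate_cluster("6.0", ["large"] * 9)` reports that nine nodes exceed the v6.0 limit, while `validate_cluster("6.1", ["large"] * 16)` returns an empty list.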
For vRealize Operations Manager there are two primary types of nodes: cluster nodes and remote collectors.
The cluster nodes participate in the vRealize Operations Manager cluster. There are three distinct sub-types.
- Master node, which is the first node assigned to the cluster. The master node is also responsible for managing all the other nodes in the cluster.
- Data nodes, which would make up the remaining nodes of a non-HA cluster.
- Replica node, which is a backup to the master node should the master node fail. This assumes vRealize Operations Manager HA is enabled.
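The three sub-types can be pictured as a simple mapping from node to role. The sketch below is a simplified model, not how the product assigns roles: the function name `assign_roles` is hypothetical, and treating the second node in the list as the replica is an assumption made only to keep the example short.

```python
# Simplified, hypothetical model of cluster node roles; not a VMware API.
def assign_roles(node_names, ha_enabled=False):
    """Map each node name to 'master', 'replica' or 'data'.

    Assumption for this sketch: the first node is the master and, when HA
    is enabled, the second node in the list acts as the replica.
    """
    if not node_names:
        raise ValueError("a cluster needs at least one node")
    if ha_enabled and len(node_names) < 2:
        raise ValueError("HA needs a second node to act as the replica")

    roles = {}
    for index, name in enumerate(node_names):
        if index == 0:
            roles[name] = "master"    # first node assigned to the cluster
        elif index == 1 and ha_enabled:
            roles[name] = "replica"   # backup to the master when HA is on
        else:
            roles[name] = "data"      # remaining nodes
    return roles
```

With HA disabled, a three-node list yields one master and two data nodes; with HA enabled, it yields one master, one replica and one data node.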
Examples of vRealize Operations Manager cluster architectures.
Remote collectors do not participate in the vRealize Operations Manager cluster analytic process. However, the remote collector is an important node when you have a multi-site implementation or are using specific management packs that cannot be assigned to a cluster node. The remote collector only contains the Admin UI and the REST API component that allows it to talk to the vRealize Operations Manager cluster.
Your cluster is limited to 8 or 16 nodes (depending on version), and this determines your overall object collection capacity. However, you can add remote collectors on top of that: up to 30 in version 6.0 and up to 50 in version 6.1. The remote collector’s object count applies against the cluster, but does not diminish the size or number of cluster nodes. With the release of vRealize Operations Manager v6.1, remote collectors can also be clustered, and an emerging best practice is to move all management packs/adapters to clustered remote collectors. This reduces the load on the analytics cluster and, combined with remote collector clustering, provides a higher level of fault tolerance and efficiency.
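The point that a remote collector's objects still count against the cluster can be shown with a little arithmetic. The sketch below is a hypothetical planning helper: `check_collection_plan` and its parameters are invented for this example, and the cluster's object capacity is an input you would take from sizing guidance, since this article does not give capacity figures.

```python
# Hypothetical planning helper; not part of any vRealize Operations Manager API.
MAX_REMOTE_COLLECTORS = {"6.0": 30, "6.1": 50}  # limits quoted in this article


def check_collection_plan(version, cluster_object_capacity,
                          objects_on_cluster_nodes, objects_per_remote_collector):
    """Summarize whether a plan fits the collector limit and cluster capacity.

    cluster_object_capacity must come from your own sizing exercise;
    this sketch does not know real capacity figures.
    """
    collector_limit = MAX_REMOTE_COLLECTORS[version]
    # Objects gathered by remote collectors still count against the cluster.
    total_objects = objects_on_cluster_nodes + sum(objects_per_remote_collector)
    return {
        "total_objects": total_objects,
        "collectors_within_limit":
            len(objects_per_remote_collector) <= collector_limit,
        "objects_within_capacity": total_objects <= cluster_object_capacity,
    }
```

As a made-up example, a plan with 4,000 objects collected by cluster nodes and two remote collectors gathering 3,000 and 2,500 objects totals 9,500 objects, all of which must fit within the cluster's capacity.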
The remote collector is an important design consideration if you are using management packs (like MPSD) or have vCenters across a WAN/Metro Cluster. If your vRealize Operations Manager cluster is going to collect from multiple vCenters over a WAN or utilize management packs, consult a qualified SME on your design for cluster nodes, remote collectors and level of fault tolerance. VMware Professional Services (PSO) provides vRealize Operations services ranging from Architecture to Operational Transformation.
A load balancer is another important design consideration for a multi-node cluster. vRealize Operations Manager v6.x does not currently come with a load balancer, but it can use any third-party stateful load balancer. A load balancer keeps UI traffic properly balanced across the cluster for performance, and it also simplifies access for users: instead of accessing each node individually, users need only one URL to reach the entire cluster and do not have to be concerned with which node is available.
Carl Olafson is a VMware Technical Account Manager based out of California.