True Visibility with Custom Reports in vRealize Operations: Great Blog

I found a great blog from our friends at Blue Medora on custom reports in vRealize Operations. Definitely worth a read:

By Mike Langdon: Last month, VMware’s Cloud Management blog post showed you how to integrate discrete management packs with custom dashboards. Another key tool in the vRealize Operations utility belt is the reports system. While the core system and many management packs ship with ready-made reports, users can also create custom reports.


One way to exploit the vRealize Operations reports system is to create custom reports that meet the needs of stakeholders across your organization. Once designed, these reports can be automatically generated and delivered via email. In this use case, we will make a report for a database analyst (DBA) who wants a virtual and storage layer view of both database memory and disk. In this hypothetical environment we have installed two Blue Medora management packs:

Read the full blog here.

VMware Log Insight 2.5 design and deployment project: A Guest Blog

Recently, one of our bloggers, "E1" Rahabok, let me know about a great blog from one of our Log Insight users, Manny Sidhu. Here is an excerpt and a link to the full text:

I have recently had the good fortune of working on a large scale deployment of VMware’s vRealize Log Insight 2.5. The project included the design, deployment and some administration of the product. I thought I’d blog about my experience given the paucity of blog posts pertaining to the design and deployment of Log Insight. For brevity, I’ll hereafter call the product – LI.

First off, kudos to VMware on continuing the development of this software (after its acquisition) – an awesome piece of code! Highlights:

  • Makes troubleshooting LOTS easier. Problem host? Search for it in Interactive Analytics, choose the search time period, look at logs, fix problem.
  • No need to generate log bundles, VMware GSS webex in, run through the problem with you, look at Log Insight, troubleshoot, problem solved. Well, sometimes not so straightforward, but you get my drift.
  • A variety of Content Packs allow you to look through the logs of say your Vblocks or your storage arrays and narrow down to potential causes of problems.
  • Email alerting to warn you of potential issues if something’s happening over and over in a certain time period.
  • Centralized logging for everything from your hosts and vCenter Servers to your arrays and network devices.
  • A slick and very responsive interface.

Need I say more?! I highly recommend people download a trial, throw it on their labs at work or home and see the log ingestion magic for yourself.

Read the full blog here.

Interested in Log Insight? Try Lab 1401 below.

Log Insight HOL

Capacity Management in SDDC – Part 9

If you have landed on Part 9 directly, I'd recommend that you at least review from Part 5 first. If you want to review from the conceptual stage, then go to Part 1. This is a series of blog posts on Capacity Management in SDDC.

Network (all tiers)

To recap, we need to create the following:

  1. A line chart showing the maximum network dropped packets in the physical DC.
  2. A line chart showing the maximum and average ESXi vmnic utilization in the physical DC.

I use physical data center, not virtual data center. Can you guess why?

It's easier to manage the network per physical data center. Unless your network is stretched, problems do not span across data centers. Review this excellent article by Ivan, one of the networking industry's authorities.

The problem is how to choose the ESXi hosts from the same data center. It is possible for a physical data center to have multiple vCenter servers. On the other hand, it is also possible for the vRealize Operations World object, or even a single vCenter, to span multiple physical data centers. So you need to manually determine the right object, so that you get all the ESXi hosts in that physical data center. For example, if you have 1 vRealize Operations instance managing 2 physical data centers, you definitely cannot use the World object. It will span across both data centers.

The screenshot below shows the super metric formula to get the maximum network dropped packets at a vCenter data center object. Notice I use depth=3, as the data center object is 3 levels above the ESXi host object.

Drop Packet DC level

I did a preview of the super metric. As you can see above, it's a flat line of 0. That's what you should expect: no dropped packets at all from any host in your data center.

Dropped packets are much easier to track, as you expect 0 everywhere. Utilization is harder. If your ESXi host has a mix of 10G and 1G vmnics, generally speaking you would expect the 10G vmnics to dominate the data. This is where consistent design and standards matter. Without them, you need to apply a different formula for each ESXi host configuration.

Let's look at the Maximum first, then the Average. As I shared in this blog, you want to ensure that not a single vmnic is saturated. This means you need to track it at the vmnic level, not the ESXi host level. Tracking at the ESXi host level, as shown in the following screenshot, can hide the data at the vmnic level. Take an example. Your ESXi has 8 x 1 Gb NICs. You are seeing a throughput of 4 Gbps. At the ESXi host level, it's only 50% utilized. But that 4 Gbps is unlikely to be spread evenly. It is possible that one vmnic is saturated while the others are hardly utilized.

ESXi vmnic utilization
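To make the 8 x 1 Gb example concrete, here is a minimal Python sketch. The per-vmnic throughput numbers are made up for illustration, and the 95% saturation cut-off is my own assumption, not a vRealize Operations setting.

```python
# Illustrative only: the per-vmnic throughput values below are hypothetical.
LINK_SPEED_GBPS = 1.0          # each vmnic is a 1 Gb NIC
SATURATION_THRESHOLD = 0.95    # assumed cut-off for calling a vmnic saturated


def host_utilization(vmnic_gbps, link_speed_gbps=LINK_SPEED_GBPS):
    """Utilization as seen at the ESXi host level (total traffic / total capacity)."""
    return sum(vmnic_gbps) / (link_speed_gbps * len(vmnic_gbps))


def saturated_vmnics(vmnic_gbps, link_speed_gbps=LINK_SPEED_GBPS,
                     threshold=SATURATION_THRESHOLD):
    """Indexes of vmnics whose individual utilization reaches the threshold."""
    return [i for i, gbps in enumerate(vmnic_gbps)
            if gbps / link_speed_gbps >= threshold]


# 8 vmnics carrying 4 Gbps in total, unevenly spread.
throughput = [1.0, 1.0, 0.9, 0.6, 0.2, 0.2, 0.05, 0.05]
print(f"host level: {host_utilization(throughput):.0%}")
print(f"saturated vmnics: {saturated_vmnics(throughput)}")
```

With these numbers the host-level view reports about 50%, while the first two vmnics are already at line rate. That is exactly the situation a host-level chart hides.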

As I shared in this blog, the super metric formula you need to copy-paste is

Max ([
Max(${this, metric=net:vmnic0|received_average}),
Max(${this, metric=net:vmnic0|transmitted_average}),
Max(${this, metric=net:vmnic1|received_average}),
Max(${this, metric=net:vmnic1|transmitted_average}),
Max(${this, metric=net:vmnic2|received_average}),
Max(${this, metric=net:vmnic2|transmitted_average}),
Max(${this, metric=net:vmnic3|received_average}),
Max(${this, metric=net:vmnic3|transmitted_average})
]) * 8 / 1024

The above is based on 4 vmnic per ESXi. If you have 2x 10 Gb, then you just need vmnic0 and vmnic1. If you have 6 vmnic, then you have to add vmnic4 and vmnic5.
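If you want to sanity-check the arithmetic outside the product, the following Python sketch mirrors what the formula does. It assumes the received and transmitted metrics are reported in KBps, which is why the result is multiplied by 8 and divided by 1024 to get Mbps. The function name and the sample numbers are hypothetical.

```python
def max_vmnic_mbps(samples_kbps):
    """Return the busiest single direction of any vmnic, in Mbps.

    samples_kbps maps a vmnic name to a (received, transmitted) pair,
    both assumed to be in kilobytes per second.
    """
    peak_kbps = max(max(rx, tx) for rx, tx in samples_kbps.values())
    # KBps -> Kbps (x 8) -> Mbps (/ 1024), the same conversion as the super metric.
    return peak_kbps * 8 / 1024


# Hypothetical host with 4 vmnics; add or remove entries to match your hardware.
host = {
    "vmnic0": (1024, 512),
    "vmnic1": (128000, 64000),
    "vmnic2": (2048, 4096),
    "vmnic3": (0, 0),
}
print(max_vmnic_mbps(host))
```

Supporting a host with 2 or 6 vmnics only means changing the entries in the dictionary, just as you add or remove vmnic lines in the super metric.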

The above will give you the value per ESXi host. You then need to apply it per physical data center. Please review this blog post.

Ok, the above will get us the maximum. We then apply the same approach for average. The great thing about taking the average at individual vmnic is you do not have to worry about how many vmnics an ESXi host has. If you use the data at the ESXi Host level, as shown in the screenshot below, you need to divide the number by the number of vmnics.

ESXi vmnic utilization

Once you have the Maximum and Average, you want to ensure that the Maximum is not near your physical limit, and that the Average shows a healthy utilization. A number near the physical limit means you have a capacity risk. A number with low utilization means you over-provisioned the hardware.
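As a rough illustration of that check, here is a small Python sketch. The 80% and 10% thresholds are placeholders I picked for the example, and the function names are hypothetical; choose limits that fit your own environment.

```python
def average_vmnic_mbps(samples_kbps):
    """Average of every received/transmitted value across vmnics, in Mbps.

    samples_kbps maps a vmnic name to a (received, transmitted) pair,
    assumed to be in kilobytes per second.
    """
    values = [value for pair in samples_kbps.values() for value in pair]
    return sum(values) / len(values) * 8 / 1024


def assess(max_mbps, avg_mbps, link_mbps, high=0.8, low=0.1):
    """Return (capacity_risk, over_provisioned) flags.

    high and low are assumed thresholds, expressed as a fraction of link speed.
    """
    capacity_risk = max_mbps / link_mbps >= high
    over_provisioned = avg_mbps / link_mbps < low
    return capacity_risk, over_provisioned


samples = {"vmnic0": (1024, 1024), "vmnic1": (3072, 3072)}
avg = average_vmnic_mbps(samples)
print(avg, assess(max_mbps=950, avg_mbps=avg, link_mbps=1000))
```

Because the average is taken over the individual vmnic values, the number of vmnics per host never has to be hardcoded.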

BTW, there is 1 physical NIC that is not monitored in the above. Can you guess which one?

Yes, it's the iLO NIC. It does not show up as a vmnic. The good thing is that there is generally very little traffic there, and certainly no data traffic.

This entry concludes our series on Capacity Management in SDDC.

3 Reasons Why You Should Integrate Log Insight with vRealize Operations

In my last post, I demonstrated how easy it is to integrate Log Insight (LI) with vRealize Operations (vR Ops). In this post, I will talk about the value of integrating Log Insight with vR Ops.


Continue reading

Guided Remediation with vRealize Operations

Written by Chima Njaka, vRealize Operations Product Line Manager

Like many in the SF Bay Area, I’ve been caught up in the epic NBA Finals championship series between the local Golden State Warriors and the Cleveland Cavaliers. As of this writing, 3 exciting games of the best-of-seven series have been completed, with the first two going to overtime, and wins by both teams. Although the Warriors struck first with an early win, the Cavs have responded by winning the last two games, putting together a smothering defense that has kept the Warriors to their lowest scoring percentage all season, while staying just enough in front with their own offense.

Guided Remediation by Andre and Steph.

Although it is just a game, there is something that we can learn from watching the dynamics of this championship. That is: when faced with obstacles to progress, you must make adjustments to remediate the situation.  We’ve seen how the Cavs adjusted to the Warriors after game 1.  How should the Warriors now adjust?  No doubt, anxious Warriors fans have all sorts of recommendations.  But it is ultimately the team’s coach who is responsible for wisely guiding them back to victory.

A Coach for Your Data Center: Guided Remediation

So, what if you had a coach to assist you with your data center operations? What would your expectations be? Certainly, you’d hope it would draw on its experience, analytics capabilities, and insight to guide you in optimally executing your day-to-day plans and in responding to any issues with clear actions to change things for the better.

When you think about it, this sounds exactly like VMware’s vRealize Operations Manager! Not only does it offer best-in-class visibility into your heterogeneous virtual environment, including immediate health issues, potential capacity shortfalls, and opportunities to optimize resources, but it also takes things to the next level — guiding you to directly remediate identified operational issues from the same interface. This takes advantage of the new Actions Framework introduced with vROPs 6.0. In fact, as an expert “player” in your own environment, you can create your own Symptoms, Alerts, Recommendations, and associated Remediation Actions.

In a way, the Actions Framework and the Actions it provides for you are like your teammates on the court, ready to execute the coach’s (and your) plan. So, team up with vRealize Operations Manager and let its guided remediation capabilities help you tame your daily operational issues!

Oh yeah, and Go Warriors!



Capacity Management in SDDC - Part 7

In Part 5, I explained a new concept, where we use Contention as the basis of Capacity Management in SDDC. In Part 6, I provided the super metric equation for each chart. In this part, I will provide examples of the super metric formulas and dashboard screenshots.

Tier 1 (Highest)

To recap, we need to create line charts showing the following:

  1. The total number of vCPU left in the cluster.
  2. The total number of vRAM left in the cluster.
  3. The total number of VMs left in the cluster.
  4. The maximum and average storage latency experienced by any VM in the cluster.
  5. Disk capacity left in the datastore cluster.

The screenshot below shows the super metric formula to get the total number of vCPU left in the cluster.

Tier 1 - No of vCPU left in a cluster after HA

Copy-paste the formula below:

${this, metric=cpu|alloc|actual.capacity} *

( ${this, metric=summary|number_running_hosts} - 1 ) /
${this, metric=summary|number_running_hosts}
- ${this, metric=summary|number_running_vcpus}

In logic, the formula is Supply - Demand, where:

  • Supply = No of Physical Cores in Cluster x ((No of Hosts - 1) / No of Hosts)
  • Demand = No of running vCPU in cluster

I have to assume there is 1 HA host in the cluster. If you have 2, replace 1 with 2 in the formula above.
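Here is the same Supply - Demand logic as a short Python sketch, in case you want to verify the numbers by hand. The function name and the sample cluster are hypothetical. The ha_hosts parameter corresponds to the hardcoded 1 in the super metric.

```python
def vcpu_left(physical_cores, running_hosts, running_vcpus, ha_hosts=1):
    """vCPU left in a cluster after reserving ha_hosts hosts for HA.

    Supply = physical cores x ((hosts - HA hosts) / hosts)
    Demand = running vCPU in the cluster
    """
    supply = physical_cores * (running_hosts - ha_hosts) / running_hosts
    demand = running_vcpus
    return supply - demand


# Hypothetical cluster: 8 hosts x 16 cores, 100 vCPU running.
print(vcpu_left(physical_cores=128, running_hosts=8, running_vcpus=100))
print(vcpu_left(physical_cores=128, running_hosts=8, running_vcpus=100, ha_hosts=2))
```

A negative result, as in the second call, means the cluster is already beyond what it can supply once the HA hosts are set aside.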

I have to calculate the supply manually as vRealize Operations does not have a metric for No of Hosts - HA. Actually, it does, but the metric cannot be enabled.

If you find the formula complex, you can split it into 2 super metrics first. Work out Supply, then work out Demand. Let me use RAM as an example.

The screenshot below shows the super metric formula to get the total RAM supply. It is the total RAM in the cluster, after we take into account HA. I have to divide the number by 1024, then again by 1024, to convert from KB to GB.

Notice I always preview it. It's important to build the habit of always verifying that your formula is correct.

Tier 1 - Total physical RAM capacity in a cluster after HA

Once the Supply side is done, I worked on the Demand side. The following screenshot shows the demand.

Tier 1 - Total VM vRAM configured in a cluster

Once I verified that both are correct, it's a matter of combining them together.

Tier 1 - Total vRAM left in a cluster after HA

You can copy paste the formula below:

(
${this, metric=mem|alloc|actual.capacity} /1024 /1024 *
( ( ${this, metric=summary|number_running_hosts} - 1 ) /
${this, metric=summary|number_running_hosts} )
) -

Sum (${adapterkind=VMWARE, resourcekind=VirtualMachine, attribute=mem|guest_provisioned, depth=2}) /
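To double-check the RAM numbers outside the product, the calculation can be sketched in Python as below. I am assuming here that both the cluster capacity and the per-VM provisioned memory are in KB, so both sides are divided by 1024 twice to get GB. The function name and the sample values are hypothetical.

```python
KB_PER_GB = 1024 * 1024


def vram_left_gb(cluster_ram_kb, running_hosts, vm_vram_kb, ha_hosts=1):
    """vRAM left in a cluster, in GB, after reserving ha_hosts hosts for HA.

    cluster_ram_kb: total physical RAM in the cluster, assumed to be in KB.
    vm_vram_kb:     configured RAM of each VM in the cluster, assumed to be in KB.
    """
    supply_gb = cluster_ram_kb / KB_PER_GB * (running_hosts - ha_hosts) / running_hosts
    demand_gb = sum(vm_vram_kb) / KB_PER_GB
    return supply_gb - demand_gb


# Hypothetical cluster: 4 hosts x 256 GB, running 10 VMs with 64 GB each.
cluster_kb = 4 * 256 * KB_PER_GB
vms_kb = [64 * KB_PER_GB] * 10
print(vram_left_gb(cluster_kb, running_hosts=4, vm_vram_kb=vms_kb))
```

Working out supply and demand as two separate values, as in the function body, is the same habit as building and previewing two super metrics before combining them.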

The screenshot below shows the super metric formula to get the total number of VMs left in the cluster. I have to hardcode the maximum number of VMs that I allow.

No of VM left in the cluster

The screenshot below shows the super metric formula to get the Maximum latency of all the VMs in the cluster. I've chosen the Virtual Disk level, so it does not matter whether the storage is VMFS, NFS or VSAN.

To create the Average latency super metric, you just need to replace the string Max with Avg in the formula.

super metric - vDisk

You can copy paste the formula below:

Max ( ${adapterkind=VMWARE, resourcekind=VirtualMachine, attribute=virtualDisk|totalLatency, depth=2 } )

The screenshot below shows the super metric formula to get the total disk capacity left in the datastore cluster. This is based on Thin Provisioning consumption.

You can copy paste the formula below:

sum( ${adapterkind=VMWARE, resourcekind=Datastore, attribute=capacity|available_space, depth=1} )

For Thick Provision, use the following super metric:

super metric - Disk - space left in datastore cluster - thick

You can copy paste the formula below:

sum( ${adapterkind=VMWARE, resourcekind=Datastore, attribute=capacity|total_capacity, depth=1}
) -
sum( ${adapterkind=VMWARE, resourcekind=Datastore, attribute=capacity|consumer_provisioned, depth=1} )

Last but not least, do not forget to include a buffer for snapshots. This can be 20%, depending on your environment.
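The sketch below shows one way to combine the thin and thick calculations with a snapshot buffer in Python. How you apply the buffer is a design choice; here I assume it is reserved as a fraction of total datastore capacity, which is my interpretation rather than something the super metrics above do. All names and numbers are hypothetical.

```python
def disk_left_thin(available_gb, total_gb, snapshot_buffer=0.20):
    """Space left based on actual (thin) consumption, minus a snapshot reserve."""
    return sum(available_gb) - snapshot_buffer * sum(total_gb)


def disk_left_thick(total_gb, provisioned_gb, snapshot_buffer=0.20):
    """Space left based on provisioned (thick) size, minus a snapshot reserve."""
    return sum(total_gb) - sum(provisioned_gb) - snapshot_buffer * sum(total_gb)


# Hypothetical datastore cluster with two 1000 GB datastores.
total = [1000, 1000]
available = [400, 600]      # free space reported by each datastore
provisioned = [700, 800]    # space provisioned to VMs on each datastore

print(disk_left_thin(available, total))
print(disk_left_thick(total, provisioned))
```

The thick number is usually the smaller of the two, because it counts space that has been promised to VMs but not yet written.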

I hope you find the article useful for Capacity Management in SDDC. In part 8 (scheduled later this month), I will cover the super metrics for Tier 2 & 3, and for Network.

Using vRealize Operations Manager to Monitor the Cluster Nodes

In the past, I have written about the architecture of vRealize Operations Manager, which allows you to have Master and Replica nodes in a vROps cluster. This not only allows you to distribute the adapters or solutions across more than one collector, but also gives you resiliency in case the Master Node in the cluster fails.
In this post, I will share a failure I saw in my lab, caused by a couple of services failing on the Master Node. This resulted in a fail-over, and the Replica switched over to become the Master. All this is possible because, just like in previous releases, each and every service on your vROps nodes is being monitored by vROps itself.
Let me give you a brief description of my lab before I begin:
As you can see in the screenshot from my lab, I have a Master Node and a Replica Node. For the past few days, whenever I tried accessing my vROps product UI through the Master Node IP address, it gave a Page Cannot Be Displayed error. I immediately switched over to the other IP to see if I was able to get access. As per the product behaviour, I can access the solution through any node of the cluster, and I was able to do that without issues.
Today, I thought of looking at the issue with my Master Node and all I had to do was to click on a RED box on the recommendations page. Let me share that through a screenshot:
You can see that I have a red object on the Health Weather Map and, if I look down, I can immediately see an Alert for the Master Node about services being down, along with 2 recommendations. Let us see what the recommendations are by clicking on this Alert:
Here, you can see that the Node Processing and Collector services are down and hence we are getting 2 recommendations to resolve this issue. One is to take the node offline and then bring it back online. The other option is to visit VMware Support. We could have also reached this screen, or could have directly jumped on a screen which monitors the entire vROps cluster by clicking on:
Home -> Environment -> vRealize Operations Cluster 
Here we can expand the cluster and see all the nodes and services associated with each node. Let us see this in a screenshot:
Here you can look at all the services and their health individually. Instead of going through these services, we will try to follow the recommendation given by the tool to make the affected node offline and then back online. Let us go into the cluster management and see the current state of the cluster:
Click on Home -> Administration -> Cluster Management 
Now you can see that the Replica Node has become a Master Node and vice-versa. We can select the VROPS-M node here and bring it offline.
As soon as I tried to take the node offline, I got an error saying that the operation had failed and that I should contact VMware support. Since it's just a lab, I went ahead and restarted the VROPS-M node from the vCenter Server, and after a few minutes I was able to log in to the Master Node IP address. Once I logged in to the product UI, I could see that the resources had been distributed between both nodes and that data gathering had started to work again. One thing to notice is that after a failure, the Master and Replica switch roles.
Voila, the issue seems to be fixed and I have all the nodes in a working state, with objects and metrics shared between the nodes for faster collection. If you go to the main recommendations page from where it all started, you will notice that the box has now turned green and you are good to go, as the entire solution is up and running.
Hope this helps you in configuring your vROps clusters with confidence and a complete understanding of how the clusters are monitored and how the fail-over process works.

ACI Specialty Benefits: vSOM Customer Spotlight

We are proud to shine the spotlight this week on a VMware customer success story. Here’s proof that even smaller organizations can benefit greatly by upgrading from a naked vSphere environment to vSphere with Operations Management. This is especially true for a company like ACI Specialty Benefits in San Diego, an employee and student benefits company that is growing exponentially and needed its IT infrastructure to proactively support the business.

The ACI Specialty Benefits Story

Facing hyper-growth, ACI Specialty Benefits needed to ensure the company’s infrastructure was poised to proactively support the business. ACI moved to a virtualized data center and deployed vSphere with Operations Management in order to effectively load test, ensure adequate resources to handle demand, and onboard new customers more quickly and effectively. The company can now run applications at high service levels and maximize hardware savings through 40% higher capacity utilization and 50% higher consolidation ratios.

With the added visibility throughout its IT infrastructure, ACI has seen a 25% decrease in time spent on diagnostics and problem resolution. As a result of the operational efficiencies, the IT team is freed up to take on additional strategic initiatives.

“The first time looking at that single pane of glass was very surprising for us to see all of our different VMware infrastructure – at the data center, at headquarters, everywhere. Being able to see where we stood was a game changer.”

— Ryan Fay, Chief Information Officer, ACI Specialty Benefits

To hear the folks at ACI Specialty Benefits talk about how they use vSphere with Operations Management in their own words, check out the video, or download the info graphic here.

About ACI Specialty Benefits

Covering over seven million people, ACI Specialty Benefits is one of the nation’s top ten providers of Employee Assistance Programs. The company also offers premiere Workplace Wellness, Concierge, and Student Assistance programs. With a 95% customer retention rate, ACI is known for anticipating needs, exceeding expectations and providing customers with an unparalleled professional partnership.

Try the vSphere Optimization Assessment, and see what ACI sees

The vSphere Optimization Assessment (VOA) is a simple and powerful health check for today's virtual environments and addresses a number of key challenges for IT organizations. This free assessment is now available to customers on vmware.com as a 30-day program designed to enable them to:

  • Download and install the trial software of vRealize Operations – the predictive analytics engine in vSphere with Operations Management (vSOM)
  • Access 4 VOA Reports from within the installed product
  • Get advice from trained sales professionals

Download VOA


Management Pack for Storage Devices (now with VSAN): Beta

We're pleased to announce signups for the beta program for the vRealize Operations Management Pack for Storage Devices (MPSD), as reported on the VMware Storage Blog. This iteration of the program includes features to manage the latest version of Virtual SAN (VSAN). We're looking for folks who'd like to test out the latest iteration of our MPSD. The 6.0 version can be viewed here.

If you'd like to join this beta, sign up here: http://eepurl.com/bcln3r

About the  vRealize Operations Management Pack for Storage Devices (MPSD)

The vRealize Operations Management Pack for Storage Devices 6.0.1 provides visibility into your storage environment. Using Common Protocols you can collect performance and health data from the storage devices. Pre-defined dashboards allow you to follow the path from a VM to the storage volume and identify any problem that may exist along that path.

• End to End view of the data path through the SAN and NAS; from VM to Storage Volume
• Support for both NFS/iSCSI and FC/FCoE protocols
• Access to Storage devices leveraging standardized protocols; CIM, SMI-S, & VASA
• Ready to use dashboards for Health and Performance
• Analytics for common All Paths Down and PDL storage conditions
• This release has Beta support for Virtual SAN. (Sign up here: http://eepurl.com/bcln3r)

Requirements for the MPSD Beta

In order to participate in this beta, you need 1) a working knowledge of vRealize Operations, 2) an installation of, or a willingness to install, Virtual SAN, and 3) a non-trivial amount of storage to test this against.

About vRealize Operations Management

vRealize Operations Management is a cloud operations management system that delivers intelligent IT operations management from applications to storage, for vSphere, Hyper-V, Amazon and physical hardware, with predictive analytics and policy-based automation. It is built on a scale-out, resilient platform designed to deliver intelligent operational insights to simplify and automate management of applications and infrastructure across virtual, physical and cloud environments. For more information, see: https://www.vmware.com/products/vrealize-operations .

About Virtual SAN

VMware Virtual SAN is software-defined storage for VMware vSphere. By clustering server-attached hard disks and/or solid state drives (HDDs and/or SSDs), Virtual SAN creates a flash-optimized, highly resilient shared datastore designed for virtual environments.

IPAM Automation for Cloud

by Rich Bourdeau

vCAC-InfoBlox Logo

Many companies use InfoBlox IP Address Management (IPAM) to manage their IP addresses and DNS host records. InfoBlox has recently updated their vRealize Automation plug-in, which allows IP addresses and DNS configurations to be automatically assigned as part of the automated provisioning of a new machine or application. This integration has simplified and accelerated the end-to-end provisioning and lifecycle management of both infrastructure and applications.

What You Will Learn

  • Discover how VMware is the foundation for the Software Defined Enterprise
  • Learn how the consolidated management of VMware automates deployment of secure, scalable, high-performing multi-tier applications
  • Examine how the Infoblox VMware adapter can integrate with VMware automation and workflows to augment IPAM and DNS services while providing greater visibility of networking resources in your cloud environment
  • Watch a demonstration of Infoblox and VMware vRealize Automation to deploy servers in a cloud environment

Date: Wednesday June 17, 2015
Time: 11am PDT (2pm EDT)

Register Now

If you are unable to attend, click here to register for access to the archived recording.

Need help deploying your private cloud infrastructure or developing your business justification? Contact us and our experts can help your team build the business case and the solution that will maximize your IT productivity.

For exclusive content and updates, follow us on Twitter @vmwarecloudauto and subscribe to our VMware IT Management blog.