VMware Virtual SAN has received an amazing response from the virtualization community. Now, as more and more customers complete the acquisition and implementation processes, we are receiving more requests for operational guidance. Day 2 operations is perhaps my favorite topic to explore. Essentially, the questions asked can be summed up as “OK, I have done the research, proved the concept, and now have this great new product. Help me know the recommended practices to monitor, manage, and troubleshoot the inevitable issues that pop up with any software.” This question is the driver behind our new blog series, Operationalizing VMware Virtual SAN.
In this series, our aim is to take your most frequently asked questions around Virtual SAN Operations and provide detailed recommendations and guidance. In our first article in this series we look to answer the question “How do I configure vCenter Alarms for Virtual SAN?”
Virtual SAN vCenter Alarms
As you may have noticed, vCenter does not yet contain prepopulated alarms for VMware Virtual SAN. This is something that we are indeed looking into. In the interim, there is no need to forgo configuring alarms for Virtual SAN. By leveraging the VMware ESXi Observation Log (VOBD), we can quickly and easily create our own vCenter alarms for our Virtual SAN implementations.
Much appreciation to William Lam for getting the ball rolling on this topic. You should check out his article as well if you have not had the opportunity already: “Handy VSAN VOBs for Creating vCenter Alarms”
For more information on log files for VMware products, check out these two links:
- Location of log files for VMware products (1021806)
- Location of ESXi 5.1 and 5.5 log files (2032076)
VMware ESXi Observation Log (VOBD)
The VMware ESXi Observation Log (VOBD) contains system events observed by the VMkernel. These events are termed “observations” and can be of great assistance in troubleshooting. Located on each vSphere host in the /var/log/ directory and available via a web browser at https://[YOURESXhostIP]/host/vobd.log, the vobd.log file provides information for troubleshooting and detecting:
- Failed login attempts
- Network issues
- Performance issues
- Virtual SAN issues
- and much, much more
To begin creating our own Virtual SAN alarms within vCenter, we can explore the VMware ESXi Observation Log file (/var/log/vobd.log) and identify the VMware ESXi Observation ID (VOB ID) for the specific Virtual SAN event that we want to alert on. VMware ESXi Observation IDs (VOB IDs) appear in the pattern vob.component.event. After identifying the relevant IDs, we can create vCenter alarms to alert us when each specific VMware ESXi Observation ID (VOB ID) is detected.
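As a rough illustration of the exploration step, here is a minimal Python sketch that pulls Virtual SAN related observation IDs out of log lines and counts them. It assumes that each ID appears as a bracketed, dot-separated token beginning with "vob." or "esx."; the helper name count_vob_ids and the sample lines are made up for illustration, so check the pattern against your own vobd.log before relying on it.

```python
import re
from collections import Counter

# Assumption: an observation ID shows up as a bracketed, dot-separated token
# that begins with "vob." or "esx." somewhere on the log line.
VOB_ID = re.compile(r"\[((?:vob|esx)\.[A-Za-z0-9_.]+)\]")


def count_vob_ids(lines, keyword="vsan"):
    """Count observation IDs that contain `keyword` as one of their dot-separated parts."""
    counts = Counter()
    for line in lines:
        for vob_id in VOB_ID.findall(line):
            if keyword in vob_id.split("."):
                counts[vob_id] += 1
    return counts


# Made-up sample lines for illustration only; real log entries will differ.
sample = [
    "2014-03-10T17:55:25.489Z: [VsanCorrelator] 100us: [vob.vsan.pdl.offline] device offline",
    "2014-03-10T17:55:25.490Z: [VsanCorrelator] 101us: [esx.problem.vob.vsan.pdl.offline] device offline",
    "2014-03-10T17:56:00.000Z: [netCorrelator] 200us: [vob.net.vmnic.linkstate.down] vmnic1 down",
]

for vob_id, seen in sorted(count_vob_ids(sample).items()):
    print(vob_id, seen)
```

To try this against a real log, pass an open file instead of the sample list, for example a copy of /var/log/vobd.log taken from the host.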
VMware ESXi Observation IDs (VOB IDs) for Virtual SAN
To expedite the configuration process, we have compiled a list of VMware ESXi Observation IDs (VOB IDs) for Virtual SAN along with a description for each. You may still wish to become familiar with the vobd.log to identify any other Observation IDs that you might like to alert on for other purposes.
VMkernel Observation ID | Description |
esx.audit.vsan.clustering.enabled | Virtual SAN clustering service has been enabled. |
esx.clear.vob.vsan.pdl.online | Virtual SAN device has come online. |
esx.clear.vsan.clustering.enabled | Virtual SAN clustering services have now been enabled. |
esx.clear.vsan.vsan.network.available | Virtual SAN now has at least one active network configuration. |
esx.clear.vsan.vsan.vmknic.ready | A previously reported vmknic now has a valid IP. |
esx.problem.vob.vsan.lsom.componentthreshold | Virtual SAN Node: Near node component count limit. |
esx.problem.vob.vsan.lsom.diskerror | Virtual SAN device is under permanent error. |
esx.problem.vob.vsan.lsom.diskgrouplimit | Failed to create a new disk group. |
esx.problem.vob.vsan.lsom.disklimit | Failed to add disk to disk group. |
esx.problem.vob.vsan.pdl.offline | Virtual SAN device has gone offline. |
esx.problem.vsan.clustering.disabled | Virtual SAN clustering services have been disabled. |
esx.problem.vsan.lsom.congestionthreshold | Virtual SAN device Memory/SSD congestion has changed. |
esx.problem.vsan.net.not.ready | A vmknic added to Virtual SAN network configuration doesn’t have valid IP. Network is not ready. |
esx.problem.vsan.net.redundancy.lost | Virtual SAN doesn’t have any redundancy in its network configuration. |
esx.problem.vsan.net.redundancy.reduced | Virtual SAN is operating on reduced network redundancy. |
esx.problem.vsan.no.network.connectivity | Virtual SAN doesn’t have any networking configuration for use. |
esx.problem.vsan.vmknic.not.ready | A vmknic added to Virtual SAN network configuration doesn’t have valid IP. It will not be in use. |
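The IDs in the table above start with one of three prefixes: esx.audit, esx.clear, or esx.problem. When deciding which ones deserve an alarm, it can help to group them by that prefix. The Python sketch below does this for a handful of IDs copied from the table; the function name group_by_category is hypothetical, and treating the second dot-separated field as a "category" is our own reading of the naming pattern rather than a documented rule.

```python
from collections import defaultdict

# A few IDs copied from the table above; extend the list as needed.
VSAN_VOB_IDS = [
    "esx.audit.vsan.clustering.enabled",
    "esx.clear.vob.vsan.pdl.online",
    "esx.problem.vob.vsan.lsom.diskerror",
    "esx.problem.vob.vsan.pdl.offline",
    "esx.problem.vsan.net.redundancy.lost",
]


def group_by_category(vob_ids):
    """Group IDs by their second dot-separated field (audit, clear, problem)."""
    groups = defaultdict(list)
    for vob_id in vob_ids:
        parts = vob_id.split(".")
        # Assumption: IDs look like esx.<category>.<rest>; anything else is "unknown".
        category = parts[1] if len(parts) > 1 else "unknown"
        groups[category].append(vob_id)
    return dict(groups)


for category, ids in sorted(group_by_category(VSAN_VOB_IDS).items()):
    print(category)
    for vob_id in ids:
        print("  " + vob_id)
```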
Creating Virtual SAN vCenter Server Alarms
Now that we have identified the list of VMware ESXi Observation IDs (VOB IDs) we would like to alert on, we simply create the corresponding vCenter Alarms.
1. Log in to the vSphere Web Client.
2. Navigate to the Hosts and Clusters view and select the vCenter Server object at the top of the tree.
3. Go to the Manage tab and select Alarm Definitions, then click on the (+) sign to create a new vCenter Server Alarm.
4. Provide a name for the Alarm (e.g. Virtual SAN Disk Error), choose Hosts as the object to monitor, and select the “specific event occurring on this object” option. Click Next.
5. Add a VMkernel Observation ID for the particular event to monitor (e.g. esx.problem.vob.vsan.lsom.diskerror to monitor disk errors). Leave the conditions argument blank. Click Next.
6. Specify a particular action to take when the alarm is triggered (e.g. send a notification email) and choose the frequency for the action to be repeated. Click Finish.
Note: In order to use and customize the action frequency for the send notification email action, vCenter Server mail settings have to be configured in advance.
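Before clicking through the wizard many times, it may be useful to write down the alarms you intend to create as plain data. The Python sketch below records the choices made in steps 4 through 6 for each alarm. It does not talk to vCenter at all; the function alarm_definition and its field names are hypothetical and exist only to keep the plan consistent, and the second alarm name is an example of our own.

```python
def alarm_definition(name, vob_id, action="Send a notification email"):
    """Record the wizard choices for one alarm as a plain dictionary."""
    return {
        "name": name,                                          # step 4: alarm name
        "monitor": "Hosts",                                    # step 4: object to monitor
        "trigger": "specific event occurring on this object",  # step 4: trigger option
        "event_id": vob_id,                                    # step 5: VMkernel Observation ID
        "conditions": [],                                      # step 5: conditions left blank
        "action": action,                                      # step 6: action when triggered
    }


# Example plan; the names are illustrative, the IDs come from the table earlier.
plan = [
    alarm_definition("Virtual SAN Disk Error", "esx.problem.vob.vsan.lsom.diskerror"),
    alarm_definition("Virtual SAN Device Offline", "esx.problem.vob.vsan.pdl.offline"),
]

for alarm in plan:
    print(alarm["name"] + " -> " + alarm["event_id"])
```

A list like this doubles as a checklist while working through the steps by hand.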
In the next article in our “Operationalizing VMware Virtual SAN” blog series, we will automate the manual workflow to expedite the configuration of vCenter alarms for multiple VMware ESXi Observation IDs. We will also cover automation options for backing up these custom alarms and reporting on alarm configurations, and we will demonstrate a quick and easy migration of alarm configurations from one vCenter to another.
Many thanks to William Lam (@vGhetto), Christian Dickmann (@cdickmann), Rawlinson Rivera (@PunchingClouds), and Ken Werneburg (@vmKen) for their much appreciated interest and contribution to this series.