
VMware Virtual SAN Alarms for vCenter Server with PowerCLI

I was recently involved in a couple of customer conversations where the main topics were monitoring and troubleshooting events in vCenter, particularly for Virtual SAN.

I know that this topic has been covered a few times in the past, not only on the VMware corporate storage blog but also on other community blogs. To be more specific, one of the VSAN Champions, William Lam, has covered this topic extensively on his personal blog.

The work that we have done on vCenter Server Alarms and Virtual SAN stems from the findings in two articles published by William. For more information on the recommended vCenter Server Alarms for Virtual SAN and how to add and configure them, take a look at the articles listed below:

With vSphere 6.0 and Virtual SAN 6.0 nearing general availability, this script can make things a lot easier for all Virtual SAN customers and provide a simplified way to get all the available vCenter Server alarms for Virtual SAN added and configured within seconds.

I got a chance to work on this little nugget with one of the baddest PowerCLI gurus on the planet and fellow VSAN Champion, Alan Renouf, as well as William Lam, both of whom are members of the VMware virtualization team codenamed #TheWreckingCrew. Here is PowerCLI sample code that can be used to add and configure all of the vCenter Server Alarms for Virtual SAN. These alarms are applicable to both Virtual SAN 5.5 and 6.0.

Here is the sample code. Make sure to update the credentials and the cluster name to match your environment.

# Connect to vCenter Server (update the server name and credentials for your environment)
$vc = Connect-VIServer -Server vcsa-vmware.local -User [email protected] -Password vmware

# Define the VSAN cluster where you would like the alarms created
$Cluster = "vsan-cluster01"

# Map of event type ID => alarm name/description
$VSANAlerts = @{
    "esx.audit.vsan.clustering.enabled" = "Virtual SAN clustering service had been enabled"
    "esx.clear.vob.vsan.pdl.online" = "Virtual SAN device has come online."
    "esx.clear.vsan.clustering.enabled" = "Virtual SAN clustering services have now been enabled."
    "esx.clear.vsan.vsan.network.available" = "Virtual SAN now has at least one active network configuration."
    "esx.clear.vsan.vsan.vmknic.ready" = "A previously reported vmknic now has a valid IP."
    "esx.problem.vob.vsan.lsom.componentthreshold" = "Virtual SAN Node: Near node component count limit."
    "esx.problem.vob.vsan.lsom.diskerror" = "Virtual SAN device is under permanent error."
    "esx.problem.vob.vsan.lsom.diskgrouplimit" = "Failed to create a new disk group."
    "esx.problem.vob.vsan.lsom.disklimit" = "Failed to add disk to disk group."
    "esx.problem.vob.vsan.pdl.offline" = "Virtual SAN device has gone offline."
    "esx.problem.vsan.clustering.disabled" = "Virtual SAN clustering services have been disabled."
    "esx.problem.vsan.lsom.congestionthreshold" = "Virtual SAN device Memory/SSD congestion has changed."
    "esx.problem.vsan.net.not.ready" = "A vmknic added to Virtual SAN network config doesn't have valid IP."
    "esx.problem.vsan.net.redundancy.lost" = "Virtual SAN doesn't have any redundancy in its network configuration."
    "esx.problem.vsan.net.redundancy.reduced" = "Virtual SAN is operating on reduced network redundancy."
    "esx.problem.vsan.no.network.connectivity" = "Virtual SAN doesn't have any networking configuration for use."
    "esx.audit.vsan.net.vnic.added" = "Virtual SAN NIC has been added."
    "esx.audit.vsan.net.vnic.deleted" = "Virtual SAN NIC has been deleted."
    "esx.problem.vob.vsan.dom.lsefixed" = "Virtual SAN detected and fixed a medium error on disk."
    "esx.problem.vob.vsan.dom.nospaceduringresync" = "Resync encountered no space error."
    "esx.problem.vsan.dom.init.failed.status" = "Virtual SAN Distributed Object Manager failed to initialize."
    "esx.problem.vob.vsan.lsom.disklimit2" = "Failed to add disk to disk group in VSAN 6.0."
    "vprob.vob.vsan.pdl.offline" = "Virtual SAN device has gone offline in VSAN 6.0."
}

# Get the AlarmManager and the cluster object the alarms will be attached to
$alarmMgr = Get-View AlarmManager
$entity = Get-Cluster $Cluster | Get-View

$VSANAlerts.Keys | ForEach-Object {
    $Name = $VSANAlerts[$_]   # alarm name and description
    $Value = $_               # event type ID that triggers the alarm

    # Create the Alarm Spec
    $alarm = New-Object VMware.Vim.AlarmSpec
    $alarm.Name = $Name
    $alarm.Description = $Name
    $alarm.Enabled = $TRUE

    # Trigger on the given event type ID for hosts and set the status to red
    $expression = New-Object VMware.Vim.EventAlarmExpression
    $expression.EventType = $null
    $expression.eventTypeId = $Value
    $expression.objectType = "HostSystem"
    $expression.status = "red"

    $alarm.expression = New-Object VMware.Vim.OrAlarmExpression
    $alarm.expression.expression += $expression

    $alarm.setting = New-Object VMware.Vim.AlarmSetting
    $alarm.setting.reportingFrequency = 0
    $alarm.setting.toleranceRange = 0

    # Create the Alarm
    Write-Host "Creating Alarm on $Cluster for $Name"
    $CreatedAlarm = $alarmMgr.CreateAlarm($entity.MoRef, $alarm)
}

Write-Host "All Alarms Added to $Cluster"
Disconnect-VIServer -Server $vc -Confirm:$false
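
One thing to keep in mind when running the script more than once: the CreateAlarm call is expected to fail with a duplicate name fault if an alarm with the same name already exists in vCenter. Below is a minimal sketch of one way to skip alarms that are already present. The helper name Select-MissingAlarm and its parameters are hypothetical and are not part of the original script. The sample data is made up so that the filtering logic can be tried without a vCenter connection.

```powershell
# Hypothetical helper (not part of the original script): returns only the
# entries of $Alerts whose alarm name is not already in $ExistingNames.
function Select-MissingAlarm {
    param(
        [Parameter(Mandatory)] [hashtable] $Alerts,
        [string[]] $ExistingNames = @()
    )
    $missing = @{}
    foreach ($eventId in $Alerts.Keys) {
        $name = $Alerts[$eventId]
        # -notcontains compares strings case-insensitively
        if ($ExistingNames -notcontains $name) {
            $missing[$eventId] = $name
        }
    }
    return $missing
}

# Offline example with made-up existing names, no vCenter connection required.
$sample = @{
    "esx.problem.vob.vsan.pdl.offline" = "Virtual SAN device has gone offline."
    "esx.clear.vob.vsan.pdl.online"    = "Virtual SAN device has come online."
}
$existing = @("Virtual SAN device has gone offline.")
$toCreate = Select-MissingAlarm -Alerts $sample -ExistingNames $existing
$toCreate.Keys   # only esx.clear.vob.vsan.pdl.online remains

# Against a live vCenter (assumption: placed before the ForEach loop of the script):
# $existing   = Get-AlarmDefinition | Select-Object -ExpandProperty Name
# $VSANAlerts = Select-MissingAlarm -Alerts $VSANAlerts -ExistingNames $existing
```

With the two commented lines enabled in the script, only the alarms that are not yet defined would be passed to CreateAlarm, so a second run should simply create nothing instead of stopping on the first duplicate.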

– Enjoy

For future updates on Virtual SAN (VSAN), vSphere Virtual Volumes (VVOLs), and other Software-defined Storage technologies, as well as vSphere + OpenStack, be sure to follow me on Twitter: @PunchingClouds