I recently learnt about a built-in safety mechanism on the vSphere Storage Appliance (VSA) that kicks in when a host is rebooted too many times. A VSA server will silently enter maintenance mode if it is rebooted 3 times in 15 minutes. If a member is in maintenance mode, it does not join the cluster. If the cluster is in maintenance mode, it does not provide storage. So if 2 VSA nodes end up in maintenance mode, the cluster will go offline.
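To make the rule concrete, here is a small Python sketch of a "3 reboots in 15 minutes" check. This is purely illustrative and is not the VSA's actual implementation: the function name, the constants, and the choice to treat a span of exactly 15 minutes as inside the window are all my own assumptions.

```python
from datetime import datetime, timedelta

# Values taken from the rule described above; the names are hypothetical.
REBOOT_LIMIT = 3
WINDOW = timedelta(minutes=15)


def trips_safety_rule(reboot_times):
    """Return True if any REBOOT_LIMIT reboots fall within one WINDOW.

    Assumption: a span of exactly 15 minutes counts as inside the window.
    """
    times = sorted(reboot_times)
    # Compare each reboot with the one REBOOT_LIMIT - 1 positions later.
    for i in range(len(times) - REBOOT_LIMIT + 1):
        if times[i + REBOOT_LIMIT - 1] - times[i] <= WINDOW:
            return True
    return False
```

For example, reboots at 12:00, 12:05 and 12:10 would trip the rule in this sketch, while reboots at 12:00, 12:10 and 12:20 would not.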
Right now, the only way to check whether a VSA member has entered this silent maintenance mode is via the CLI. The WSCLI utility is installed on the vCenter server managing the VSA cluster. The easiest way to check is to use the getSvaServerInfo option of WSCLI as follows:
cd "C:\Program Files\VMware\Infrastructure\VSA Manager\tools"
"C:\Program Files\VMware\Infrastructure\jre\bin\java.exe" -jar WSCLI.jar <VSA IP Address> getSvaServerInfo
This should return an output similar to the following:
id = xxxxxx-xxxx-xxxx-xxxxxxxxx
name = localhost
maintenance mode = true
Domain name = localdom
Storage Cluster ID = yyyyyyyy-yyyy-yyyy-yyyyyyyyyy
No DNS server
Internal interface = 192.168.4.1/24
Management interface = A.B.C.D/22
Gateway = A.B.C.253
Storage pool 0:
ID = 00000000-0000-0000-0000-000000000000
Total storage = 19595264KB
Free storage = 0KB
Used storage = 19595264KB
Free storage = 0KB
Total storage = 19595264KB
Used storage = 19595264KB
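If you would rather not eyeball the output, something like the following Python sketch could pick out the maintenance mode line. It assumes the output keeps the "key = value" layout shown above; the function names are my own and are not part of WSCLI.

```python
def parse_server_info(output):
    """Collect 'key = value' lines from getSvaServerInfo output into a dict."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without '=' such as "No DNS server"
            # Keep the first occurrence, since the storage keys repeat
            # later in the output.
            info.setdefault(key.strip().lower(), value.strip())
    return info


def in_maintenance_mode(output):
    """Return True if the output reports 'maintenance mode = true'."""
    value = parse_server_info(output).get("maintenance mode", "")
    return value.lower() == "true"
```

Paste or pipe the text returned by getSvaServerInfo into in_maintenance_mode() to get a simple True/False answer.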
If maintenance mode is set to true, you can use the exitMaintenanceMode option to take the host out of this state:
"C:\Program Files\VMware\Infrastructure\jre\bin\java.exe" -jar WSCLI.jar <VSA IP Address> exitMaintenanceMode
After typing this command, wait a minute or two before doing another getSvaServerInfo to verify that the member has indeed exited maintenance mode (maintenance mode = false). At this point, the datastores should now start syncing.
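The wait-and-recheck step could be scripted along these lines. Treat this as a rough sketch: the install paths are assumed to be the default ones and may differ on your vCenter server, the helper names are hypothetical, and the retry count and delay are arbitrary values I picked rather than anything VMware recommends.

```python
import subprocess
import time

# Assumed default install locations; adjust these for your vCenter server.
JAVA = r"C:\Program Files\VMware\Infrastructure\jre\bin\java.exe"
TOOLS_DIR = r"C:\Program Files\VMware\Infrastructure\VSA Manager\tools"


def get_server_info(vsa_ip):
    """Run getSvaServerInfo against one VSA member and return its output."""
    result = subprocess.run(
        [JAVA, "-jar", "WSCLI.jar", vsa_ip, "getSvaServerInfo"],
        capture_output=True, text=True, cwd=TOOLS_DIR,
    )
    return result.stdout


def wait_until_out_of_maintenance(fetch, attempts=6, delay=30, sleep=time.sleep):
    """Poll fetch() until it reports 'maintenance mode = false'.

    fetch is any zero-argument callable returning getSvaServerInfo output.
    Returns True on success, False if every attempt still shows maintenance mode.
    """
    for attempt in range(attempts):
        if "maintenance mode = false" in fetch().lower():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give the member time to rejoin before rechecking
    return False
```

After running exitMaintenanceMode, you could call wait_until_out_of_maintenance(lambda: get_server_info("<VSA IP Address>")) with the real member address filled in.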
Of course, there may be a number of other reasons why your cluster is offline or why a host has entered maintenance mode. The point of this post is to highlight that 3 successive reboots within 15 minutes put a member into the silent maintenance mode state, and that these WSCLI commands can get your VSA out of it.
If in doubt, always reach out to your nearest VMware support representative.
Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage