I had been out for a few days working on-site with customers, and when
I came back into the lab this morning everything seemed fine.  It
wasn’t until later in the day, when I finally launched the Virtual
Infrastructure (VI) Client and connected to VirtualCenter,
that I realized one of my ESX hosts had crashed.  VMware HA had stepped
in and automatically restarted all the VMs on the second host, and the
other engineers who were there when the failure happened said they
didn’t even notice the failover.  Had I not logged in to the VI Client,
I wouldn’t have even known that I had a host failure.  It was that


  1. That would have been a great ad for VMware HA if the ESX node had been taken out by a power outage or disk failure…
    But I’m thinking… Hmm, how often does ESX crash then?

  2. I’ve only seen ESX crash due to a hardware failure, such as a bad CPU, failed RAM, or the like. In this particular case, that’s exactly what it was–a hardware failure. One of the CPUs failed. Now, after surgery, the box is back up and running and I’m hoping (fingers crossed) that I don’t have any further problems.

  3. At this time, alarms can be set on a host or a virtual machine. When are we going to be able to set an alarm on a cluster for events like this? How about a failed HBA path or NIC?

