I recently came across an interesting issue where a customer could not PXE boot their HP DL380G7 servers using Auto Deploy. Every PXE boot attempt failed with a “connection timed out” error. They opened a support case with HP and verified that the required updates were installed, but the “connection timed out” errors continued.
Long story short, the problem turned out not to be with the HP DL380G7 servers, the firmware, or the NIC drivers, as was initially suspected, but with the Spanning Tree Protocol (STP) settings on the switch ports. The customer discovered that the timeout occurred because PortFast had not been enabled on the switch ports. Once they enabled PortFast, the PXE boot worked as expected.
After reading up on the Spanning Tree Protocol and how PortFast works, I learned that when the ESXi host powered up and began the PXE boot, the switch port had to pass through the STP listening and learning states before transitioning to the forwarding state. With the default 802.1D timers, those two states take about 30 seconds in total, and that delay caused the PXE boot to time out. PortFast causes a switch port to enter the forwarding state immediately, bypassing the listening and learning states, which eliminates the delay and avoids the timeout.
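To make the timing concrete, here is a minimal sketch of the delay described above. It assumes the standard 802.1D default forward delay of 15 seconds, which applies once to the listening state and once to the learning state. The function names and the 20-second PXE timeout are hypothetical values chosen for illustration. They are not taken from the customer's environment, and real PXE and DHCP timeouts vary by NIC firmware.

```python
# Default 802.1D forward delay in seconds; a switch may be configured differently.
FORWARD_DELAY = 15


def seconds_until_forwarding(portfast: bool, forward_delay: int = FORWARD_DELAY) -> int:
    """Return the time from link-up until the port forwards frames."""
    if portfast:
        # PortFast skips listening and learning and forwards immediately.
        return 0
    # Listening and learning each last one forward-delay interval.
    return 2 * forward_delay


def pxe_boot_succeeds(portfast: bool, pxe_timeout: int,
                      forward_delay: int = FORWARD_DELAY) -> bool:
    """Hypothetical check: the port must be forwarding before the PXE client gives up."""
    return seconds_until_forwarding(portfast, forward_delay) < pxe_timeout


if __name__ == "__main__":
    pxe_timeout = 20  # hypothetical PXE client timeout, in seconds
    for portfast in (False, True):
        delay = seconds_until_forwarding(portfast)
        ok = pxe_boot_succeeds(portfast, pxe_timeout)
        print(f"PortFast={portfast}: port forwards after {delay}s, PXE boot ok={ok}")
```

Under these assumptions, a port without PortFast starts forwarding only after the hypothetical client has already given up, while a PortFast port is ready as soon as the link comes up.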
While researching this, I searched the VMware knowledge base and found KB1003804, which helped me understand a bit more about PortFast and why it’s a good idea to enable it even when you are not PXE booting your vSphere hosts.
Follow me on Twitter @VMwareESXi.