In my last post I gave a quick overview of how stateless caching works. In this post I want to begin looking at the role stateless caching plays in protecting against various outage scenarios.
When evaluating the benefits of stateless caching, it’s good to understand the possible failure scenarios that may limit a host’s ability to PXE boot. For example, there could be an outage with the underlying network infrastructure, one of the PXE boot components, the Auto Deploy server, or even the vCenter server, and the benefit of using stateless caching is different for each. Let’s start with a problem in the underlying network infrastructure that isolates an auto-deployed host from the network and therefore leaves it unable to boot.
Network Isolated Host
An isolated host is a server that has lost network connectivity. Because an Auto Deploy host relies on the network to PXE boot, if it ever becomes isolated it will not be able to boot. The screen shot below shows an example of an isolated host that has failed to PXE boot.
With stateless caching enabled, if a host becomes isolated and the PXE boot fails, the host can fall back to booting from the cached image that was saved to disk during the last successful PXE boot. This is done by configuring the host’s BIOS boot order so that it falls back to the local disk whenever a network boot fails. The image below shows an example of a host that has failed to network boot and has fallen back to booting from the disk.
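The fallback logic can be illustrated with a small model. This is only a sketch of the boot-order behavior described above, not VMware code: the `Host` class, its fields, and the `boot` function are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    """Hypothetical model of an Auto Deploy host; not a VMware API."""
    network_reachable: bool              # can the host reach the PXE infrastructure?
    cached_image: Optional[str]          # image saved to disk on the last successful PXE boot
    boot_order: tuple = ("network", "disk")  # BIOS boot order: network first, then local disk


def boot(host: Host) -> str:
    """Walk the boot order and report which device the host booted from."""
    for device in host.boot_order:
        if device == "network" and host.network_reachable:
            return "booted from network (PXE)"
        if device == "disk" and host.cached_image is not None:
            return "booted from cached image on disk"
    return "boot failed"
```

In this model an isolated host with a cached image boots from disk, while the same host with only `"network"` in its boot order (or with no cached image) fails to boot. That is why both the cached image and the BIOS boot order matter.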
It is important to note that simply booting an isolated host off the cached disk image will not fix the network outage. The host may boot, but it is still isolated and will not have network connectivity. Stateless caching is therefore not a solution for protecting against outages in the network infrastructure. Its benefit for a network-isolated host lies in helping to troubleshoot the outage and expediting problem diagnosis: once the host has booted, the administrator can access the host’s console and troubleshoot the network connectivity from there. The screen shot below shows an example of using the ping and traceroute commands from within the ESXi shell of a network-isolated host configured for stateless caching.
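The reasoning behind that kind of troubleshooting, probing outward from the host until something stops answering, can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the addresses in the usage note are placeholders, and `ping_once` assumes a Unix-style `ping` that accepts `-c <count>`.

```python
import subprocess
from typing import Callable, Iterable, Optional


def ping_once(address: str) -> bool:
    """Send a single ICMP echo request; True if a reply came back.

    Assumes a Unix-style `ping` that accepts `-c <count>`.
    """
    result = subprocess.run(
        ["ping", "-c", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_unreachable(
    targets: Iterable[str],
    probe: Callable[[str], bool] = ping_once,
) -> Optional[str]:
    """Probe targets in order, nearest first, and return the first that fails.

    Returns None when every target answers.
    """
    for address in targets:
        if not probe(address):
            return address
    return None
```

Calling `first_unreachable(["192.0.2.1", "192.0.2.10"])` with the default gateway first and a more distant server second would point at the nearest hop that does not respond, which is roughly what the administrator works out by hand with ping and traceroute at the console.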
When considering the benefit of stateless caching for network-isolated hosts, keep the following points in mind:
- Stateless caching by itself does not protect against network isolation. However, it can help facilitate troubleshooting, which may expedite problem diagnosis and resolution.
- Do not rely on stateless caching to protect against network outages. To protect against a vSphere host becoming network isolated, ensure you have adequate redundancy implemented throughout your network infrastructure.
Note: Auto Deploy stateless caching does not protect against network outages that may cause a vSphere host to become isolated. However, it does allow an isolated host to be booted, enabling the admin to access the host console, which can help with troubleshooting.
In my next post I’ll look at how stateless caching works in the face of a PXE infrastructure outage…
For notification of future posts, follow me on Twitter @VMwareESXi