[Updated 4-Nov-2011 with Patch Information]

An issue has recently been discovered with iSCSI on ESXi 5.0 which results in very slow boot times for ESXi 5.0 hosts. In ESXi 5.0, the iSCSI boot-up code has been modified to retry 9 times (versus no retry in 4.x) when running iSCSI discovery and login for the configured targets.

Why did we introduce this change? The change was introduced to fix an issue where the ESXi DHCP client had not yet obtained IP addresses for the vmknics when the iSCSI discovery and scan code started. This meant a rescan was needed after boot to discover iSCSI LUNs. We observed this behaviour a number of times in previous releases where the VMkernel port used for iSCSI was not plumbed up in time. The fix introduced in ESXi 5.0 added more iterations to the initial iSCSI code, so that it works even if the DHCP client is delayed in getting the vmknic addresses. 9 iterations were added on the assumption that each iteration would take ~10 seconds, giving a maximum wait time of 90 seconds.
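To make the arithmetic concrete, here is a minimal Python sketch of a bounded retry loop and its assumed worst-case wait. It is an illustration only, not the ESXi boot code: the function names and the `try_login` callback are hypothetical, and the only figures taken from the text above are the 9 iterations and the ~10 seconds assumed per iteration.

```python
from typing import Callable

RETRY_ITERATIONS = 9                 # iterations added in ESXi 5.0, per the post
ASSUMED_SECONDS_PER_ITERATION = 10   # the ~10 second assumption, per the post


def worst_case_wait(iterations: int = RETRY_ITERATIONS,
                    seconds_per_iteration: int = ASSUMED_SECONDS_PER_ITERATION) -> int:
    """Upper bound on the wait if every iteration runs and takes the assumed time."""
    return iterations * seconds_per_iteration


def login_with_retries(try_login: Callable[[], bool],
                       iterations: int = RETRY_ITERATIONS) -> int:
    """Call the hypothetical try_login until it succeeds or iterations run out.

    Returns the number of iterations actually used.
    """
    for used in range(1, iterations + 1):
        if try_login():
            return used
    return iterations


if __name__ == "__main__":
    print(worst_case_wait())                  # 9 * 10 = 90
    print(login_with_retries(lambda: False))  # a login that never succeeds uses all 9
```

A login that succeeds once DHCP has assigned an address exits the loop early. A login that can never succeed always pays for every iteration.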

Unfortunately we overlooked some iSCSI configurations (with multiple network portals, discovery addresses and targets) where some VMkernel ports bound to the iSCSI initiator would never be able to log in to some targets because they are on different network segments. This is common where customers use iSCSI with targets on two completely separate network fabrics, i.e. only some targets are accessible on each of the VMkernel ports. The failed logins also caused the retry iterations to kick in, and this is the root cause of the long boot-up problem.
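As a rough way to reason about such a configuration, the Python sketch below lists the VMkernel port and target combinations that sit on different subnets. It is a simplified model with made-up addresses and hypothetical names. It assumes there is no routing between the two fabrics, so a port can only reach targets in its own subnet. It does not query an ESXi host.

```python
import ipaddress
from typing import Dict, List, Tuple


def unreachable_pairs(vmknics: Dict[str, str],
                      targets: List[str]) -> List[Tuple[str, str]]:
    """Return (vmknic, target) pairs where the target is outside the vmknic's subnet.

    vmknics maps a port name to its address in CIDR form, e.g. "10.0.1.10/24".
    Assumption: no routing between segments, so off-subnet means unreachable.
    """
    pairs = []
    for name, cidr in vmknics.items():
        network = ipaddress.ip_interface(cidr).network
        for target in targets:
            if ipaddress.ip_address(target) not in network:
                pairs.append((name, target))
    return pairs


if __name__ == "__main__":
    # Hypothetical example: two bound ports on two separate fabrics.
    vmknics = {"vmk1": "10.0.1.10/24", "vmk2": "10.0.2.10/24"}
    targets = ["10.0.1.50", "10.0.2.50"]
    for port, target in unreachable_pairs(vmknics, targets):
        print(f"{port} cannot reach {target}")
```

In this example each port can reach only one of the two targets, so two of the four combinations can never log in. Those are the combinations that trigger the retries described above.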

[Updated] VMware have just published a Knowledge Base (KB) article detailing the problem: KB 2007108. A patch to address the issue is also available as of November 2011. The KB article explains where to download the patch and how to apply it to your host.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage