From the Trenches

Large I/O block size operations show high latency on Windows 2008

Nathan Small

When performing a task that uses large I/O block sizes in Windows 2008, performance charts in vCenter or ESXTOP display very high device latency (DAVG), even though the actual throughput is excellent.

Unlike previous Windows releases, Windows 2008 can issue much larger I/O block sizes for certain operations, including file copy. The behavior can also be reproduced with other applications, such as IOmeter configured with large I/O block size test parameters.

This issue can also be reproduced in a Linux VM with the ‘dd’ command by issuing commands with a large block size (e.g. 1M).
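As a sketch, the Linux reproduction might look like the following. The target path, block size, and count are illustrative; in a real test you would write to a file on the virtual disk under investigation, and you could add `oflag=direct` to bypass the guest page cache (not supported on every filesystem).

```shell
# Issue writes with a large (1 MB) block size from a Linux guest.
# /tmp/bigio.bin is an illustrative target path.
dd if=/dev/zero of=/tmp/bigio.bin bs=1M count=16

# Confirm that 16 MiB were written.
stat -c%s /tmp/bigio.bin
```

While this runs, DAVG for the backing device can be watched in ESXTOP to observe the inflated latency described above.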

High latency is observed because the large I/O block size must be split into smaller I/Os before it can be transmitted to the storage device. When this split occurs, latency is measured against the entire command, not the individual chunks, so the measurement waits for all chunks to complete at the array before latency is reported. The result is a false positive: high latency is reported against the array when in reality there is no performance problem whatsoever.
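To make the accounting effect concrete, here is a back-of-the-envelope sketch with hypothetical numbers (the chunk size, per-chunk service time, and serial completion are assumptions for illustration, not values from this article; in practice chunks may be issued in parallel, which shrinks the effect, but the command-level accounting is the same):

```shell
# Hypothetical numbers: a 1 MB guest I/O split into 32 KB chunks,
# each serviced by the array in ~5 ms, completing one after another.
IO_SIZE_KB=1024
CHUNK_KB=32
PER_CHUNK_MS=5

CHUNKS=$(( IO_SIZE_KB / CHUNK_KB ))        # 32 chunks per guest command
REPORTED_MS=$(( CHUNKS * PER_CHUNK_MS ))   # latency reported for the whole command

echo "${CHUNKS} chunks, ~${REPORTED_MS} ms reported for the command"
```

So a perfectly healthy ~5 ms array can still show a triple-digit DAVG for the command, which is exactly the false positive described above.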

Note: Reducing the Disk.DiskMaxIOSize advanced setting on the ESX host, as described in KB article Tuning ESX/ESXi for better storage performance by modifying the maximum I/O block size (1003469), will not improve the latency results, since the Guest OS is the one issuing such a large I/O block size. The Windows registry can be altered to issue smaller I/O block sizes, resulting in lower reported latency; however, as stated earlier, the high latency is merely a false positive.
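For reference, on ESXi builds with the esxcli namespace (5.x and later; older ESX releases used esxcfg-advcfg instead), the setting can be inspected and changed from the host shell. This is a sketch only; as noted above, changing it does not help when the guest itself issues the large I/Os.

```shell
# Inspect the current maximum I/O size the host will pass down (value in KB).
esxcli system settings advanced list -o /Disk/DiskMaxIOSize

# Lowering it (here to 4096 KB) only changes how the *host* splits I/Os,
# not how the Guest OS issues them.
esxcli system settings advanced set -o /Disk/DiskMaxIOSize -i 4096
```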

While one may think that a VM migration or VM deployment would count as a large block operation, in reality the vmkernel issues I/Os with a 64 KB block size. This is significantly smaller than what the above-mentioned Guest OS will issue, so no high latency is observed for these operations.

Nathan Small (Twitter handle: vSphereStorage) is a Senior Escalation Engineer on the storage team in Global Support Services, and has been employed with VMware since May 2005.



  1. Is there any relation between SPLTCMD/s and GAVG/rd or GAVG/wr latencies?
    I am confused, as I see high SPLTCMD/s where latency is high for Guest reads or writes.

    Basically, is SPLTCMD/s for multipathing, or for I/O sizes, or partition boundary conditions?
    Does this affect latency per command? E.g., if GAVG/cmd = 300 ms and SPLTCMD/s = 30,
    does this mean that latency per command is now 10 ms (for whatever MB/s throughput)?
