
Here’s a behavior that a few people have asked me about recently:

Why is PVSCSI splitting my large guest operating system IOs into smaller blocks?

By default, ESXi passes IOs from the guest operating system as large as 32MB when using the LSI vSCSI adapter (as long as the guest operating system itself doesn’t have a smaller default transfer size: http://kb.vmware.com/kb/9645697).

However, the PVSCSI vSCSI adapter was designed to pass IOs of 512KB or smaller, so anything larger is intentionally split. This behavior is not configurable. In performance testing, the split didn’t make a large difference, since the larger IOs are split sequentially.

There is one last condition: the physical adapter device driver can request that ESXi split the IO even further. If, for example, the driver is limited to 64KB IOs, it will tell ESXi to split IOs into blocks of at most 64KB instead of the 512KB default.
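To make the two levels of splitting concrete, here is a minimal sketch in Python. It is only an illustration of the arithmetic, not how ESXi implements it: the `split_io` helper and the 2MB guest IO are hypothetical, and the 512KB and 64KB limits are the ones from the text above.

```python
KB = 1024

def split_io(size_bytes, max_bytes):
    """Return the fragment sizes produced when one IO is capped at max_bytes."""
    if size_bytes <= 0 or max_bytes <= 0:
        raise ValueError("sizes must be positive")
    full, rest = divmod(size_bytes, max_bytes)
    # All fragments are max_bytes, except possibly a smaller trailing one.
    return [max_bytes] * full + ([rest] if rest else [])

PVSCSI_MAX = 512 * KB   # PVSCSI limit described in the post
DRIVER_MAX = 64 * KB    # example physical driver limit from the post

guest_io = 2 * 1024 * KB  # hypothetical 2MB guest IO

# First split at the vSCSI layer, then again for the physical driver.
pvscsi_fragments = split_io(guest_io, PVSCSI_MAX)
driver_fragments = [
    piece
    for fragment in pvscsi_fragments
    for piece in split_io(fragment, DRIVER_MAX)
]

print(len(pvscsi_fragments), len(driver_fragments))  # prints: 4 32
```

So in this example a single 2MB guest IO becomes four 512KB IOs at the PVSCSI layer, and 32 IOs of 64KB by the time it reaches the driver.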

It’s important to note, though, that large IOs can also have a negative effect on performance.

Examples:

If the ESXi storage stack is splitting guest IOs, it must wait for all of the array acknowledgements before it can report the final latency, which drives up device average latencies. More info here: http://kb.vmware.com/kb/2036863

Also, some storage arrays do not handle large IOs well, and that becomes a latency-inducing bottleneck.
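The first example can be illustrated with a rough model. This is a simplification I am assuming for illustration, not ESXi's actual accounting: the original IO only completes once every fragment has been acknowledged, so its latency is the sum of the fragment latencies if they are issued one after another, or the slowest fragment if they are in flight together. The function name and the latency numbers are made up.

```python
def split_io_latency_ms(fragment_latencies_ms, sequential=True):
    """Latency of the original IO, reported only after every fragment is acknowledged."""
    if not fragment_latencies_ms:
        raise ValueError("need at least one fragment")
    if sequential:
        # Fragments issued back to back: the waits add up.
        return sum(fragment_latencies_ms)
    # Fragments in flight together: the slowest one decides.
    return max(fragment_latencies_ms)

# Hypothetical per-fragment latencies (ms) for one guest IO split four ways.
fragments = [2.0, 2.0, 2.0, 9.0]

print(split_io_latency_ms(fragments))                    # prints: 15.0
print(split_io_latency_ms(fragments, sequential=False))  # prints: 9.0
```

Either way, one slow acknowledgement from the array is enough to inflate the latency reported for the whole IO.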

So large IOs are not necessarily the promised land either and, like most situations, it depends on your application and infrastructure.

There is an ESXi host-wide setting that allows you to define the maximum IO size passed to the array. It’s known as “Disk.DiskMaxIOSize”. More info here: http://kb.vmware.com/kb/1003469

So if you are seeing different IO sizes out of ESXi than you expect, you should check a few different layers: guest operating system transfer size, vSCSI driver, ESXi DiskMaxIOSize and the physical adapter driver.
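As a quick way to reason about those layers, the largest IO that can reach the array is roughly the smallest limit along the path. The sketch below assumes that simple model. The layer values are illustrative only: the guest transfer size is hypothetical, the Disk.DiskMaxIOSize value is what I understand the default to be (32767 KB, check your own host), and the 512KB and 64KB figures come from the examples above.

```python
KB = 1024

def smallest_limit(limits):
    """Return (layer_name, limit_bytes) for the most restrictive layer."""
    if not limits:
        raise ValueError("no layers given")
    return min(limits.items(), key=lambda item: item[1])

# Illustrative values only; substitute what you find in your environment.
layers = {
    "guest OS transfer size": 1024 * KB,   # hypothetical guest setting
    "vSCSI adapter (PVSCSI)": 512 * KB,    # PVSCSI limit from the post
    "Disk.DiskMaxIOSize": 32767 * KB,      # assumed default, set in KB
    "physical adapter driver": 64 * KB,    # example driver limit from the post
}

layer, limit = smallest_limit(layers)
print(f"{layer} caps IOs at {limit // KB}KB")
# prints: physical adapter driver caps IOs at 64KB
```

If the size you observe doesn’t match the smallest limit you know about, that usually means one of the layers has a value you haven’t checked yet.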