Are your partition offsets aligned? Be honest!
[Update: see below]
Rich Bocchinfuso of GotITSolutions recently was at VMworld. He's been wondering about how the virtualization world attends to storage and I/O issues.
This [lack of familiarity] is understandable, as the target audience for VMware has traditionally been the server engineering team and/or developers and not the storage engineers, hence the probable lack of a detailed understanding of storage interconnects.
With VMware looking for greater adoption rates in the corporate production IT environment by leveraging new value propositions focused on business continuity, disaster recovery and a host of others, virtualized servers will demand high I/O performance characteristics from both a transaction and a bandwidth perspective. Storage farms will grow and become more sophisticated, and more attention will be paid to integrating VMware technology with complex storage technologies such as platform-based replication (e.g., EMC SRDF), snapshot technology (e.g., EMC TimeFinder) and emerging technologies like CDP (continuous data protection).
A practical example of what I believe has been a lack of education around storage and storage best practices is that many VMware users are, I believe, unaware of partition offset alignment. Offset alignment is a best practice that absolutely should be followed. It is not a function or responsibility of VMware, but it is often overlooked. (Engineers who grew up in the UNIX world and are familiar with command strings like “sync;sync;sync” typically align partition offsets, but admins who grew up in the Windows world, I find, often overlook offset alignment unless they are very savvy Exchange or SQL performance gurus.) Windows users have become accustomed to partitioning using Disk Manager, from which it is not possible to align offsets; diskpar must be used to partition and align offsets.
I would be interested in some feedback on how many VMware / Windows users did not do this during their VMware configuration or Windows VM install. Be honest! If you are not using diskpar to create partitions and align offsets, it means that we need to do a better job educating.
[Update: a commenter asked for more clarification of what in the world we're talking about. The definitive guide is Recommendations for Aligning VMFS Partitions, and the short answer is that you're fine as long as you're using VMware Infrastructure 3's VirtualCenter or the VI Client to create your VMFS partitions on your SAN. I blogged about this earlier and pointed to a nice article with some clarifying diagrams.]

Umm... let's just say you need to do a better job educating. No idea what any of this means, and I've installed dozens of ESX farms.
Posted by: Lock | December 09, 2006 at 09:35 AM
With ESX3, I understand that partitioning alignment has gone away. Are you suggesting this is not the case, and that with ESX 3 we still need to align our partitions?
Posted by: Rob | December 09, 2006 at 04:39 PM
You still need to align the boot partition of the guest OS, along with any data partition. The alignment that is done with the VC GUI covers only the VMFS partition.
Posted by: Ryan | December 11, 2006 at 01:55 PM
The problem is inherent to the array itself, not to there being "63 sectors per track" (which is actually just nonsense). ESX Server couldn't care less about CHS addressing (with the one exception of fdisk in the Console OS), and nobody has kept track of real geometries on drives for 10 or 15 years.
PC disks (until GPT comes around) reserve the first 63 sectors of the disk for things like the partition table and the master boot record. The first partition starts at LBA 63 (start enumerating from 0). Sectors on the disk are 512 bytes, so the starting position of the first partition is at 63 x 512 = 32256 bytes.
Arrays like the Symmetrix and CLARiiON do not take PC architecture into account when creating LUNs. They work by striping data in chunks (in the case of the Symmetrix I believe it's 64K) across the physical disks in the array. The array isn't smart enough to special-case the first 63 sectors of the LUN, so you end up with a sector fragmented across two disks in the stripe.
This generally wouldn't be a problem, except that the caching mechanism on the array controller gets confused and creates extra cache hits (as I understand it) due to data not fitting properly into the confines of the RAID stripe.
By changing the starting point of the first primary partition to sector 128 (65536 bytes in), you can fool the caching mechanism and eke out a small percentage more performance.
Posted by: Patrick | December 14, 2006 at 06:08 PM
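For anyone who wants to check the numbers in Patrick's comment, here is a minimal Python sketch of the arithmetic. It assumes 512-byte sectors and a 64K stripe chunk, which are the figures quoted in the comment (the 64K value was itself a guess, so adjust it for your array). The function names are made up for illustration and are not part of any VMware or EMC tool.

```python
SECTOR_BYTES = 512          # sector size quoted in the comment
STRIPE_BYTES = 64 * 1024    # assumed stripe chunk size (the comment's 64K guess)


def partition_start_bytes(start_lba: int) -> int:
    """Byte offset of a partition that begins at the given LBA."""
    return start_lba * SECTOR_BYTES


def is_aligned(start_lba: int, stripe_bytes: int = STRIPE_BYTES) -> bool:
    """True if the partition start falls exactly on a stripe chunk boundary."""
    return partition_start_bytes(start_lba) % stripe_bytes == 0


def ios_crossing_boundary(start_lba: int, io_bytes: int, count: int,
                          stripe_bytes: int = STRIPE_BYTES) -> int:
    """Count how many of `count` back-to-back I/Os of `io_bytes` each,
    issued from the start of the partition, straddle a chunk boundary."""
    base = partition_start_bytes(start_lba)
    crossing = 0
    for i in range(count):
        first = base + i * io_bytes        # first byte touched by this I/O
        last = first + io_bytes - 1        # last byte touched by this I/O
        if first // stripe_bytes != last // stripe_bytes:
            crossing += 1
    return crossing


if __name__ == "__main__":
    for lba in (63, 128):
        print(lba, partition_start_bytes(lba), is_aligned(lba),
              ios_crossing_boundary(lba, 4096, 16),
              ios_crossing_boundary(lba, 65536, 4))
```

Under these assumptions, a partition starting at LBA 63 begins 32256 bytes in and is not on a chunk boundary, so one in every sixteen 4K I/Os (and every 64K I/O) touches two chunks. Starting at LBA 128 puts the partition at 65536 bytes, and none of those I/Os straddle a boundary.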