Are your partition offsets aligned? Be honest!

[Update: see below]

Rich Bocchinfuso of GotITSolutions was recently at VMworld. He has been wondering about how the virtualization world attends to storage and I/O issues.

This [lack of familiarity] is understandable, as the target audience for VMware has
traditionally been the server engineering team and/or developers and
not the storage engineers; hence the probable lack of a detailed
understanding of storage interconnects.

With VMware looking for greater adoption rates in the corporate
production IT environment by leveraging new value propositions focused
on business continuity, disaster recovery, and a host of others,
virtualized servers will demand high I/O performance characteristics
from both a transaction and a bandwidth perspective. Storage farms will
grow and become more sophisticated, and more attention will be paid to
integrating VMware technology with complex storage technologies such as
platform-based replication (e.g., EMC SRDF), snapshot technology (e.g., EMC TimeFinder), and emerging technologies like CDP (continuous data protection).

A practical example of what I believe has been a lack of education
around storage and storage best practices is that many VMware users
appear to be unaware of partition offset alignment. Offset alignment is
a best practice that absolutely should be followed. It is not a function
or responsibility of VMware, but it is an often overlooked best
practice: engineers who grew up in the UNIX world and are familiar with
command strings like "sync;sync;sync" typically align partition offsets,
but admins who grew up in the Windows world often overlook offset
alignment unless they are very savvy Exchange or SQL performance gurus.
Windows users have become accustomed to partitioning with Disk Manager,
which cannot align offsets; diskpar must be used to partition and align offsets.

I would be interested in some feedback on how many VMware / Windows
users skipped this step when configuring their Windows VMs. Be honest!
If you are not using diskpar to create partitions and align offsets, it
means we need to do a better job educating.

[Update: a commenter asked for more clarification of what in the world we’re talking about. The definitive guide is Recommendations for Aligning VMFS Partitions, and the short answer is that you’re fine as long as you’re using VMware Infrastructure 3’s VirtualCenter or the VI Client to create your VMFS partitions on your SAN. I blogged about this earlier and pointed to a nice article with some clarifying diagrams.]


4 comments have been added so far

  1. Umm… let’s just say you need to do a better job educating. No idea what any of this means, and I’ve installed dozens of ESX farms.

  2. With ESX3, I understand that partitioning alignment has gone away. Are you suggesting this is not the case, and that with ESX 3 we still need to align our partitions?

  3. You still need to align the boot partition of the guest OS, along with any data partition. The alignment that is done with the VC GUI covers only the VMFS partition.

  4. The problem is inherent to the array itself and not that there are “63 sectors per track” (which is actually just nonsense). ESX Server couldn’t care less about CHS addressing (with the one exception of fdisk in the Console OS), and nobody has kept track of real geometries on drives for 10 or 15 years.
    PC disks (until GPT comes around) use the first 63 sectors of the disk to fill with things like the partition table and the master boot record. The first partition starts at LBA 63 (start enumerating from 0). Sectors on the disk are 512 bytes, so the starting position of the first partition is at 32256 bytes.
    Arrays like the Symmetrix and Clariion do not take PC architecture into account when creating LUNs. They work by striping data (in the case of the Symmetrix I believe it’s 64K) in chunks on each of the physical disks in the array. The array isn’t smart enough to special case the first 63 sectors of the LUN so you end up with a sector fragmented across two disks in the stripe.
    This generally wouldn’t be a problem, except that the caching mechanism on the array controller gets confused and creates extra cache hits (as I understand it) due to data not fitting properly into the confines of the raid stripe.
    By changing the starting point of the first primary partition to sector 128 (65536 bytes in), you can fool the caching mechanism and eke out a small percentage more of performance.
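The commenter's arithmetic can be sketched in a few lines of Python: with an assumed 64 KB stripe element, a 64 KB I/O issued at the start of a partition beginning at LBA 63 straddles two stripe elements, while the same I/O against a partition starting at LBA 128 stays within one. (The stripe size is the Symmetrix figure the commenter cites, used here only for illustration.)

```python
SECTOR = 512
STRIPE = 64 * 1024  # stripe element size cited for the Symmetrix; illustrative

def stripe_elements_touched(start_sector, io_bytes):
    """Number of stripe elements a contiguous I/O at this offset spans."""
    start = start_sector * SECTOR           # LBA 63 -> byte 32256
    end = start + io_bytes - 1
    return end // STRIPE - start // STRIPE + 1

# 64 KB I/O at the start of a partition beginning at LBA 63
print(stripe_elements_touched(63, 64 * 1024))   # 2 (crosses a stripe boundary)
# Same I/O with the partition aligned at LBA 128
print(stripe_elements_touched(128, 64 * 1024))  # 1
```

Every misaligned stripe-sized I/O costs an extra backend disk touch, which is where the small but measurable performance penalty comes from.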
