UPDATE: Now that Virtual SAN is out of beta and generally available, there is updated guidance around sizing. The information below was written during the beta and has now been superseded by the new guidance. Please read this blog post for the latest recommendations or directly download the GA edition of the Virtual SAN Design and Sizing Guide.
A common question that I have seen recently around Virtual SAN (VSAN) is how limits on the number of components, disks, etc., translate into capacity and policy limits. I will attempt to cover some of the basics in this post. However, for a deeper explanation of the various sizing and design considerations, please check out the updated Design & Sizing Guide which you can find in the VSAN beta community documents section in the POC Kit folder. If you are not yet signed up for the VSAN beta, why not? Click here to register. The beta community has a wealth of information, including documentation, hardware guidance and some great discussions with our R&D engineers. It is a great place to start if you just want to read up on Virtual SAN, or indeed, kick its proverbial tires.
Let’s begin with components. I have put a deeper description of objects and components in an earlier post here. However, in a nutshell, a virtual machine can have a policy which defines stripe width and/or availability through mirroring. These stripes and replicas are made up of components. There is a maximum of 3000 components per host. This is an important consideration if you wish to use policies that have a high stripe width or a high failures to tolerate setting, since each of these contributes towards component consumption, and each virtual machine deployed on VSAN with that policy will consume that many components. However, you would need to have a lot of VMs, with a large stripe width and a large failures to tolerate setting, before getting close to this limit.
How many disks?
The next discussion concerns the number of disks that can be deployed in VSAN. Again, there are some limits around this that need to be considered by someone designing a VSAN implementation. VSAN uses the concept of disk groups to act as a container for HDDs and SSDs. A disk group can contain only one SSD but, since the November 2013 VSAN beta refresh, a disk group can now contain up to 7 HDDs (magnetic disks). A host can have a maximum of 5 disk groups. This means, using some simple math, that a single host in a VSAN cluster can have 5 SSDs and 35 HDDs. However, you need to ensure that your storage controller can manage that many disk drives, and that is a conversation you need to have with your hardware vendor. You should also check the VSAN HCL for a list of supported storage controllers (this is still a work in progress and new controllers are being validated and tested all the time). Also remember, VSAN supports scale out. So you can start small, and build out larger environments over time, including hot-adding of disks to servers and the hot-adding of hosts to the VSAN cluster. Currently we support a maximum of 8 hosts in a cluster, so some more simple math gives you a maximum of 40 SSDs and 280 HDDs in a fully configured VSAN cluster.
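The simple math above can be sketched in a few lines of Python. This is only an illustration using the beta-era limits quoted in this post; the function name max_disks and the constant names are my own and do not correspond to any VSAN tool or API.

```python
# Limits as quoted in this post (beta-era values; check current guidance).
SSDS_PER_DISK_GROUP = 1
HDDS_PER_DISK_GROUP = 7
DISK_GROUPS_PER_HOST = 5
HOSTS_PER_CLUSTER = 8


def max_disks(hosts=HOSTS_PER_CLUSTER, disk_groups=DISK_GROUPS_PER_HOST):
    """Return (ssd_count, hdd_count) for fully populated disk groups."""
    if not 1 <= hosts <= HOSTS_PER_CLUSTER:
        raise ValueError("hosts must be between 1 and %d" % HOSTS_PER_CLUSTER)
    if not 1 <= disk_groups <= DISK_GROUPS_PER_HOST:
        raise ValueError("disk_groups must be between 1 and %d" % DISK_GROUPS_PER_HOST)
    total_groups = hosts * disk_groups
    return (total_groups * SSDS_PER_DISK_GROUP,
            total_groups * HDDS_PER_DISK_GROUP)


print(max_disks(hosts=1))  # (5, 35): a single host
print(max_disks())         # (40, 280): a fully configured 8-host cluster
```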
How much HDD capacity do I actually need?
Now that we know the disk limits, how much capacity do I actually need?
The “FailuresToTolerate” policy setting plays an important role in this consideration. There is a direct relationship between the number of failures to tolerate and the number of replicas of a virtual machine’s storage. For example, if the number of failures to tolerate is set to 1 in the VM storage policy, then there is a single mirror of the VMDK created on local disks on another host, giving two copies in total. If the number of “FailuresToTolerate” (FTT) is set to two, then there are two additional replicas of the VMDK across the cluster, giving three copies in total. The following formula can assist in calculating how much HDD one needs:
How much HDD do I need = VMDK Size * (FTT + 1)
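As a quick worked example of the formula above, here is a minimal Python sketch. The function name hdd_capacity_needed and the 100 GB VMDK are made up for illustration, and the calculation covers only the replica overhead described in this post, nothing else.

```python
def hdd_capacity_needed(vmdk_size_gb, ftt=1):
    """HDD capacity = VMDK size * (FTT + 1), per the formula in this post."""
    if vmdk_size_gb < 0 or ftt < 0:
        raise ValueError("vmdk_size_gb and ftt must be non-negative")
    return vmdk_size_gb * (ftt + 1)


# Hypothetical 100 GB VMDK
print(hdd_capacity_needed(100, ftt=1))  # 200
print(hdd_capacity_needed(100, ftt=2))  # 300
```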
How much SSD capacity do I need?
There are some additional considerations when trying to figure out how much SSD you need for Virtual SAN. Internally at VMware we use a rule of thumb that SSD capacity in the VSAN cluster should be approximately 10% of HDD capacity. This rule of thumb is used to reflect the average working data set of an application running in a VM. Whilst not perfect, VMware feels this is adequate for ball-park sizing of SSD capacity.
So, as per the previous example, with a default policy value of “FailuresToTolerate” (FTT) set to 1, write cache will be mirrored since writes go to the SSD on both hosts before being de-staged to magnetic disks on those hosts. This means you need to consider increasing the amount of SSD allocated per virtual machine as the “FailuresToTolerate” policy setting increases.
The amount of SSD can be calculated using the following formula:
How much SSD do I need = (VMDK Size * 10%) * (FTT + 1)
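The same kind of sketch works for the SSD formula. Again, the function name ssd_capacity_needed and the 100 GB VMDK are illustrative only, and the 10% figure is the ball-park rule of thumb described above rather than a hard requirement.

```python
SSD_PERCENT_OF_VMDK = 10  # rule of thumb from this post, not a hard limit


def ssd_capacity_needed(vmdk_size_gb, ftt=1):
    """SSD capacity = (VMDK size * 10%) * (FTT + 1), per the formula in this post."""
    if vmdk_size_gb < 0 or ftt < 0:
        raise ValueError("vmdk_size_gb and ftt must be non-negative")
    return vmdk_size_gb * SSD_PERCENT_OF_VMDK / 100 * (ftt + 1)


# Hypothetical 100 GB VMDK
print(ssd_capacity_needed(100, ftt=1))  # 20.0
print(ssd_capacity_needed(100, ftt=2))  # 30.0
```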
However, additional considerations come into play when you wish to ensure that there is spare capacity to handle failures in the cluster and still have optimally configured virtual machines. VSAN does of course offer high availability for virtual machines through the use of policies and replicas. But what if you want hot-spares? This is something that has come up a lot in conversations, and the answer is that the whole cluster can act as the hot-spare. But you must provision for it. This means that should a failure occur, your virtual machines will still be available, but you can now have VSAN rebuild those components that were on the failed host or disk to have your virtual machine tolerate another future failure in the cluster.
To reiterate, this is optional since a failure will not impact your virtual machines if ‘FailuresToTolerate’ is set to at least 1. The largest failure in the cluster could be a complete host failure. Therefore you will need to ensure that there is enough free SSD & HDD available in the cluster to tolerate at least one host failure if you want your virtual machine storage objects to be rebuilt and come back into compliance after a failure occurs. If you have not followed recommendations and are using non-uniform host configurations with different disk capacities, you will need to ensure that there is enough capacity in the cluster to tolerate a failure of the largest host.
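One rough way to sanity-check this headroom is sketched below in Python. It is a deliberate simplification and not how VSAN itself decides whether a rebuild is possible: it only compares the consumed capacity (including all replicas) against what remains once the largest host is removed, and it ignores placement rules such as keeping replicas on separate hosts. The function name and the capacities are hypothetical, and you would run the check separately for HDD and SSD.

```python
def can_rebuild_after_host_failure(host_capacities_gb, used_gb):
    """Rough check: does used capacity still fit after losing the largest host?

    Simplifying assumption: only raw capacity is compared; component
    placement constraints are ignored.
    """
    if len(host_capacities_gb) < 2:
        raise ValueError("need at least two hosts to survive a host failure")
    remaining_gb = sum(host_capacities_gb) - max(host_capacities_gb)
    return used_gb <= remaining_gb


# Uniform cluster: four hosts with 10,000 GB of HDD each (hypothetical)
print(can_rebuild_after_host_failure([10000, 10000, 10000, 10000], 28000))  # True
print(can_rebuild_after_host_failure([10000, 10000, 10000, 10000], 32000))  # False

# Non-uniform cluster: the largest host (20,000 GB) sets the headroom needed
print(can_rebuild_after_host_failure([10000, 10000, 10000, 20000], 32000))  # False
```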
So as you can see, there are a lot of things to consider when trying to get the design and sizing of a Virtual SAN just right. For a definitive guide, head over to the VSAN beta community and get the latest Design & Sizing Guide for more information.
Get notifications of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage