Architecture

Virtual Volumes and the SDDC

I saw a question the other day that asked “Can someone explain what the big deal is about Virtual Volumes?”   A fair question.

The shortest, easiest answer is that VVols offer per-VM management of storage that helps deliver a software defined datacenter.

That, however, is a pretty big statement that requires some unpacking.  Rawlinson has done a great job of showcasing Virtual Volumes already, and has talked about how it simplifies storage management, puts the VMs in charge of their own storage, and gives us more fine-grained control over VM storage.  I myself will also dive into some detail on the technical capabilities in the future, but first let’s take a broader look at why this really is an important shift in the way we do VM storage.

VVols enable a number of things we can do better: cloning and snapshotting can be offloaded to the arrays, for example, and array-level snapshots can now be taken on a per-VM basis.  While these are important functions, they are evolutionary in nature, offering a much better way to do the same kinds of things we've done in the past.  The 'big deal' is the shift in the way storage is delivered, and that shift is best understood in the context of storage policy based management (SPBM).  SPBM, in turn, can only be fully understood in the context of software defined storage.  And what is software defined storage except one aspect of the software defined datacenter?

An SDDC is a very different way of doing things that has been emerging over the last few years.  Equipment in a datacenter provides capabilities, but the problem is that the demands on a datacenter are very broad: It must be built to serve every possible requirement, from minute to massive.  In order to do this with any sort of manageability, datacenters need to be built in large structures: A cluster of compute servers, a bunch of arrays to provide storage, network gear of particular sizes and types.  Into these broad structures we throw all the workloads with those diverse minute and massive requirements.

The problem is that these broad structures have specific capabilities that we are then tying our workloads to.  CPU instruction sets, MTU sizes, RAID levels and whatnot are defined by the structures, and then inherited by the workloads.  We build silos of equipment to provide different capabilities, and these silos are based on the features of the physical devices that constitute them.

In a legacy storage model there will be individual elements of storage (a SAN, a JBOD, and so forth) that will each have a particular set of capabilities and must each be understood as discrete entities.  Workloads must be deployed in the particular location on the particular storage component on the particular LUN that will most closely line up with the requirements of the application.

Sometimes the storage components cannot deliver the required capabilities and the deployment of a new system becomes a compromise: One might reduce the requirements of the application, or increase the number of LUNs in the environment to meet the requirement, or deploy with a “close enough” mentality and place the service on whatever storage most closely matches its demands.

The SDDC is designed to fix this.  The infrastructure components still have unique capabilities, but in an SDDC the way workloads consume those capabilities shifts.  Services are no longer positioned to consume the physical gear directly; instead they consume the features that the physical gear advertises, abstracted into software-based definitions.

It is a pretty massive shift: It means that workloads can self-describe their requirements of the infrastructure and a software layer can understand those requirements and intelligently place a workload where it will receive the results it requires.

A few examples:

1)    From a CPU perspective VMware has been doing this for years: software-defining clusters of hardware and, with features like EVC, abstracting away specific hardware details.  We place the workloads on a pool of compute resources instead of on individual hosts.

2)    NSX offers software abstraction of the physical network, allowing capabilities to be created and consumed that don't even exist on the physical network itself.  Workloads can self-describe their network configuration, security, and transport rules, and have those policies follow them around the network.

3)    Virtual Volumes allow individual VMs to have a set of storage policies that will be satisfied by the arrays on a VM-by-VM basis.  This inverts the model whereby the storage array dictates to the VMs placed on it what their capabilities will be, as delivered by the LUN on which they reside.  Instead each VM dictates its requirements to the storage array, and the array instantiates each individual component as its own unique entity with its own unique policies.

So how does a VVol fit into this broader picture?  An SDDC requires software abstraction of the physical components, and software defined storage is one major component of this vision alongside network and compute virtualization.  Software defined storage gives the ability to abstract and virtualize the storage plane of a datacenter, and align its consumption to the requirements defined by policies in a control plane.

SDS removes a lot of the compromises of traditional datacenter storage, by abstracting all these components and surfacing their capabilities to a storage policy based management engine in the control plane.

SPBM takes all the capabilities of the SDS data plane and offers a consumption model built around profiles and policy based deployment.  The administrator can create different profiles based around the capabilities of the infrastructure; let's call them Gold, Silver and Bronze.  Maybe instead it's Mission Critical, Tier 1, Tier 2.  Maybe it's Fast, Medium, Slow.

It doesn't particularly matter.  The point is that the consumer of storage (the IT administrator, the application owner, or whoever is deploying the application) doesn't need to know a thing about what constitutes and delivers these service levels.  They simply deploy a new VM and attach the appropriate profile to it.


“Gold” might require fully mirrored data that is not deduplicated and resides entirely on flash devices.  VMs with this policy may end up on a VVol enabled array.  “Silver” may require striping and mirroring and be compatible with an all-flash Virtual SAN.  “Bronze” may demand deduplication and compression and yet sit on SATA disk on the same VVol array as “Gold”.
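To make those service levels a little more concrete, here is a minimal sketch in Python of how the three profiles described above might be expressed as sets of required capabilities.  It is purely illustrative; the capability names are assumptions for the example, not identifiers from any real array or from the vSphere APIs.

```python
# Minimal illustrative sketch: the three service levels as capability requirements.
# The capability names are assumptions for this example, not real array identifiers.

STORAGE_PROFILES = {
    # "Gold": fully mirrored, not deduplicated, resides entirely on flash
    "Gold":   {"mirrored": True, "deduplicated": False, "media": "flash"},
    # "Silver": striped and mirrored, satisfiable by an all-flash Virtual SAN
    "Silver": {"striped": True, "mirrored": True, "media": "flash"},
    # "Bronze": deduplicated and compressed, sitting on SATA disk
    "Bronze": {"deduplicated": True, "compressed": True, "media": "sata"},
}
```

At deployment time the only thing attached to the VM is one of these profile names; everything below it is the infrastructure's problem.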

With SPBM, when we deploy a VM, we simply select the appropriate service level.  The storage policy is the primary means of defining storage requirements and determining where a VM should reside.  The location of that residence becomes more or less unimportant: The important piece is that the service associated with the policy can be met, and any storage that can deliver on that policy (that is “compatible” with that profile) can be considered equivalent.


Now behind the scenes is where the Virtual Volume gets very interesting to this model.

Given the above scenario, we may have a new VM that we are deploying with a “Gold” policy storage profile attached to it.

Traditionally, if we wanted to deliver a gold level of service to a new VM we'd need to understand the entire storage stack from top to bottom: we would need to look at all the potential storage targets a given cluster sees, and all the LUNs that back those targets, and understand each capability of each possible target.  If we need to give a VM mirroring, no deduplication, and flash-device backing, we have to evaluate and judge each potential target against those criteria.

SPBM automates that entire process to find just the targets that will satisfy our policy, and places the VM appropriately.  But beyond that, with Virtual Volumes, the VM is placed on that appropriate array as its own standalone entity with those policies defining where it will go and how its data will be instantiated on the array.
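As a rough sketch of that matching step, here is some plain Python standing in for the SPBM engine (the target names and capability keys are invented for illustration): each storage target advertises its capabilities, and a target is considered compatible when it satisfies every requirement in the VM's policy.

```python
# Conceptual sketch of policy-driven placement; not the real SPBM engine or its APIs.
# A target is "compatible" when it can deliver every requirement the policy states.

GOLD_POLICY = {"mirrored": True, "deduplicated": False, "media": "flash"}

# Hypothetical storage targets and the capabilities they advertise.
TARGETS = {
    "vvol-array-container": {"mirrored": True, "deduplicated": False, "media": "flash"},
    "all-flash-vsan":       {"mirrored": True, "striped": True,
                             "deduplicated": True, "media": "flash"},
    "sata-nfs-datastore":   {"deduplicated": True, "compressed": True, "media": "sata"},
}

def is_compatible(policy: dict, capabilities: dict) -> bool:
    """True if the target can deliver every requirement in the policy."""
    return all(capabilities.get(key) == value for key, value in policy.items())

def compatible_targets(policy: dict) -> list[str]:
    """Return only the targets that satisfy the policy."""
    return [name for name, caps in TARGETS.items() if is_compatible(policy, caps)]

print(compatible_targets(GOLD_POLICY))   # ['vvol-array-container']
```

Any target that passes the check can be treated as equivalent, which is exactly the point of the policy-driven model.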

To contrast this again with a traditional LUN/filesystem model, we would previously have had to pre-create every possible LUN to satisfy every possible capability, ending up with dozens or more datastores and filesystems to address every possible need.  Taking just three options into account (mirroring or striping, deduplication, and disk type), we would need to provision a unique LUN for every permutation of capabilities.
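As a back-of-the-envelope illustration (the option values are assumptions, and real arrays expose far more dimensions than this), enumerating just three layout choices, two deduplication settings, and two disk types already yields a dozen combinations:

```python
# Back-of-the-envelope: how many LUNs would be needed to pre-provision every
# combination of just three capability dimensions. The option values are
# assumptions for the example, not any particular array's feature list.
from itertools import product

layouts    = ["mirrored", "striped", "mirrored+striped"]
dedupe     = ["dedupe on", "dedupe off"]
disk_types = ["flash", "sata"]

combinations = list(product(layouts, dedupe, disk_types))
print(len(combinations))            # 12
for combo in combinations:
    print(" / ".join(combo))        # one pre-created LUN per line
```

Every additional dimension multiplies that count again.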


When we deploy a VM we need to determine which of these locations it will go to, and then affix it to that spot.  Have other VMs to deploy with similar requirements? Well, they will all go into the same location.  You may end up with 12 LUNs, one of which is completely full while the others sit nearly empty, because the LUNs had to be sized on a best-guess basis, or with space set aside against the possibility of increased usage.  Autotiering, Storage DRS, and other ingenious methods alleviate the problem, but the model doesn't fundamentally change: the arrays remain unaware of the requirements associated with the VMs running on them.

With VVols you define profiles based on the array’s capabilities, rather than creating LUNs each with its own set of capabilities.  Are there still disk groups and RAID sets and all the rest, behind the scenes in the array?  Certainly, depending on the manufacturer’s choices for building their arrays and the storage administrator’s choices for configuration.  But the consumer of storage will never have to see these constructs and never have to be concerned with data placement.  These features have been abstracted and collected into pooled storage containers that represent the sum of all these capabilities and disks’ capacities. The administrator who is deploying the VM is simply choosing a policy for storage of the VM, and not engaging in discussions about these aspects of storage.  The decision making criteria are radically reduced, leading to simpler and service oriented deployments.


A VVol based VM can then be deployed to a container on the array in accordance with its profile:  Since each VM is its own fully discrete object it can have these capabilities delivered to it on its own, fully driven by policy, without LUNs or filesystems or LUN sharing with other VMs or overprovisioning or any of the other aspects that go into traditional disk-location-based management of new workloads.

The storage policy determines the VM's storage 'features', and the array instantiates the VM as its own standalone entity, in essence as its own dedicated, well… virtual volume!
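To round it out, one last toy sketch (invented classes, not the VASA or vSphere APIs) of that shift in responsibility: the VM carries a policy, and the array creates a dedicated volume that satisfies it, rather than the administrator picking a pre-built LUN.

```python
# Toy sketch of the per-VM model; invented classes, not the VASA/vSphere APIs.
# The VM brings its policy; the array instantiates a standalone volume for it.
from dataclasses import dataclass, field

@dataclass
class VirtualVolume:
    vm_name: str
    policy: dict          # the requirements this volume was created to satisfy

@dataclass
class VVolArray:
    capabilities: dict
    volumes: list = field(default_factory=list)

    def create_volume(self, vm_name: str, policy: dict) -> VirtualVolume:
        """Create a per-VM volume if this array can meet the policy."""
        if not all(self.capabilities.get(k) == v for k, v in policy.items()):
            raise ValueError(f"this array cannot satisfy the policy for {vm_name}")
        vvol = VirtualVolume(vm_name, policy)
        self.volumes.append(vvol)
        return vvol

array = VVolArray(capabilities={"mirrored": True, "deduplicated": False, "media": "flash"})
array.create_volume("app01", {"mirrored": True, "media": "flash"})   # a standalone 'Gold' volume
```

In practice a VVol-backed VM is made up of several such objects (configuration, data, swap), but the principle is the same: each one exists on the array as its own entity, carrying its own policy.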

So, that's the big deal about Virtual Volumes.  They are a great way of implementing storage policies on a per-VM basis.  This is a highly efficient way of provisioning VMs, but more than that, it allows for a new way of delivering VM storage altogether, one that is aligned with storage policy based management.  Doing away with rigid storage silos in turn aligns software defined storage with per-application business requirements, delivering data services to individual applications rather than tying VMs to the properties of the physical infrastructure.

Virtual Volumes are a big deal because they offer a flexibility we just can’t easily get from traditional volumes, and that flexibility enables progress toward the software defined datacenter.