
The Software-Defined Storage Platform of the Future

Next week, at VMworld USA 2015, we will be announcing Virtual SAN 6.1, the third generation of VMware’s hypervisor-converged storage for VMs. With this release we are introducing several major enhancements and new capabilities, including support for the latest generation of storage devices, new availability and disaster recovery features, and advanced management and serviceability tools. These capabilities build on top of already proven, best-of-breed performance, scalability and reliability features to provide enterprise-class storage for all virtualized workloads, including Tier-1 production, mission-critical applications, and use cases with demanding availability requirements.
[Figure 1: Virtual SAN overview]

For more information about the latest Virtual SAN features, keep an eye on the announcements coming out of VMworld next week.

In this article, I want to take a step back and reflect on where we stand with the product today and on our vision for the future. Caveat: the forward-looking statements in this article do not reflect committed VMware products or features.

 

As we have discussed multiple times in the past, the main goal of the Virtual SAN product, as it ships today, is to provide a cost-effective, enterprise-grade storage solution for typical virtualized environments. (The product’s big success so far, with 2,000+ customers running production deployments, is a testament to its strengths.) What do those “typical” environments look like? They are designed around traditional “monolithic” applications. These are single-binary apps that are designed to do certain tasks and are self-contained in their use of libraries and OS features for data access and networking. They are typically designed to run on stand-alone server platforms and, with the exception of a few clustered applications from the likes of Microsoft and Oracle, they are not designed as distributed, fault-tolerant software services.

VMware’s success is owed in large part to vSphere features such as vMotion, DRS, HA, FT and various data protection solutions. These features meet the business continuity needs of IT organizations running traditional applications, which lack such capabilities natively. They are designed around the notion of a manually managed pool of compute resources, the vSphere Cluster, and a shared storage backend is a prerequisite for enabling these features and management workflows. That is exactly what VSAN provides, using a hyperconverged architecture: it aggregates the local storage devices of all the hosts in a cluster and makes them appear as a single shared datastore, accessible by all hosts in the cluster.

VSAN: Storage Infrastructure Management (today)

Given the primary use cases for VSAN today, we made some packaging decisions for the current product. First and foremost, we decided to make the VSAN cluster coincide with the vSphere cluster. This is not an inherent property of the technology, but it makes management and integration with vSphere much easier:

  • The user does not have to configure and manage storage clusters separately from compute clusters and then deal with all the complexity of which host accesses which VSAN datastore. After all, it is exactly this kind of SAN management complexity, such as zoning and fencing, that we are tackling with VSAN.
  • It facilitates seamless integration with important management workflows like upgrades, maintenance mode, HA, and over-provisioning for emergencies. Even basic tasks such as automatically claiming disks have simpler semantics.
  • All existing vSphere APIs and management workflows just work! We just extend existing APIs and add a few new ones for storage purposes.
  • The consistency of compute and storage cluster membership simplifies the monitoring and troubleshooting of one’s infrastructure. It facilitates re-use of existing mechanisms such as vCenter alarms and tasks.

VSAN: Storage Consumption Model (today)

The predominant way for VMs to consume storage today is in the form of virtual SCSI (VSCSI) disks. This is the trick that ensures any legacy application can run on any storage backend without compatibility concerns. In fact, VMware and all other virtualization vendors have intellectual property for efficiently emulating SCSI controllers and devices in their hypervisors.

Again, the VSAN product has been packaged to support VMs and VSCSI disks, but with an important new twist: fine-grained, policy-based storage provisioning and management (SPBM).

Storage Policy-Based Management

With VSAN, every VM, and even every individual VMDK (VSCSI disk), is provisioned with its own individualized QoS properties. The user specifies, in the form of a policy, what they want, and VSAN automatically decides how to distribute each VMDK across the cluster and what resources to assign to meet the user’s requirements: capacity, flash space for caching and performance, number of replicas for availability, number of stripes, and so on.
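To make this concrete, here is a minimal sketch, in Python, of what per-VMDK, policy-driven provisioning might look like. The rule names are modeled on VSAN’s policy capabilities (failures to tolerate, stripe width, flash read-cache reservation, object space reservation), but the data structures and the provisioning function are hypothetical, purely for illustration:

```python
# Hypothetical sketch of per-VMDK, policy-based provisioning.
# The rule names are modeled on VSAN's policy capabilities; the
# surrounding data structures and logic are invented for illustration.

gold_policy = {
    "hostFailuresToTolerate": 1,  # tolerate 1 host failure => 2 replicas
    "stripeWidth": 2,             # stripe each replica across 2 disks
    "cacheReservation": 10,       # reserve flash cache: 10% of VMDK size
    "proportionalCapacity": 0,    # 0% space reservation: thin provisioning
}

def provision_vmdk(name, size_gb, policy):
    """The user states WHAT they need; the platform decides HOW.
    A real system would also pick the specific hosts and disks."""
    replicas = policy["hostFailuresToTolerate"] + 1
    return {
        "vmdk": name,
        "size_gb": size_gb,
        "replicas": replicas,
        "stripes_per_replica": policy["stripeWidth"],
        "flash_cache_gb": size_gb * policy["cacheReservation"] / 100,
    }

print(provision_vmdk("app-db.vmdk", 100, gold_policy))
# {'vmdk': 'app-db.vmdk', 'size_gb': 100, 'replicas': 2,
#  'stripes_per_replica': 2, 'flash_cache_gb': 10.0}
```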

There are many benefits to this approach, which we have talked about extensively in the past. The main point I want to make here is how VSAN implements this fine-grained provisioning. Unlike VMFS and other similar products in the industry, VSAN is not a clustered file system. It is an object-based storage system. A VM consists of a number of objects. Think of an object as a self-contained unit of data plus metadata, which may contain, for example, part or all of a file system, the contents of a VSCSI disk, a swap file, and so on. In that sense, VSAN is roughly similar to RADOS, the object backend of Ceph.
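As a rough illustration of the object model (simplified, and not VSAN’s actual on-disk layout), a single VM decomposes into independent objects, each placed and replicated according to its own policy:

```python
# Simplified, hypothetical view of one VM as a set of VSAN objects:
# a "VM home" namespace object holding metadata files, one object per
# virtual disk, and a swap object. Each is managed independently.

vm_objects = {
    "vm_home": {"kind": "namespace", "contains": ["app01.vmx", "vmware.log"]},
    "disk_0":  {"kind": "vmdk", "size_gb": 40,  "policy": "gold"},
    "disk_1":  {"kind": "vmdk", "size_gb": 500, "policy": "bronze"},
    "swap":    {"kind": "swap", "size_gb": 8},
}

for name, obj in vm_objects.items():
    print(f"{name}: {obj['kind']}")
```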

 

VSAN: A Generic Storage Platform

OK, why is this important? Because VSAN is a generic object-based storage platform. It is not built exclusively for VSCSI disks and today’s virtualization use cases. Instead, the ESXi software modules for VSCSI disks and VM metadata (the VMFS file system) are layered on top of the generic VSAN interface. It is worth noting that this object interface and control plane is what we opened up and turned into the VASA Virtual Volumes specification.

All of this is by design. The plan is to use VSAN to serve more use cases down the road. This is where things get really interesting.

VSAN: The Road Ahead

The IT world is in the midst of a transition driven by software. We are witnessing dramatic changes in the way applications are developed, deployed and managed.

Cloud-Native Applications

For one, we are moving from a model with a large number of smallish applications towards use cases with cloud-scale applications that span hundreds or thousands of nodes and sometimes even geographic locations. New-generation apps (also called Cloud-Native Applications or 3rd-Platform Applications) are structured out of many instances of fine-grained microservices. They are not monolithic. Distribution, scaling and resource control are done at the granularity of microservices. Fault tolerance and availability are often implemented by the application itself.

The type of resource pooling and DRS/HA services built around vSphere clusters is not applicable to these applications. In fact, vSphere clusters are not even relevant as management abstractions anymore. We are talking about a completely different management model, where the physical infrastructure is visible to and managed by the application itself, an approach that fits well with the DevOps model that comes hand-in-hand with this new generation of software.

These new use cases require fundamentally new data abstractions and storage management models.

When I think about these challenges, it helps me organize the problem space along two dimensions: Storage Infrastructure Management and Data Consumption Model.

Let’s look at the requirements of each of these areas in turn:

 

1. Storage Infrastructure Management at Scale

Dealing with storage infrastructures that consist of tens of thousands of servers and hundreds of thousands of storage devices is not science fiction anymore. How do we manage such massive infrastructures in a scalable yet effective way?

The following are the key principles for management at scale:

  • Tools that provide a bird’s-eye view of the infrastructure’s configuration and health, and allow for fast, effective “zoom in” on any problem areas and issues.
  • Use big-data analysis (yes, the same tools that some of the applications running on those infrastructures utilize) to provide ANSWERS to the users, not just piles of data.
  • Support dual interfaces:
    • A single pane-of-glass UI and visualization tools to assist IT personnel with physical infrastructure troubleshooting and remediation.
    • Programmatic interfaces (APIs) for integration with automation code and application logic (the DevOps model).

The architecture of traditional infrastructure management services needs to be rethought drastically to accommodate the new storage requirements. To give you an idea of what I mean, consider the following simple arithmetic:

I assume that any one of us would be willing to dedicate a tenth of a core on each host to running data analytics for infrastructure health monitoring, automated troubleshooting, trending, and so on. That is a very reasonable “tax” to pay for the benefits of an automated service. Yet for a “modest” infrastructure of 2,000 hosts, one would need 200 CPU cores! As a result, storage infrastructure management at scale requires an architecture where data collection and analysis are done in a distributed manner. A centralized management service may remain the central point of control and the interface for data aggregation and presentation.
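The back-of-the-envelope math, spelled out with the numbers assumed above:

```python
# Back-of-the-envelope cost of fully centralized analytics.
hosts = 2_000
analytics_cores_per_host = 0.1     # the "tax" assumed above

total_cores = hosts * analytics_cores_per_host
print(f"{total_cores:.0f} cores")  # 200 cores: far too much for one central
                                   # service, so the work must be distributed
```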

Distributed, scalable storage management architectures are the way of the future. The very same principles of Cloud-Native Applications are at work here. The distributed algorithm I am describing is reminiscent of the MapReduce model, and it can be scaled hierarchically to arbitrarily large infrastructures. Note that the distributed nature of the architecture does not contradict the existence of a central “pane of glass” for visibility into the system’s state. On the contrary, the “Reduce” part of the algorithm yields concise and actionable information for the end user.
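Here is a minimal sketch of that shape, assuming a hypothetical per-host health check: each host “maps” over its own local state and emits a small summary, and a central service “reduces” those summaries into a concise report. None of this is actual VSAN management code:

```python
# Minimal MapReduce-style sketch of distributed health monitoring.
# All function and field names here are hypothetical.

def map_host_health(host):
    """Runs ON each host: analyze local state, emit a tiny summary."""
    failed = [d for d in host["disks"] if d["status"] != "ok"]
    return {"host": host["name"], "failed_disks": len(failed)}

def reduce_cluster_health(summaries):
    """Runs centrally: aggregate small summaries, not raw telemetry."""
    degraded = [s["host"] for s in summaries if s["failed_disks"] > 0]
    return {"hosts": len(summaries), "degraded_hosts": degraded}

hosts = [
    {"name": "esx-01", "disks": [{"status": "ok"}, {"status": "ok"}]},
    {"name": "esx-02", "disks": [{"status": "ok"}, {"status": "failed"}]},
]
print(reduce_cluster_health([map_host_health(h) for h in hosts]))
# {'hosts': 2, 'degraded_hosts': ['esx-02']}
```

Because each “reduce” stage shrinks the data, the same pattern composes hierarchically: hosts roll up to racks, racks to the cluster, without any central node ever seeing the raw telemetry.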

At VMware, we are busy designing a new generation of storage management services and tools that follow these principles. You will see some of those ideas applied for the first time in the new VSAN management tools that will be announced at VMworld next week.

 

2. New Storage Consumption Models

Virtual SCSI emulation served us well for the many years we have been running legacy applications in VMs. With containers, however, whether they run in VMs or natively, one utilizes an OS image that is specially curated for the application’s needs. A vendor like Docker, VMware or CoreOS can use any driver they wish in the OS image, whether a lightweight block driver or a file system driver/client. Developers use the abstractions that make the most sense for their applications and package them together with the application in the container. There is no need for backward compatibility and legacy support.

It is in this context that the generic nature of VSAN comes in very handy. VSAN is being extended as a platform to serve data through abstractions other than just VSCSI disks. And it can do so for traditional VMs or for containerized applications running on vSphere. Look for some very exciting announcements on this topic at VMworld!

VSAN could serve lightweight block drivers (perhaps using the NVMe protocol), native files, or even objects through a REST API. Different abstractions and protocols, all supported from a single platform with a single management experience and a single set of tools: a converged storage platform.
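For instance, consuming the platform through a REST object interface might look something like the sketch below. The endpoint, bucket layout and policy header are entirely invented; this is not an announced VSAN API:

```python
# Hypothetical example of object access over REST. The endpoint and API
# shape are invented for illustration; no actual VSAN API is implied.
import requests

BASE = "https://vsan.example.com/api/v1"  # fictitious endpoint

# PUT an object, tagged with the same kind of storage policy as a VMDK...
requests.put(
    f"{BASE}/buckets/logs/objects/app01.log",
    data=b"2015-08-30 12:00:00 app01 started\n",
    headers={"x-storage-policy": "bronze"},
)

# ...and GET it back from any host in the cluster.
resp = requests.get(f"{BASE}/buckets/logs/objects/app01.log")
print(resp.status_code, len(resp.content))
```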

The importance of VSAN as a converged storage platform is hard to miss:

  • Carries over the full extent of the Hardware Compatibility List (HCL) of the current product (a major certification task that few companies can take on).
  • Offers a unified management model and tools, both for the physical infrastructure as well as for storage provisioning (SPBM).
  • Eliminates storage “islands”, which results in better resource utilization and efficiencies.

All of the above while the user may run any combination of old-style and new-generation applications, VMs or containers, consuming storage through different protocols and abstractions.

VSAN: Storage Platform for the Future

Files are especially important for container image management. The main requirement is scalable, fast creation and deployment of near-identical images. The lack of open-source file systems with robust cloning features has led the community to adopt solutions such as “union” and “overlay” file systems, which have performance and manageability limitations. File system sharing is constrained to a single host, so deploying containers involves shipping individual image copies to every single host, an inefficient and time-consuming process.

 

Well then, what about a distributed file system designed to scale to infrastructures of thousands of hosts? A file system that can provide a practically unlimited number of clones (at file or volume granularity), created in O(1) time and accessible by any container and any VM on each of those hosts. If you think this is too good to be true, you should attend breakout session STO6050 – Virtual SAN: The Software-Defined Platform of the Future at VMworld. Or visit the VMware Office of the CTO Lounge to find out how such a file system can be designed by taking advantage of VSAN’s object architecture and its infrastructure management services.
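To see why O(1) cloning is plausible on an object store, consider a copy-on-write sketch: a clone is just new metadata referencing the parent’s data, so its creation cost is independent of image size. This is a conceptual illustration, not the actual design discussed in the session:

```python
# Conceptual copy-on-write cloning: creating a clone copies no data,
# only metadata, so it is O(1) regardless of image size. Writes later
# diverge one block at a time. This is not VSAN's actual design.

class Image:
    def __init__(self, blocks=None, parent=None):
        self.blocks = blocks or {}  # block number -> data written locally
        self.parent = parent        # shared, read-only base image

    def clone(self):
        return Image(parent=self)   # O(1): just a metadata reference

    def read(self, n):
        if n in self.blocks:
            return self.blocks[n]
        return self.parent.read(n) if self.parent else b"\x00"

    def write(self, n, data):
        self.blocks[n] = data       # copy-on-write: only this block diverges

base = Image(blocks={i: b"base" for i in range(100_000)})
clones = [base.clone() for _ in range(1000)]  # instant, whatever base's size
clones[0].write(7, b"patched")
print(clones[0].read(7), clones[1].read(7))   # b'patched' b'base'
```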

Looking forward to seeing you at VMworld!