KEVIN LEES
Chief Technologist, IT Operations Transformation
VMware recently debuted in the Leaders quadrant in the Gartner Magic Quadrant for Hyper-Converged Infrastructure (HCI). This position reflects Gartner’s recognition of the critical role software-defined technologies play in HCI solutions. As VMware’s software-defined storage solution, VMware vSAN is one of the key components and enablers of our software-defined HCI stack, and it’s seeing rapid adoption by our customers due to the agility, scalability, and cost efficiency it affords. But, as with NSX and our software-defined datacenter solutions generally, the question I always get from customers is “we love vSAN as a technology, but what’s the impact on operations?” They inherently know something needs to be done differently in operations to fully leverage the power of software-defined storage, but what exactly? This is where “operationalizing vSAN” comes into play.
Operationalizing vSAN refers to what happens after you’ve designed and implemented vSAN in your infrastructure. You may think, well isn’t “day 2 operations” what happens after we’ve designed and implemented vSAN? You’re right, day 2 operations do happen once vSAN is deployed, but to really leverage the capabilities provided by vSAN, as well as by the software-defined datacenter and related software-defined solutions generally, you need to think beyond just day 2 operations.
How do you optimize the way you operate vSAN? Are you currently organized in such a way as to fully take advantage of what vSAN provides? What skills are needed? What about your current operational processes? Could you optimize your operational processes to unlock the full benefits of vSAN? What about operational tools for vSAN? Can you leverage existing investments in VMware tools? The answers to these questions are what we mean by operationalizing vSAN.
One of the biggest keys, if not THE key, to success with vSAN lies in the people perspective: organization (i.e., siloes) and skills. In the larger software-defined datacenter context, the ideal is to create a blended team from both a functional (think plan-build-run siloes) and a technical perspective (Figure 1).
If vSAN is being added to an existing software-defined datacenter environment, that means storage capabilities need to be represented in that blended team. If you are just starting out, or if there are other compelling reasons (can you say “political”?) you can’t easily create a blended team, it’s still critical to create much closer collaboration across teams. This is oftentimes easier to solve in a cross-technology sense and has even been used by some customers as an opportunity to unify their technical siloes into a single team. But more often than not, it is ignored when it comes to closer collaboration across plan-build-run, especially as it relates to the run component. Why is this so important? A frequent example we see is: the Plan team chooses vSAN (great choice!), the Build team implements it (so far so good), but then it’s “thrown over the wall” to the operations team, which was not part of the plan and build process. As a result, when they start running vSAN they’re rebooting servers and taking other actions that are generally appropriate in their virtualized environment, but that can wreak havoc in a vSAN storage cluster. As is true with a software-defined datacenter environment generally, you absolutely must involve operations throughout the plan and build phases to truly be successful. This happens by default, of course, if the team is already blended.
The second people-related perspective has to do with skills. As with NSX, the vSAN question is always: should the virtualization team manage vSAN storage clusters, or should the storage team? The right answer is neither; it should be managed by a blended team. But that aside, let’s talk skills. Can a vSphere administrator successfully manage a vSAN storage cluster or should it be a storage administrator? We have found the optimal to be a storage administrator who also knows vSphere, say a storage administrator with a VCP certification at a minimum. While vSAN is simple to manage, it does require core storage knowledge to reduce risk and truly be successful. A common example is the need to understand how deduplication, compression, the RAID type, and the Failures To Tolerate (FTT) setting affect capacity, not to mention how remaining capacity constrains decisions to change storage policies, say from RAID 1 to RAID 5 for all VMs in a vSAN storage cluster. We’ve seen what happens when this has been left to a vSphere administrator without the requisite storage knowledge and it hasn’t been, shall we say, optimal.
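To make the capacity point concrete, here is a minimal sketch of the arithmetic a storage-minded administrator would run before changing a policy. The function and table names are hypothetical, not part of any VMware tool. The multipliers are the usual ratios for mirroring and erasure coding (RAID 1 with FTT=1 keeps two full copies; RAID 5 stores three data segments plus one parity segment). The sketch ignores deduplication, compression, and any slack space you should reserve, and it assumes a conservative worst case in which the old and new layouts coexist until resynchronization finishes.

```python
from fractions import Fraction

# Raw capacity consumed per unit of usable VM data, keyed by (RAID type, FTT).
# These are the standard mirroring / erasure-coding ratios; dedup and
# compression savings are deliberately ignored in this sketch.
OVERHEAD = {
    ("RAID1", 1): Fraction(2),     # two full mirrors
    ("RAID1", 2): Fraction(3),     # three full mirrors
    ("RAID5", 1): Fraction(4, 3),  # 3 data + 1 parity
    ("RAID6", 2): Fraction(3, 2),  # 4 data + 2 parity
}


def raw_needed_gb(usable_gb, raid, ftt):
    """Raw capacity (GB) consumed by usable_gb of VM data under a policy."""
    return float(usable_gb * OVERHEAD[(raid, ftt)])


def transient_raw_gb(usable_gb, old_policy, new_policy):
    """Upper bound on raw capacity while a policy change is in flight.

    Assumption (worst case): every object's old and new layouts exist
    side by side until the change completes.
    """
    return raw_needed_gb(usable_gb, *old_policy) + raw_needed_gb(usable_gb, *new_policy)


if __name__ == "__main__":
    usable = 9000  # GB of VM data, an illustrative figure
    before = raw_needed_gb(usable, "RAID1", 1)
    after = raw_needed_gb(usable, "RAID5", 1)
    peak = transient_raw_gb(usable, ("RAID1", 1), ("RAID5", 1))
    print(f"steady state before: {before:.0f} GB raw")
    print(f"steady state after:  {after:.0f} GB raw")
    print(f"worst-case peak:     {peak:.0f} GB raw")
```

The point of the sketch is that a change which saves capacity in the long run can need more capacity while it is in progress, so remaining free space has to be checked before the policy is applied to every VM in the cluster.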
What about operational processes and capabilities? A few quick examples of vSAN’s impact are workload provisioning, capacity management, and change management. A Storage Policy-Based Management (SPBM) plug-in became available for VMware vRealize Automation 6 and has been steadily expanded with newer versions of vRealize Automation 7, most recently enabling access to all Storage Policy-Based objects via the API in vRealize Automation 7.3. SPBM integration allows associating storage policies with workloads during provisioning, providing both OPEX savings and more functionality for end-user consumers. Capacity management, for example, is simplified through the flexibility provided in adding capacity: either scaling up by adding more storage devices or higher density storage devices to each host in the vSAN cluster, or scaling out by adding hosts to the vSAN cluster. And, while storage policy changes applied to many VMs should be considered a “normal” change, changing the storage policy of a single VM or its virtual disk could be handled as a “standard” change, potentially increasing agility and improving time to value when making a change for a line-of-business.
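The scale-up versus scale-out choice described above comes down to simple arithmetic, which the following sketch illustrates. The class and function names are hypothetical and not tied to any VMware API. It assumes a homogeneous cluster, counts only capacity devices, and, as a simplification, treats a move to higher density devices as replacing every capacity device on every host.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VsanCluster:
    """Hypothetical, simplified model of a homogeneous vSAN cluster."""
    hosts: int
    capacity_devices_per_host: int
    device_tb: float

    def raw_tb(self) -> float:
        """Total raw capacity in TB across all hosts."""
        return self.hosts * self.capacity_devices_per_host * self.device_tb


def scale_up(cluster: VsanCluster,
             extra_devices_per_host: int = 0,
             new_device_tb: Optional[float] = None) -> VsanCluster:
    """Add devices to each host and/or switch to denser devices."""
    return replace(
        cluster,
        capacity_devices_per_host=cluster.capacity_devices_per_host + extra_devices_per_host,
        device_tb=cluster.device_tb if new_device_tb is None else new_device_tb,
    )


def scale_out(cluster: VsanCluster, extra_hosts: int) -> VsanCluster:
    """Add identically configured hosts to the cluster."""
    return replace(cluster, hosts=cluster.hosts + extra_hosts)


if __name__ == "__main__":
    base = VsanCluster(hosts=4, capacity_devices_per_host=4, device_tb=2.0)
    options = {
        "current": base,
        "scale up (+2 devices per host)": scale_up(base, extra_devices_per_host=2),
        "scale up (4 TB devices)": scale_up(base, new_device_tb=4.0),
        "scale out (+2 hosts)": scale_out(base, extra_hosts=2),
    }
    for label, cluster in options.items():
        print(f"{label}: {cluster.raw_tb():.0f} TB raw")
```

Raw capacity is only one input to the decision. Scaling out also adds compute and more fault domains, while scaling up keeps the host count unchanged, so a real capacity plan would weigh those factors alongside the numbers above.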
Finally, vSAN allows you to leverage the investment you’ve already made in VMware tools.
A core set of vSAN monitoring and troubleshooting capabilities is included directly in vCenter Server, such as the vSAN Cluster Health Check shown in Figure 2 and vSAN Cluster capacity monitoring in Figure 3.
You can also leverage existing investment in VMware vRealize Operations by adding the vRealize Operations Management Pack for vSAN (Note: as of vRealize Operations 6.6 you no longer need to install the management pack) as well as VMware vRealize Log Insight by adding the Content Pack for vSAN. The vRealize Operations Management Pack for vSAN employs the vCenter and vSAN APIs to collect data, and the vRealize Operations analytics engine to manipulate and analyze that data in ways not possible using the vSphere tools alone. For example, Figure 4 shows the vSAN Operations Overview dashboard which, in addition to key vSAN storage metrics such as IOPS, throughput, and latency, also provides other metrics that contribute to the health and well-being of the vSAN cluster itself, such as the host count, CPU and memory utilization, and alert volume.
The vRealize Log Insight Content Pack for vSAN provides operational reporting, trending, and alerting visibility for all log data within vSAN. This content pack provides “last mile” troubleshooting to complement the monitoring and troubleshooting capabilities of vRealize Operations.
In closing, successfully leveraging vSAN, as well as the software-defined datacenter more broadly, depends on making operating model changes that affect people, process, and tools. The most important place to focus first is the people perspective, both by creating an environment of deep collaboration and by filling roles with the right skillsets. If you’ve already made process changes to fully leverage a software-defined datacenter environment, you’re well on your way and may only need tweaks to include vSAN. If you’re just starting out, you should review your existing processes for opportunities to optimize, starting with core processes like workload provisioning, capacity management, and change management. Finally, you can easily leverage your existing investment in VMware tools like vRealize Operations and vRealize Log Insight through the addition of the vRealize Operations Management Pack for vSAN and vRealize Log Insight Content Pack for vSAN respectively. With the right people, processes, and tools you’ll be well on your way to success.
Kevin Lees is the field Chief Technologist for IT Operations Transformation at VMware. His focus is on how customers optimize the way they operate VMware-supported environments and solutions. Kevin serves as an advisor to global customer senior executives for their IT operations transformation initiatives. He also leads the IT Transformation activities in VMware’s Global Field Office of the CTO. He is the author of the book Operationalizing VMware NSX, which provides knowledge and guidance for optimizing the operating model of an NSX-based network and security infrastructure. The book addresses not only tactical optimizations such as monitoring and troubleshooting but also more strategic ones, such as team structure and culture, roles, responsibilities, and skillsets, as well as supporting ITSM process considerations. It can be downloaded from vmware.com.