Automated Deployment of Clustered MS SQL Server on Linux in a VMware vSphere Infrastructure

When Microsoft released a Linux-based version of its flagship SQL Server in 2017, it signaled an unmistakable intention and commitment to expand its reach beyond the familiar Windows terrain. Suddenly, Microsoft SQL Server Administrators found themselves needing to grapple with (and learn) new ways of deploying, configuring and administering one of the most complex RDBMS in the Enterprise. Because high availability, resilience and recoverability are common requirements for mission-critical applications (especially Databases), most enterprises tend to cluster their MS SQL Servers in production. Achieving this High Availability for SQL Server on Linux has been quite challenging.

The traditional clustering options for Microsoft SQL Server are built around (and on top of) the Microsoft Windows Server Failover Clustering (WSFC) technologies. However, since MS SQL Server on Linux isn’t a Windows application, Microsoft had to leverage external, third-party utilities and components to satisfy this expectation of resilience for this new version of SQL Server. These components include Pacemaker, Corosync and STONITH. Whereas clustering Microsoft SQL Server nodes in Windows is a comparatively easy task, the same cannot be said of doing so with Microsoft SQL Server on Linux, mainly due to the lack of cohesion and integration among the various components and dependencies.

As Microsoft SQL Server on Linux becomes more accepted in the Enterprise, VMware vSphere Administrators and traditional MS SQL Server Architects have found themselves struggling to successfully and optimally design, configure and deploy Production-ready clusters of Microsoft SQL Server on Linux in their infrastructure. The existing documentation has proved to be too incomplete, disjointed, inconsistent and (often) outdated to allow for consistent and repeatable deployment plans. One of the most common challenges in clustering MS SQL Server on Linux nodes is how to properly and successfully configure the Fencing Agent in a vSphere environment to provide the functionality required by STONITH. Most documentation sidesteps this critically important part of the subject, leaving readers lost in the middle of their deployment project.
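To make the fencing step more concrete, here is a minimal Python sketch of how a STONITH resource for vSphere-hosted nodes might be assembled. It assumes the `fence_vmware_rest` fence agent and the `pcs` command-line tool, which is one common way to fence Pacemaker nodes running on vSphere and not necessarily what any particular deployment uses. The resource name, node names, VM names, vCenter address and credentials are all hypothetical placeholders. The sketch only builds the command; it does not run it.

```python
from typing import Dict, List


def build_host_map(node_to_vm: Dict[str, str]) -> str:
    """Return a pcmk_host_map value such as 'node1:vm1;node2:vm2'.

    Pacemaker uses this map to translate a cluster node name into the
    name the fencing device knows it by (here, the VM name in vCenter).
    """
    return ";".join(f"{node}:{vm}" for node, vm in sorted(node_to_vm.items()))


def build_stonith_command(
    resource_id: str,
    vcenter: str,
    username: str,
    password: str,
    node_to_vm: Dict[str, str],
    verify_tls: bool = True,
) -> List[str]:
    """Assemble (but do not run) a `pcs stonith create` argument list."""
    command = [
        "pcs", "stonith", "create", resource_id, "fence_vmware_rest",
        f"ip={vcenter}",
        f"username={username}",
        f"password={password}",
        "ssl=1",
        f"pcmk_host_map={build_host_map(node_to_vm)}",
    ]
    if not verify_tls:
        # Lab-only shortcut: skips certificate validation against vCenter.
        command.append("ssl_insecure=1")
    return command


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    nodes = {
        "sqlnode1": "SQL-VM-01",
        "sqlnode2": "SQL-VM-02",
        "sqlnode3": "SQL-VM-03",
    }
    cmd = build_stonith_command(
        "vmfence", "vcenter.example.com", "fence-svc@vsphere.local",
        "CHANGE_ME", nodes,
    )
    print(" ".join(cmd))
```

The part that most often goes wrong is the host map: the cluster node name (left of each colon) must match what Pacemaker calls the node, and the value on the right must match the VM's name as vCenter reports it.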

VMware vSphere Administrators (especially) have been requesting more detailed and structured guidance or tools to help their Application Owners successfully deploy clustered MS SQL Servers on Linux on vSphere, and we’re happy to announce the availability of what we believe is a robust response to these requests.

Rather than just provide even more confusing literature around the subject, we took a different approach to solving this problem by providing a fully-orchestrated deployment Package that can be used by anyone to deploy a fully-functional cluster of MS SQL Server nodes running on Ubuntu 20.04.3. Using this Package, even a novice will be able to successfully deploy up to 9 VMs, configured for High Availability using Microsoft’s Always On Availability Groups, with 3 of the Nodes configured for SYNCHRONOUS Replication (NOTE: This depends on the Edition selected during the automated deployment process provided here). We have tried to the best of our abilities to structure and comment the scripts and code in such a way as to encourage inspection, understanding and extension. Feel free to use the Package as-is, or select a snippet for re-use in your own Solution.
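The replica layout described above can be sketched in a few lines of Python. This is an illustration of the idea only, not code taken from the Package: the function name, the node names, and the way the synchronous limit is passed in are assumptions. The first node is treated as the primary, the first `max_sync` nodes (counting the primary) commit synchronously, and any remaining nodes are asynchronous.

```python
from typing import List, NamedTuple

MAX_REPLICAS = 9  # upper bound on VMs quoted in the text above


class Replica(NamedTuple):
    name: str
    role: str               # "PRIMARY" or "SECONDARY"
    availability_mode: str  # "SYNCHRONOUS_COMMIT" or "ASYNCHRONOUS_COMMIT"


def plan_replicas(node_names: List[str], max_sync: int = 3) -> List[Replica]:
    """Assign a role and availability mode to each node, in order."""
    if not 1 <= len(node_names) <= MAX_REPLICAS:
        raise ValueError(f"expected 1 to {MAX_REPLICAS} nodes, got {len(node_names)}")
    if max_sync < 1:
        raise ValueError("max_sync must be at least 1 (the primary itself)")

    plan = []
    for index, name in enumerate(node_names):
        role = "PRIMARY" if index == 0 else "SECONDARY"
        # The first max_sync nodes, primary included, commit synchronously.
        mode = "SYNCHRONOUS_COMMIT" if index < max_sync else "ASYNCHRONOUS_COMMIT"
        plan.append(Replica(name, role, mode))
    return plan


if __name__ == "__main__":
    # Hypothetical node names for a five-node cluster.
    for replica in plan_replicas([f"sqlnode{i}" for i in range(1, 6)]):
        print(replica.name, replica.role, replica.availability_mode)
```

Keeping `max_sync` as a parameter reflects the note above that the number of synchronous replicas depends on the selected Edition; a real deployment would derive it from that choice instead of hard-coding it.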

This Infrastructure-as-Code Package (using Terraform and Ansible) automates all the steps, logic and configuration, and requires only minimal effort from the Administrator/Operator. The entire process takes just a little over 30 minutes, depending on environmental factors. The Source Code, Instructions and Known Issues are available here, and you can download the Automation System here. It is an OVA package which serves as the Control VM containing the bits (CentOS Stream 9, Python 3.9.9, Terraform v1.1.4, Ansible 4.10.0 and ansible-core 2.11.7) required for initiating and completing a deployment project.

This Package and its associated code are released without any guarantee of fitness or implied official support. They are not intended to replace any standard corporate administrative practices, nor are the deployed VMs expected to be used for Production workloads without undergoing the additional rigorous auditing and configuration deemed necessary by the end-user.

We hope that these will serve as a useful starting point and building block for all Microsoft SQL Server and VMware vSphere Administrators who may have been struggling with getting similar Solutions successfully deployed in their environments or have been wondering if it’s “doable on vSphere”.

This is the initial public release of this Solution, and we are aware that it does not satisfy all deployment and configuration scenarios. To help us improve upon this release, we ask that you please provide feedback on your experience. Happy Clustering!