Since the release of vSphere 7 Update 1c (December 17th, 2020), vSphere with Tanzu supports configurable node storage for Tanzu Kubernetes clusters, allowing additional storage volumes to be mounted on the worker nodes. This feature enables Tanzu Kubernetes clusters to run containerized applications whose images require a large amount of disk space during deployment, for example, SQL Server 2019 Big Data Clusters (BDC).
Update 2021/08/17: With the SQL Server BDC CU12 release, BDC on Tanzu is listed as a partner solution validated jointly by VMware, Microsoft, and Dell EMC. For more details, please visit:
- Announcing Microsoft SQL Server Big Data Clusters on VMware Tanzu Kubernetes Grid
- Microsoft SQL Server Big Data Clusters Partner Page
What is SQL Server 2019 Big Data Clusters?
SQL Server Big Data Clusters is Microsoft’s newest data platform that allows you to deploy scalable clusters of SQL Server, Spark, and HDFS containers running on Kubernetes. These components run side by side to enable you to read, write, and process big data from Transact-SQL or Spark, so you can easily combine and analyze your high-value relational data with high-volume big data.
Why SQL Server Big Data Clusters on vSphere with Tanzu?
vSphere with Tanzu brings virtualization and Kubernetes together on the same platform. Customers can continue to rely on traditional virtualized SQL Server applications to run their mission-critical OLTP workloads, while migrating certain SQL Server OLAP jobs onto the same vSphere with Tanzu platform. What’s more, they can extend the data platform to process multiple external data sources through SQL Server BDC, for example, using SQL BDC to process big data on HDFS for machine learning jobs.
How to Deploy SQL Server Big Data Clusters on vSphere with Tanzu?
vSphere with Tanzu makes it easy to deploy and manage SQL Server BDC. The supported virtual machine classes for vSphere with Tanzu are listed here – Virtual Machine Class Types for Tanzu Kubernetes Clusters. You may notice that only the CPU and memory vary across virtual machine classes, while the storage size is fixed at a single 16 GB disk.
Prior to the vSphere 7 U1c release, this disk was too small to store all the container images required for SQL Server BDC. As a result, you would see new pods stuck in an endless loop of eviction and re-creation due to insufficient disk space, as shown below:
```
xmark@xmark-a02 ~ % kubectl get pods -n mssql-cluster
NAME             READY   STATUS              RESTARTS   AGE
appproxy-5sv49   0/2     ContainerCreating   0          6m14s
appproxy-nsd5j   0/2     Evicted             0          16m
compute-0-0      0/3     ContainerCreating   0          11m
control-8p7dt    0/3     Evicted             0          21m
...
```
```
Warning  Evicted  14m  kubelet, sqlbdc-cluster2-workers-sq98v-5c6d94cf8c-vz7hs
The node was low on resource: ephemeral-storage. Container fluentbit was using 945447, which exceeds its request of 0. Container controller was using 1036Ki, which exceeds its request of 0. Container security-support was using 962134, which exceeds its request of 0.
```
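The "exceeds its request of 0" wording in the event above means the BDC containers declare no ephemeral-storage requests, so the kubelet evicts them as soon as the node's small root disk comes under disk pressure. For reference, this is how ephemeral-storage requests and limits are declared in a generic Kubernetes pod spec (a hypothetical sketch; the pod and image names are illustrative, and the BDC-generated specs leave these fields unset):

```yaml
# Hypothetical pod spec fragment showing Kubernetes ephemeral-storage
# resource settings. BDC-generated specs do not set these, which is why
# the eviction message reports a request of 0 for each container.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: example
      image: example-image:latest
      resources:
        requests:
          ephemeral-storage: "2Gi"   # scheduler reserves this much node disk
        limits:
          ephemeral-storage: "4Gi"   # pod is evicted if it exceeds this
```

Setting requests would not help here anyway, since the node's fixed 16 GB disk simply cannot hold the BDC images; the real fix is the configurable node storage described next.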
To configure an additional storage volume on each of the TKG cluster worker nodes and allow SQL Server BDC to deploy successfully, make sure your ESXi servers are running the vSphere 7 U1c release (VMware ESXi, 7.0.1, 17325551) or later.
You may follow the vSphere with Tanzu Quick Start Guide to set up the supervisor cluster and create a TKG cluster that will be used for the SQL Server BDC deployment. Here’s a sample yaml file to create a TKG cluster for SQL Server BDC, which provisions an additional 250 GB disk on each of the TKG worker nodes (and a 50 GB etcd volume on the control plane node).
sqlbdc-tkg.yaml
```yaml
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: sqlbdc-cluster
  namespace: sqlbdc
spec:
  distribution:
    version: v1.18.5
  topology:
    controlPlane:
      class: guaranteed-2xlarge
      count: 1
      storageClass: vsan-default-storage-policy
      volumes:
        - name: etcd
          mountPath: /var/lib/etcd
          capacity:
            storage: 50Gi
    workers:
      class: guaranteed-8xlarge
      count: 3
      storageClass: vsan-default-storage-policy
      volumes:
        - name: containerd
          mountPath: /var/lib/containerd
          capacity:
            storage: 250Gi
```
You may create the TKG cluster from the sample yaml file with the following command:
```
kubectl apply -f sqlbdc-tkg.yaml
```
After that, you should see a TKG cluster created with 1 master node and 3 worker nodes, each worker carrying an additional 250 GB of disk space.
Now you are ready to deploy the SQL Server big data cluster within the TKG cluster. It is recommended to deploy SQL Server BDC with the Azure Data CLI (azdata) tool and the configuration files (bdc.json and control.json), which can be generated by an Azure Data Studio notebook. Run the following command to create the SQL Server BDC cluster using the azdata CLI with the two config files.
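If you want the BDC persistent volumes to land on the same vSAN-backed storage class used by the TKG cluster, the storage section of control.json can be adjusted before running the create command (for example, via azdata's config commands or by editing the file directly). A hedged sketch of that section is shown below; the class name and sizes are example values that should be adapted to your environment:

```json
{
  "spec": {
    "storage": {
      "data": {
        "className": "vsan-default-storage-policy",
        "accessMode": "ReadWriteOnce",
        "size": "15Gi"
      },
      "logs": {
        "className": "vsan-default-storage-policy",
        "accessMode": "ReadWriteOnce",
        "size": "10Gi"
      }
    }
  }
}
```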
```
azdata bdc create --accept-eula yes --config-profile sqlbdc/script
```
Once the cluster is ready, you should see output like this:
```
xmark@xmark-a02 ~ % azdata bdc create --accept-eula yes --config-profile sqlbdc/script
The privacy statement can be viewed at:
https://go.microsoft.com/fwlink/?LinkId=853010

The license terms for SQL Server Big Data Cluster can be viewed at:
Enterprise: https://go.microsoft.com/fwlink/?linkid=2104292
Standard: https://go.microsoft.com/fwlink/?linkid=2104294
Developer: https://go.microsoft.com/fwlink/?linkid=2104079

Cluster deployment documentation can be viewed at:
https://aka.ms/bdc-deploy

Azdata username:admin
Azdata password:
Confirm Azdata password:
NOTE: Cluster creation can take a significant amount of time depending on configuration, network speed, and the number of nodes in the cluster.

Starting cluster deployment.
Cluster controller endpoint is available at 10.40.xxx.xxx:30080.
Cluster control plane is ready.
Data pool is ready.
Storage pool is ready.
Compute pool is ready.
Master pool is ready.
Cluster 'mssql-cluster' deployed successfully.
```
Check the SQL Server BDC pod status with the following command:
```
xmark@xmark-a02 ~ % kubectl get pods -n mssql-cluster
NAME              READY   STATUS    RESTARTS   AGE
appproxy-9j9kg    2/2     Running   0          7m20s
compute-0-0       3/3     Running   0          7m21s
control-gsrw5     3/3     Running   0          10m
controldb-0       2/2     Running   0          10m
controlwd-87rrk   1/1     Running   0          9m48s
data-0-0          3/3     Running   0          7m21s
data-0-1          3/3     Running   0          7m21s
gateway-0         2/2     Running   0          7m17s
logsdb-0          1/1     Running   0          9m45s
logsui-nvpcf      1/1     Running   0          9m47s
master-0          4/4     Running   0          7m11s
master-1          4/4     Running   0          7m11s
master-2          4/4     Running   0          7m11s
metricsdb-0       1/1     Running   0          9m46s
metricsdc-7vm4v   1/1     Running   0          9m45s
metricsdc-hxlrm   1/1     Running   0          9m45s
metricsdc-zm45d   1/1     Running   0          9m45s
metricsui-dpdvl   1/1     Running   0          9m44s
mgmtproxy-cmsdn   2/2     Running   0          9m46s
nmnode-0-0        2/2     Running   0          7m21s
nmnode-0-1        2/2     Running   0          7m21s
operator-hdvb4    1/1     Running   0          7m14s
sparkhead-0       4/4     Running   0          7m3s
sparkhead-1       4/4     Running   0          7m3s
storage-0-0       4/4     Running   0          7m3s
storage-0-1      4/4     Running   0          7m3s
storage-0-2       4/4     Running   0          7m3s
zookeeper-0       2/2     Running   0          7m18s
zookeeper-1       2/2     Running   0          7m18s
zookeeper-2       2/2     Running   0          7m18s
```
Now you have successfully deployed SQL Server 2019 Big Data Clusters on vSphere with Tanzu. For more details about reference architectures for SQL Server on the VMware platform, please visit https://core.vmware.com/business-critical-application-reference-architectures.