Container Service Extension 4.0 has been released with several significant improvements and additional use cases, including Cluster API support, lifecycle management through a user interface, and GPU support for Kubernetes clusters, all using VMware Cloud Director as the infrastructure. With its feature-rich user interface, customers can perform operations such as creating, scaling, and upgrading Tanzu Kubernetes clusters. However, some customers may want to automate these same operations.
This blog post is intended for customers who want to automate the provisioning of Tanzu Kubernetes clusters on the VMware Cloud Director tenant portal using the VMware Cloud Director API. Although the VCD API is supported, TKG clusters on VCD are created and managed through Cluster API, so some work is required to produce the Cluster API-generated payload used in these operations. This post outlines the step-by-step process for generating the correct payload for customers using their VCD infrastructure.
Version Support:
This API guide applies to Tanzu Kubernetes clusters created by CSE 4.0 and CSE 4.0.1.
The existing prerequisites for customers to create TKG clusters in their organizations also apply to the automation flow. These prerequisites are summarized here; the official documentation covers onboarding Provider and Tenant Admin users. The following sections provide an overview of the requirements for both cloud provider administrators and tenant admin users.
Cloud Provider Admin Steps
The steps to onboard customers are demonstrated in this video and documented here. Once the customer organization and its users are onboarded, they can follow the next section to call the APIs directly or consume them to build automated cluster operations.
As a quick summary, the cloud provider is expected to perform the following steps to onboard and prepare the customer:
- Review the Interoperability Matrix to support Container Service Extension 4.0 and 4.0.1
- Allow the necessary communication for the CSE server
- Start the CSE server and onboard the customer organization (see the reference demo and official documentation)
Customer Org Admin Steps
When the cloud provider has onboarded the customer onto the Container Service Extension, the organization administrator must create and assign users with the capability to create and manage TKG clusters for the customer organization. This documentation outlines the procedure for creating a user with the “Kubernetes cluster author” role within the tenant organization.
It is then assumed that the user “acmekco” has obtained the necessary resources and access within the customer organization to execute Kubernetes cluster operations.
Generate ‘capiyaml’ payload
- Collect VCD Infrastructure and Kubernetes Cluster details
This operation requires the following information from the VCD tenant portal. The right column shows the example values used as reference in this blog post.

| Input | Example value for this blog |
| --- | --- |
| VCD_SITE | VCD address (https://vcd-01a.local) |
| VCD_ORGANIZATION | Customer organization name (ACME) |
| VCD_ORGANIZATION_VDC | Customer OVDC name (ACME_VDC_T) |
| VCD_ORGANIZATION_VDC_NETWORK | Network name in the customer org (172.16.2.0) |
| VCD_CATALOG | CSE shared catalog name (cse) |
| VCD_TEMPLATE_NAME | Kubernetes and TKG version of the cluster (Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1) |
| VCD_CONTROL_PLANE_SIZING_POLICY | Sizing policy of control plane VMs (TKG small) |
| VCD_CONTROL_PLANE_STORAGE_PROFILE | Storage profile for the control plane of the cluster (Capacity) |
| VCD_CONTROL_PLANE_PLACEMENT_POLICY | Optional; leave empty if not used |
| VCD_WORKER_SIZING_POLICY | Sizing policy of worker node VMs (TKG small) |
| VCD_WORKER_PLACEMENT_POLICY | Optional; leave empty if not used |
| VCD_WORKER_STORAGE_PROFILE | Storage profile for the worker nodes of the cluster (Capacity) |
| CONTROL_PLANE_MACHINE_COUNT | 1 |
| WORKER_MACHINE_COUNT | 1 |
| VCD_REFRESH_TOKEN_B64 | "MHB1d0tXSllVb2twU2tGRjExNllCNGZnVWZqTm5UZ2U=" (see the VMware documentation to generate a token before transforming it to Base64) |
- Install the tools required to generate the `capiyaml` payload. Any operating system or virtual machine (Linux, Mac, or Windows) can be used to generate the payload.
- Once the tenant user has collected all of the above information, install the following components on the end user's machine: clusterctl 1.1.3, kind 0.17.0, and Docker 20.10.21. The steps that follow require only the information collected above, not access to the VCD infrastructure, to generate the capiyaml payload.
- Copy the TKG CRS files locally. In case the TKG version is missing from the folder, make sure the templates are created for the desired TKG versions. The following table provides the supported list of etcd, CoreDNS, TKG, and TKr versions for the CSE 4.0 and CSE 4.0.1 releases. Alternatively, use this script to fetch the same values from Tanzu Kubernetes Grid resources.
| Kubernetes Version | Etcd ImageTag | CoreDNS ImageTag | Complete Unique Version | OVA | TKG Product Version | TKr Version |
| --- | --- | --- | --- | --- | --- | --- |
| v1.22.9+vmware.1 | v3.5.4_vmware.2 | v1.8.4_vmware.9 | v1.22.9+vmware.1-tkg.1 | ubuntu-2004-kube-v1.22.9+vmware.1-tkg.1-2182cbabee08edf480ee9bc5866d6933.ova | 1.5.4 | v1.22.9---vmware.1-tkg.1 |
| v1.21.11+vmware.1 | v3.4.13_vmware.27 | v1.8.0_vmware.13 | v1.21.11+vmware.1-tkg.2 | ubuntu-2004-kube-v1.21.11+vmware.1-tkg.2-d788dbbb335710c0a0d1a28670057896.ova | 1.5.4 | v1.21.11---vmware.1-tkg.3 |
| v1.20.15+vmware.1 | v3.4.13_vmware.23 | v1.7.0_vmware.15 | v1.20.15+vmware.1-tkg.2 | ubuntu-2004-kube-v1.20.15+vmware.1-tkg.2-839faf7d1fa7fa356be22b72170ce1a8.ova | 1.5.4 | v1.20.15---vmware.1-tkg.2 |
- Create the folder structure `~/infrastructure-vcd/v1.0.0/` in your working directory:

```
mkdir ~/infrastructure-vcd/
cd ~/infrastructure-vcd
mkdir v1.0.0
cd v1.0.0
```
- Copy the contents of the templates directory to `~/infrastructure-vcd/v1.0.0/`.
- Copy `metadata.yaml` to `~/infrastructure-vcd/v1.0.0/`.
- After copying all the files, the folder structure should look as follows:
```
v1.0.0% ls -lrta
total 280
drwxr-xr-x   3 bhatts  staff     96 Jan 30 16:41 ..
drwxr-xr-x   6 bhatts  staff    192 Jan 30 16:42 crs
-rw-r--r--   1 bhatts  staff   9073 Jan 30 16:56 cluster-template-v1.20.8-crs.yaml
-rw-r--r--   1 bhatts  staff   9099 Jan 30 16:56 cluster-template-v1.20.8.yaml
-rw-r--r--   1 bhatts  staff   9085 Jan 30 16:57 cluster-template-v1.21.8-crs.yaml
-rw-r--r--   1 bhatts  staff   9023 Jan 30 16:57 cluster-template-v1.21.8.yaml
-rw-r--r--   1 bhatts  staff   9081 Jan 30 16:57 cluster-template-v1.22.9-crs.yaml
-rw-r--r--   1 bhatts  staff   9019 Jan 30 16:57 cluster-template-v1.22.9.yaml
-rw-r--r--   1 bhatts  staff   9469 Jan 30 16:57 cluster-template.yaml
-rw-r--r--   1 bhatts  staff  45546 Jan 30 16:58 infrastructure-components.yaml
-rw-r--r--   1 bhatts  staff    165 Jan 30 16:58 metadata.yaml
-rw-r--r--   1 bhatts  staff   3355 Jan 30 18:53 clusterctl.yaml
drwxr-xr-x  13 bhatts  staff    416 Jan 30 18:53 .

crs % ls -lrta
total 0
drwxr-xr-x   6 bhatts  staff  192 Jan 30 16:42 .
drwxr-xr-x   4 bhatts  staff  128 Jan 30 16:42 tanzu
drwxr-xr-x   4 bhatts  staff  128 Jan 30 16:51 cni
drwxr-xr-x   4 bhatts  staff  128 Jan 30 16:54 cpi
drwxr-xr-x   6 bhatts  staff  192 Jan 30 16:55 csi
drwxr-xr-x  13 bhatts  staff  416 Jan 30 18:53 ..
```
- Copy `~/infrastructure-vcd/v1.0.0/clusterctl.yaml` to `~/.cluster-api/clusterctl.yaml` (a command sketch for this step follows the YAML snippet below).
- The `clusterctl` command uses `clusterctl.yaml` from `~/.cluster-api/clusterctl.yaml` to create the capiyaml payload. Update it with the infrastructure details collected in the first step of this document.
- Update `providers.url` in `~/.cluster-api/clusterctl.yaml` to `~/infrastructure-vcd/v1.0.0/infrastructure-components.yaml`:
```
providers:
  - name: "vcd"
    url: "~/infrastructure-vcd/v1.0.0/infrastructure-components.yaml"
    type: "InfrastructureProvider"
```
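For reference, the copy step mentioned above can be scripted as follows; this is a minimal sketch assuming the folder layout used in this post:

```
# Create the clusterctl configuration directory and copy the file into place
mkdir -p ~/.cluster-api
cp ~/infrastructure-vcd/v1.0.0/clusterctl.yaml ~/.cluster-api/clusterctl.yaml
```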
At this point, we need a kind cluster in which to initialize clusterctl and generate the payload. Create the kind cluster and initialize clusterctl as follows:

```
cat > kind-cluster-with-extramounts.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraMounts:
      - hostPath: /var/run/docker.sock
        containerPath: /var/run/docker.sock
EOF

# Create a local cluster (shown here on a Mac; the same commands work on your operating system of choice)
kind create cluster --config kind-cluster-with-extramounts.yaml
kubectl cluster-info --context kind-kind
kubectl config set-context kind-kind
kubectl get po -A -owide

clusterctl init --core cluster-api:v1.1.3 -b kubeadm:v1.1.3 -c kubeadm:v1.1.3 -i vcd:v1.0.0
```
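With clusterctl initialized, the capiyaml payload can be generated. The sketch below is illustrative: the cluster name `api5`, namespace `api5-ns`, and flag values are placeholders, and the infrastructure variables are read from `~/.cluster-api/clusterctl.yaml`:

```
# Generate the capiyaml payload (illustrative values; adjust to your target TKG version)
clusterctl generate cluster api5 \
  --kubernetes-version v1.22.9 \
  --control-plane-machine-count 1 \
  --worker-machine-count 1 \
  --target-namespace api5-ns > capiyaml
```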
Update the `Cluster` object in the generated capiyaml with the TKG labels and annotations shown below:

```
# OLD metadata:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    ccm: external
    cni: antrea
    csi: external
  name: api5
  namespace: default

# NEW metadata:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    cluster-role.tkg.tanzu.vmware.com/management: ""
    tanzuKubernetesRelease: v1.21.8---vmware.1-tkg.2
    tkg.tanzu.vmware.com/cluster-name: api5
  annotations:
    osInfo: ubuntu,20.04,amd64
    TKGVERSION: v1.4.3
  name: api5
  namespace: api5-ns
```
- At this point, the capiyaml is ready to be consumed by the VCD APIs to perform various operations. For verification, make sure the cluster name and namespace values are consistent throughout the file. Copy the contents of the capiyaml and convert them to a JSON string using a tool such as the one linked here.
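As an alternative to an online tool, the file can be JSON-escaped locally; a minimal sketch assuming `jq` is installed:

```
# Read the capiyaml file as raw text (-R), slurp it into one string (-s), and emit it JSON-encoded
jq -Rs '.' capiyaml
```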
Cluster Operations
The following sections describe all supported API operations for Tanzu Kubernetes clusters on VMware Cloud Director:
List Clusters
List all clusters in the customer organization. For the CSE 4.0 release, the capvcdCluster RDE version is 1.

```
GET https://{{vcd}}/cloudapi/1.0.0/entities/types/vmware/capvcdCluster/1
```
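For example, the call can be scripted as follows; a sketch assuming `VCD` holds the VCD address and `TOKEN` a valid bearer token:

```
# List all capvcdCluster entities visible to the user
curl -sk \
  -H "Accept: application/json;version=37.0" \
  -H "Authorization: Bearer $TOKEN" \
  "https://$VCD/cloudapi/1.0.0/entities/types/vmware/capvcdCluster/1"
```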
Info Cluster
Filter clusters by name:

```
GET https://{{vcd}}/cloudapi/1.0.0/entities/types/vmware/capvcdCluster/1?filter=name==clustername
```
Get cluster by ID:
```
GET https://{{vcd}}/cloudapi/1.0.0/entities/{cluster-id}
```
Get Kubeconfig of the cluster:
```
GET https://{{vcd}}/cloudapi/1.0.0/entities/{cluster-id}
```

In the response, the kubeconfig can be found at `entity.status.capvcd.private.kubeconfig`.
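For example, the kubeconfig can be pulled straight out of the response; a sketch assuming `jq` is installed and `CLUSTER_ID` holds the cluster URN from the previous call:

```
# Fetch the cluster entity and extract the kubeconfig field
curl -sk \
  -H "Accept: application/json;version=37.0" \
  -H "Authorization: Bearer $TOKEN" \
  "https://$VCD/cloudapi/1.0.0/entities/$CLUSTER_ID" \
  | jq -r '.entity.status.capvcd.private.kubeconfig' > kubeconfig
```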
Create a new Cluster
```
POST https://{{vcd}}/cloudapi/1.0.0/entityTypes/urn:vcloud:type:vmware:capvcdCluster:1.1.0

{
  "entityType": "urn:vcloud:type:vmware:capvcdCluster:1.1.0",
  "name": "demo",
  "externalId": null,
  "entity": {
    "kind": "CAPVCDCluster",
    "spec": {
      "vcdKe": {
        "isVCDKECluster": true,
        "markForDelete": false,
        "forceDelete": false,
        "autoRepairOnErrors": true
      },
      "capiYaml": "apiVersion: cluster.x-k8s.io/v1beta1\nkind: Cluster\nmetadata:\n labels:\n cluster-role.tkg.tanzu.vmware.com/management: \"\"\n tanzuKubernetesRelease: v1.22.9---vmware.1-tkg.2\n tkg.tanzu.vmware.com/cluster-name: api4\n name: api4\n namespace: api4-ns\n annotations:\n osInfo: ubuntu,20.04,amd64\n TKGVERSION: v1.5.4\nspec:\n clusterNetwork:\n pods:\n cidrBlocks:\n - 100.96.0.0/11\n serviceDomain: cluster.local\n services:\n cidrBlocks:\n - 100.64.0.0/13\n controlPlaneRef:\n apiVersion: controlplane.cluster.x-k8s.io/v1beta1\n kind: KubeadmControlPlane\n name: api4-control-plane\n namespace: api4-ns\n infrastructureRef:\n apiVersion: infrastructure.cluster.x-k8s.io/v1beta1\n kind: VCDCluster\n name: api4\n namespace: api4-ns\n---\napiVersion: v1\ndata:\n password: \"\"\n refreshToken: WU4zdWY3b21FM1k1SFBXVVp6SERTZXZvREFSUXQzTlE=\n username: dG9ueQ==\nkind: Secret\nmetadata:\n name: capi-user-credentials\n namespace: api4-ns\ntype: Opaque\n---\napiVersion: infrastructure.cluster.x-k8s.io/v1beta1\nkind: VCDCluster\nmetadata:\n name: api4\n namespace: api4-ns\nspec:\n loadBalancerConfigSpec:\n vipSubnet: \"\"\n org: stark\n ovdc: vmware-cloud\n ovdcNetwork: private-snat\n site: https://vcd.tanzu.lab\n useAsManagementCluster: false\n userContext:\n secretRef:\n name: capi-user-credentials\n namespace: api4-ns\n---\napiVersion: infrastructure.cluster.x-k8s.io/v1beta1\nkind: VCDMachineTemplate\nmetadata:\n name: api4-control-plane\n namespace: api4-ns\nspec:\n template:\n spec:\n catalog: CSE-Templates\n diskSize: 20Gi\n enableNvidiaGPU: false\n placementPolicy: null\n sizingPolicy: TKG small\n storageProfile: lab-shared-storage\n template: Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1\n---\napiVersion: controlplane.cluster.x-k8s.io/v1beta1\nkind: KubeadmControlPlane\nmetadata:\n name: api4-control-plane\n namespace: api4-ns\nspec:\n kubeadmConfigSpec:\n clusterConfiguration:\n apiServer:\n certSANs:\n - localhost\n - 127.0.0.1\n controllerManager:\n extraArgs:\n enable-hostpath-provisioner: \"true\"\n dns:\n imageRepository: projects.registry.vmware.com/tkg\n imageTag: v1.8.4_vmware.9\n etcd:\n local:\n imageRepository: projects.registry.vmware.com/tkg\n imageTag: v3.5.4_vmware.2\n imageRepository: projects.registry.vmware.com/tkg\n initConfiguration:\n nodeRegistration:\n criSocket: /run/containerd/containerd.sock\n kubeletExtraArgs:\n cloud-provider: external\n eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%\n joinConfiguration:\n nodeRegistration:\n criSocket: /run/containerd/containerd.sock\n kubeletExtraArgs:\n cloud-provider: external\n eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%\n users:\n - name: root\n sshAuthorizedKeys:\n - \"\"\n machineTemplate:\n infrastructureRef:\n apiVersion: infrastructure.cluster.x-k8s.io/v1beta1\n kind: VCDMachineTemplate\n name: api4-control-plane\n namespace: api4-ns\n replicas: 1\n version: v1.22.9+vmware.1\n---\napiVersion: infrastructure.cluster.x-k8s.io/v1beta1\nkind: VCDMachineTemplate\nmetadata:\n name: api4-md-0\n namespace: api4-ns\nspec:\n template:\n spec:\n catalog: CSE-Templates\n diskSize: 20Gi\n enableNvidiaGPU: false\n placementPolicy: null\n sizingPolicy: TKG small\n storageProfile: lab-shared-storage\n template: Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1\n---\napiVersion: bootstrap.cluster.x-k8s.io/v1beta1\nkind: KubeadmConfigTemplate\nmetadata:\n name: api4-md-0\n namespace: api4-ns\nspec:\n template:\n spec:\n joinConfiguration:\n nodeRegistration:\n criSocket: /run/containerd/containerd.sock\n kubeletExtraArgs:\n cloud-provider: external\n eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%\n users:\n - name: root\n sshAuthorizedKeys:\n - \"\"\n---\napiVersion: cluster.x-k8s.io/v1beta1\nkind: MachineDeployment\nmetadata:\n name: api4-md-0\n namespace: api4-ns\nspec:\n clusterName: api4\n replicas: 1\n selector:\n matchLabels: null\n template:\n spec:\n bootstrap:\n configRef:\n apiVersion: bootstrap.cluster.x-k8s.io/v1beta1\n kind: KubeadmConfigTemplate\n name: api4-md-0\n namespace: api4-ns\n clusterName: api4\n infrastructureRef:\n apiVersion: infrastructure.cluster.x-k8s.io/v1beta1\n kind: VCDMachineTemplate\n name: api4-md-0\n namespace: api4-ns\n version: v1.22.9+vmware.1\n"
    },
    "apiVersion": "capvcd.vmware.com/v1.1"
  }
}
```
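To script the creation call, the request body above can be saved to a file and posted; a sketch assuming the body is in `payload.json`:

```
# Create the cluster by POSTing the CAPVCD entity payload
curl -sk -X POST \
  -H "Accept: application/json;version=37.0" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data @payload.json \
  "https://$VCD/cloudapi/1.0.0/entityTypes/urn:vcloud:type:vmware:capvcdCluster:1.1.0"
```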
Resize a Cluster
Collect the GET API response for the cluster to be resized:

```
GET https://{{vcd}}/cloudapi/1.0.0/entities/types/vmware/capvcdCluster/1?filter=name==clustername
```
- Fetch the cluster ID (`"id": "urn:vcloud:entity:vmware:capvcdCluster:<ID>"`) from the above API call's output.
- Copy the complete output of the API response.
- Note down the eTag value from the API response header.
- Modify the capiYaml with the following values:
  - To resize the control plane VMs, modify `kubeadmcontrolplane.spec.replicas` with the desired number of control plane VMs. Note that only odd numbers of control plane nodes are supported.
  - To resize the worker VMs, modify `MachineDeployment.spec.replicas` with the desired number of worker VMs.
- When performing the `PUT` API call, ensure the fetched eTag value is included as the `If-Match` header:
```
PUT https://{{vcd}}/cloudapi/1.0.0/entities/{cluster-id from the GET API response}

Headers:
  Accept: application/json;version=37.0
  Authorization: Bearer {token}
  If-Match: {eTag value from previous GET call}

Body: Copy the entire body from the previous GET call and modify the capiYaml values as described in the modify step above.
```
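Scripted, the eTag capture and the PUT look roughly like this; a sketch assuming the modified body is saved in `resized-cluster.json`:

```
# Capture the eTag response header from the GET call
ETAG=$(curl -sk -D - -o /dev/null \
  -H "Accept: application/json;version=37.0" \
  -H "Authorization: Bearer $TOKEN" \
  "https://$VCD/cloudapi/1.0.0/entities/$CLUSTER_ID" \
  | awk -F': ' 'tolower($1)=="etag" {print $2}' | tr -d '\r')

# Send the updated entity back with If-Match set to the captured eTag
curl -sk -X PUT \
  -H "Accept: application/json;version=37.0" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "If-Match: $ETAG" \
  --data @resized-cluster.json \
  "https://$VCD/cloudapi/1.0.0/entities/$CLUSTER_ID"
```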
Upgrade a Cluster
To upgrade a cluster, the provider admin needs to publish the desired Tanzu Kubernetes templates to the customer organization in the catalog used by Container Service Extension.
Collect the GET API response for the cluster to be upgraded:

```
GET https://{{vcd}}/cloudapi/1.0.0/entities/types/vmware/capvcdCluster/1?filter=name==clustername
```
- Fetch the cluster ID (`"id": "urn:vcloud:entity:vmware:capvcdCluster:<ID>"`) from the above API call's output.
- Copy the complete output of the API response.
- Note down the eTag value from the API response header.
- The customer user performing the cluster upgrade will require access to the version information in Table 3. Modify the following values to match the target TKG version; a scripted sketch of this edit follows the PUT call below. The following table shows an upgrade within TKG 1.5.4 from v1.20.15+vmware.1 to v1.22.9+vmware.1.
| Field | Old Value | New Value |
| --- | --- | --- |
| **Control Plane Version** | | |
| VCDMachineTemplate.spec.template.spec.template | Ubuntu 20.04 and Kubernetes v1.20.15+vmware.1 | Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1 |
| KubeadmControlPlane.spec.version | v1.20.15+vmware.1 | v1.22.9+vmware.1 |
| KubeadmControlPlane.spec.kubeadmConfigSpec.dns | imageTag: v1.7.0_vmware.15 | imageTag: v1.8.4_vmware.9 |
| KubeadmControlPlane.spec.kubeadmConfigSpec.etcd | imageTag: v3.4.13_vmware.23 | imageTag: v3.5.4_vmware.2 |
| KubeadmControlPlane.spec.kubeadmConfigSpec.imageRepository | projects.registry.vmware.com/tkg | projects.registry.vmware.com/tkg (unchanged) |
| **Worker Node Version** | | |
| VCDMachineTemplate.spec.template.spec.template | Ubuntu 20.04 and Kubernetes v1.20.15+vmware.1 | Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1 |
| MachineDeployment.spec.version | v1.20.15+vmware.1 | v1.22.9+vmware.1 |
- When performing the `PUT` API call, ensure the fetched eTag value is included as the `If-Match` header:

```
PUT https://{{vcd}}/cloudapi/1.0.0/entities/{cluster-id from the GET API response}

Headers:
  Accept: application/json;version=37.0
  Authorization: Bearer {token}
  If-Match: {eTag value from previous GET call}

Body: Copy the entire body from the previous GET call and modify the capiYaml values as described in the step above.
```
Delete a Cluster
Collect the GET API response for the cluster to be deleted:

```
GET https://{{vcd}}/cloudapi/1.0.0/entities/types/vmware/capvcdCluster/1?filter=name==clustername
```
- Fetch the cluster ID (`"id": "urn:vcloud:entity:vmware:capvcdCluster:<ID>"`) from the above API call's output.
- Copy the complete output of the API response.
- Note down the eTag value from the API response header.
- Add or modify the following fields under `entity.spec.vcdKe` to delete or forcefully delete the cluster:
  - `"markForDelete": true` (set this value to true to delete the cluster)
  - `"forceDelete": true` (set this value to true to forcefully delete the cluster)
```
PUT https://{{vcd}}/cloudapi/1.0.0/entities/{cluster-id from the GET API response}

{
  "entityType": "urn:vcloud:type:vmware:capvcdCluster:1.1.0",
  "name": "demo",
  "externalId": null,
  "entity": {
    "kind": "CAPVCDCluster",
    "spec": {
      "vcdKe": {
        "isVCDKECluster": true,
        "markForDelete": true,    <-- set this value to true to delete the cluster
        "forceDelete": false,     <-- set this value to true to forcefully delete the cluster
        "autoRepairOnErrors": true
      },
      "capiYaml": "<your capiYaml payload generated from Step 5>"
    },
    .
    .
    # Other payload from the GET API response
    .
    .
    "org": {
      "name": "acme",
      "id": "urn:vcloud:org:cd11f6fd-67ba-40e5-853f-c17861120184"
    }
  }
}
```
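Rather than hand-editing the JSON, the flags can also be toggled programmatically; a sketch assuming the GET response is saved in `cluster.json`:

```
# Mark the cluster for deletion (set .forceDelete = true as well for a forceful delete)
jq '.entity.spec.vcdKe.markForDelete = true' cluster.json > delete-payload.json
```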
Recommendations for API Usage During Automation
- DO NOT hardcode API URLs with RDE versions; ALWAYS parameterize RDE versions. For example, in `POST https://{{vcd}}/cloudapi/1.0.0/entityTypes/urn:vcloud:type:vmware:capvcdCluster:1.1.0`, declare `1.1.0` as a variable. This ensures easy API client upgrades to future versions of CSE.
- Ensure the API client code ignores any unknown/additional properties while unmarshaling the API response
```
# For example, the capvcdCluster 1.1.0 API payload looks like this:
{
  status: {
    kubernetesVersion: 1.20.8,
    nodePools: {}
  }
}

# In the future, the next version of capvcdCluster (1.2.0) may add more properties ("add-ons") to the payload.
# The old API client code must ensure it does not break on seeing newer properties in future payloads.
{
  status: {
    kubernetesVersion: 1.20.8,
    nodePools: {},
    add-ons: {}   // new property in a future version
  }
}
```
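One simple way to stay tolerant of new properties in shell-based automation is to select only the fields the client needs; a sketch assuming `jq` and the example payload above saved as `payload.json`:

```
# Project only the known fields; any properties added in future versions are simply ignored
jq '{kubernetesVersion: .status.kubernetesVersion, nodePools: .status.nodePools}' payload.json
```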
Summary
To summarize, we looked at CRUD operations for Tanzu Kubernetes clusters on the VMware Cloud Director platform using supported VMware Cloud Director APIs. Please feel free to check out the following additional resources for Container Service Extension: