
API Guide for Tanzu Kubernetes Clusters for VMware Cloud Director

The Container Service Extension 4.0 has been released with several significant improvements and additional use cases, including Cluster API, lifecycle management through a user interface, GPU support for Kubernetes clusters, and integration with VMware Cloud Director as infrastructure. With its feature-rich user interface, customers can perform operations such as creation, scaling, and upgrading on Tanzu Kubernetes clusters. However, some customers may seek automation support for these same operations.

This blog post is intended for customers who want to automate the provisioning of Tanzu Kubernetes clusters from the VMware Cloud Director tenant portal using the VMware Cloud Director API. The VCD API is supported, but because Cluster API is used to create and manage TKG clusters on VCD, the payload for each cluster operation must embed a Cluster API-generated definition, and producing it takes some work. This post walks through generating the correct payload, step by step, for customers using their own VCD infrastructure.

Version Support:

This API guide applies to Tanzu Kubernetes clusters created by CSE 4.0 and CSE 4.0.1.

The existing prerequisites for customers to create TKG clusters in their organizations also apply to the automation flow. These prerequisites are summarized here and can be found in the official documentation to onboard Provider and Tenant Admin users. The following sections provide an overview of the requirements for both cloud provider administrators and Tenant Admin users.

Cloud Provider Admin Steps

The steps to onboard customers are demonstrated in this video and documented here. Once the customer organization and its users are onboarded, they can follow the next section to call the APIs directly or to build automated cluster operations on top of them.

In summary, the cloud provider is expected to perform the following steps to onboard and prepare the customer:

  1. Review Interoperability Matrix to support Container Service Extension 4.0 and 4.0.1
  2. Allow necessary communication for CSE server
  3. Start CSE server and Onboard customer organization (Reference Demo and Official Documentation)

Customer Org Admin Steps

When the cloud provider has onboarded the customer onto the Container Service Extension, the organization administrator must create and assign users with the capability to create and manage TKG clusters for the customer organization. This documentation outlines the procedure for creating a user with the “Kubernetes cluster author” role within the tenant organization.

It is then assumed that the user “acmekco” has obtained the necessary resources and access within the customer organization to execute Kubernetes cluster operations.

Generate ‘capiyaml’ payload

  • Collect VCD Infrastructure and Kubernetes Cluster details

This operation requires the following information from the VCD tenant portal. The right-hand column gives the example values used as reference throughout this blog post.

Input | Example value for this blog
VCD_SITE | VCD address (https://vcd-01a.local)
VCD_ORGANIZATION | Customer organization name (ACME)
VCD_ORGANIZATION_VDC | Customer OVDC name (ACME_VDC_T)
VCD_ORGANIZATION_VDC_NETWORK | Network name in customer org (172.16.2.0)
VCD_CATALOG | CSE shared catalog name (cse)
Table 1 – Infrastructure details

Input | Example value for this blog
VCD_TEMPLATE_NAME | Kubernetes and TKG version of the cluster (Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1)
VCD_CONTROL_PLANE_SIZING_POLICY | Sizing policy of control plane VMs (TKG small)
VCD_CONTROL_PLANE_STORAGE_PROFILE | Storage profile for the control plane of the cluster (Capacity)
VCD_CONTROL_PLANE_PLACEMENT_POLICY | Optional – leave empty if not using
VCD_WORKER_SIZING_POLICY | Sizing policy of worker node VMs (TKG small)
VCD_WORKER_PLACEMENT_POLICY | Optional – leave empty if not using
VCD_WORKER_STORAGE_PROFILE | Storage profile for the worker nodes of the cluster (Capacity)
CONTROL_PLANE_MACHINE_COUNT | 1
WORKER_MACHINE_COUNT | 1
VCD_REFRESH_TOKEN_B64 | "MHB1d0tXSllVb2twU2tGRjExNllCNGZnVWZqTm5UZ2U=" (refer to the VMware documentation to generate the token before encoding it to Base64)
Table 2 – Kubernetes cluster properties
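The Base64 step for VCD_REFRESH_TOKEN_B64 can be done with any tool. As a minimal sketch, assuming the API token has already been generated in VCD, the helper below (a hypothetical name, not part of any VMware tooling) encodes it with the Python standard library:

```python
import base64


def encode_refresh_token(token: str) -> str:
    """Return the Base64 form of a VCD API token for VCD_REFRESH_TOKEN_B64."""
    # Strip stray whitespace or newlines picked up when copying the token.
    return base64.b64encode(token.strip().encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Placeholder value; substitute the API token generated in VCD.
    print(encode_refresh_token("my-refresh-token"))
```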
  • Install the tools required to generate the capiyaml. Any operating system or virtual machine (Linux, Mac or Windows) can be used to generate the payload.
  • Once the tenant user has collected all the information, install clusterctl 1.1.3, kind 0.17.0, and Docker 20.10.21 on the end user's machine. Generating the capiyaml payload requires the information collected above, but not access to the VCD infrastructure.
  • Copy the TKG CRS files locally. If the desired TKG version is missing from the folder, make sure templates have been created for it. The following table lists the supported etcd, CoreDNS, TKG, and TKr versions for the CSE 4.0 and CSE 4.0.1 releases. Alternatively, use this script to fetch the same values from Tanzu Kubernetes Grid resources.
Kubernetes Version | Etcd ImageTag | CoreDNS ImageTag | Complete Unique Version | OVA | TKG Product Version | TKr version
v1.22.9+vmware.1 | v3.5.4_vmware.2 | v1.8.4_vmware.9 | v1.22.9+vmware.1-tkg.1 | ubuntu-2004-kube-v1.22.9+vmware.1-tkg.1-2182cbabee08edf480ee9bc5866d6933.ova | 1.5.4 | v1.22.9---vmware.1-tkg.1
v1.21.11+vmware.1 | v3.4.13_vmware.27 | v1.8.0_vmware.13 | v1.21.11+vmware.1-tkg.2 | ubuntu-2004-kube-v1.21.11+vmware.1-tkg.2-d788dbbb335710c0a0d1a28670057896.ova | 1.5.4 | v1.21.11---vmware.1-tkg.3
v1.20.15+vmware.1 | v3.4.13_vmware.23 | v1.7.0_vmware.15 | v1.20.15+vmware.1-tkg.2 | ubuntu-2004-kube-v1.20.15+vmware.1-tkg.2-839faf7d1fa7fa356be22b72170ce1a8.ova | 1.5.4 | v1.20.15---vmware.1-tkg.2
Table 3 – Kubernetes, etcd, and CoreDNS versions for the relevant Tanzu Kubernetes versions for CSE 4.0, 4.0.1
  • Copy ~/infrastructure-vcd/v1.0.0/clusterctl.yaml to ~/.cluster-api/clusterctl.yaml.
  • The clusterctl command reads ~/.cluster-api/clusterctl.yaml to create the capiyaml payload. Update this file with the infrastructure and cluster details collected in the first step of this document.
  • Update providers.url in ~/.cluster-api/clusterctl.yaml to ~/infrastructure-vcd/v1.0.0/infrastructure-components.yaml.
  • At this point, ~/.cluster-api/clusterctl.yaml should contain the values from Table 1 and Table 2 together with the updated providers.url.
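As a rough illustration of the resulting file, the following Python sketch renders a clusterctl.yaml from the example values in Tables 1 and 2. The provider name "vcd", the "InfrastructureProvider" type, and the layout of the providers block are assumptions based on clusterctl's standard configuration format; compare the output against the clusterctl.yaml copied from the infrastructure-vcd folder before relying on it.

```python
import json

# Example values from Tables 1 and 2 of this post; replace with your own.
EXAMPLE_VALUES = {
    "VCD_SITE": "https://vcd-01a.local",
    "VCD_ORGANIZATION": "ACME",
    "VCD_ORGANIZATION_VDC": "ACME_VDC_T",
    "VCD_ORGANIZATION_VDC_NETWORK": "172.16.2.0",
    "VCD_CATALOG": "cse",
    "VCD_TEMPLATE_NAME": "Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1",
    "VCD_CONTROL_PLANE_SIZING_POLICY": "TKG small",
    "VCD_CONTROL_PLANE_STORAGE_PROFILE": "Capacity",
    "VCD_CONTROL_PLANE_PLACEMENT_POLICY": "",
    "VCD_WORKER_SIZING_POLICY": "TKG small",
    "VCD_WORKER_PLACEMENT_POLICY": "",
    "VCD_WORKER_STORAGE_PROFILE": "Capacity",
    "CONTROL_PLANE_MACHINE_COUNT": "1",
    "WORKER_MACHINE_COUNT": "1",
    "VCD_REFRESH_TOKEN_B64": "<base64-encoded API token>",
}


def render_clusterctl_yaml(values: dict, components_path: str) -> str:
    """Render provider settings plus variables as clusterctl.yaml text."""
    lines = [
        "providers:",
        '  - name: "vcd"  # assumed provider name',
        f"    url: {json.dumps(components_path)}",
        '    type: "InfrastructureProvider"',
    ]
    # JSON string quoting is also valid YAML double-quoted scalar syntax.
    lines += [f"{key}: {json.dumps(str(value))}" for key, value in values.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_clusterctl_yaml(
        EXAMPLE_VALUES,
        "~/infrastructure-vcd/v1.0.0/infrastructure-components.yaml",
    ))
```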

Next, a kind cluster is needed, because clusterctl must be initialized against a cluster before it can generate the payload. In this step, create a kind cluster, initialize clusterctl on it, and generate the capiyaml payload.
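The sequence can be sketched as three commands. The Python snippet below only builds and prints them rather than executing anything. The bootstrap cluster name "capi-bootstrap", the cluster name "demo", the namespace "demo-ns", and the provider version v1.0.0 (inferred from the infrastructure-vcd/v1.0.0 folder name) are assumptions for illustration; the Cluster API version follows the clusterctl 1.1.3 requirement stated above.

```python
import shlex


def bootstrap_commands(cluster_name, k8s_version, namespace,
                       capi_version="v1.1.3", capvcd_version="v1.0.0"):
    """Return the kind and clusterctl command lines as argument lists."""
    return [
        # 1. Local kind cluster that clusterctl is initialized against.
        ["kind", "create", "cluster", "--name", "capi-bootstrap"],
        # 2. Install the core, kubeadm and VCD providers into that cluster.
        ["clusterctl", "init",
         "--core", f"cluster-api:{capi_version}",
         "--bootstrap", f"kubeadm:{capi_version}",
         "--control-plane", f"kubeadm:{capi_version}",
         "--infrastructure", f"vcd:{capvcd_version}"],
        # 3. Render the capiyaml; redirect stdout to a file when running it.
        ["clusterctl", "generate", "cluster", cluster_name,
         "--kubernetes-version", k8s_version,
         "--target-namespace", namespace],
    ]


if __name__ == "__main__":
    for cmd in bootstrap_commands("demo", "v1.22.9+vmware.1", "demo-ns"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running each one by hand or through subprocess.run.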

Add the TKG labels and annotations to the "Kind: Cluster" object in the generated capiyaml.

  • At this point, the capiyaml is ready to be consumed by the VCD APIs to perform various operations. To verify it, make sure the cluster name and namespace values are consistent throughout. Then convert the content of the capiyaml to a JSON string, using a tool similar to the one linked here.
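If an online converter is not an option, the same conversion can be done locally. A minimal sketch using the Python standard library, where the function name is hypothetical: json.dumps escapes quotes and newlines so that the whole YAML document becomes a single JSON string value.

```python
import json


def capiyaml_to_json_string(capi_yaml: str) -> str:
    """Escape a multi-line capiyaml document into one JSON string literal."""
    return json.dumps(capi_yaml)


if __name__ == "__main__":
    # Tiny stand-in for a real capiyaml file.
    sample = 'kind: Cluster\nmetadata:\n  name: "demo"\n'
    print(capiyaml_to_json_string(sample))
```

In practice, read the generated file with open(path).read() and pass its content to the function.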

Cluster Operations

The following section describes all supported API operations for Tanzu Kubernetes clusters on VMware Cloud Director:

List Clusters

List all clusters in the customer organization. For the CSE 4.0 release, the CAPVCD version is 1.
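A hedged sketch of the list call in Python follows. The path /cloudapi/1.0.0/entities/types/vmware/capvcdCluster/<version> is an assumption based on VCD's defined-entity API, as is the API version 37.0 in the Accept header; the bearer access token is assumed to have been obtained separately. The function only builds the request; pass it to urllib.request.urlopen to send it.

```python
import urllib.request


def list_clusters_request(vcd_site, access_token,
                          rde_major_version="1", api_version="37.0"):
    """Build (but do not send) a GET request listing capvcdCluster entities."""
    url = (f"{vcd_site.rstrip('/')}/cloudapi/1.0.0/entities/types/"
           f"vmware/capvcdCluster/{rde_major_version}")  # assumed path
    return urllib.request.Request(url, method="GET", headers={
        "Accept": f"application/json;version={api_version}",  # assumed version
        "Authorization": f"Bearer {access_token}",
    })


if __name__ == "__main__":
    req = list_clusters_request("https://vcd-01a.local", "<access-token>")
    print(req.get_method(), req.full_url)
```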

Info Cluster

Filter Cluster by name

Get cluster by ID:
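The two lookups above can be sketched as URL builders. Both paths are assumptions based on VCD's defined-entity API: the name lookup adds a filter query parameter to the list endpoint, and the ID lookup addresses the entity directly by its URN. The cluster name and ID used below are placeholders.

```python
from urllib.parse import quote, urlencode


def filter_by_name_url(vcd_site, cluster_name, rde_major_version="1"):
    """URL that lists capvcdCluster entities whose name matches exactly."""
    query = urlencode({"filter": f"name=={cluster_name}"})
    return (f"{vcd_site.rstrip('/')}/cloudapi/1.0.0/entities/types/"
            f"vmware/capvcdCluster/{rde_major_version}?{query}")


def get_by_id_url(vcd_site, cluster_id):
    """URL of a single entity, addressed by its urn:vcloud:entity:... ID."""
    # Keep the colons of the URN readable; escape anything else unsafe.
    return f"{vcd_site.rstrip('/')}/cloudapi/1.0.0/entities/{quote(cluster_id, safe=':')}"


if __name__ == "__main__":
    print(filter_by_name_url("https://vcd-01a.local", "demo"))
    print(get_by_id_url("https://vcd-01a.local",
                        "urn:vcloud:entity:vmware:capvcdCluster:<ID>"))
```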

Get Kubeconfig of the cluster:

The kubeconfig is returned in the API response at: entity.status.capvcd.private.kubeconfig
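Once the GET response has been parsed as JSON, the kubeconfig can be pulled out by walking that path. The sketch below matches keys case-insensitively, as a precaution in case the key casing in an actual response differs from the path as written here; the helper name is hypothetical.

```python
def extract_kubeconfig(cluster_response: dict) -> str:
    """Walk entity.status.capvcd.private.kubeconfig, ignoring key case."""
    node = cluster_response
    for part in "entity.status.capvcd.private.kubeconfig".split("."):
        if not isinstance(node, dict):
            raise KeyError(f"expected an object before '{part}'")
        matches = [key for key in node if key.lower() == part]
        if not matches:
            raise KeyError(f"'{part}' not found in the response")
        node = node[matches[0]]
    return node


if __name__ == "__main__":
    # Stand-in for a parsed API response.
    response = {"entity": {"status": {"capvcd": {"private": {
        "kubeConfig": "apiVersion: v1\nkind: Config\n"}}}}}
    print(extract_kubeconfig(response))
```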

Create a new Cluster

Resize a Cluster

  • Fetch the cluster ID ("id": "urn:vcloud:entity:vmware:capvcdCluster:<ID>") from the above API call's output.
  • Copy the complete output of the API response.
  • Note down the ETag value from the API response header.
  • Modify "capiyaml" with the following values:
    • To resize the control plane, set KubeadmControlPlane.spec.replicas to the desired number of control plane VMs. Only odd numbers of control plane VMs are supported.
    • To resize the worker nodes, set MachineDeployment.spec.replicas to the desired number of worker VMs.
  • When performing the PUT API call, include the fetched ETag value in the If-Match header.
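The resize steps can be sketched as two small helpers. The first edits the capiyaml after it has been parsed into a list of documents (for example with a YAML parser such as PyYAML's yaml.safe_load_all, which this sketch does not depend on); the second builds the PUT request carrying the ETag. The /cloudapi/1.0.0/entities/<id> path and API version 37.0 are assumptions, and all function names are hypothetical.

```python
import copy
import json
import urllib.request


def set_replicas(docs, control_plane=None, workers=None):
    """Return a copy of the parsed capiyaml documents with new replica counts."""
    if control_plane is not None and control_plane % 2 == 0:
        raise ValueError("only odd numbers of control plane VMs are supported")
    updated = copy.deepcopy(docs)
    for doc in updated:
        kind = doc.get("kind")
        if kind == "KubeadmControlPlane" and control_plane is not None:
            doc["spec"]["replicas"] = control_plane
        elif kind == "MachineDeployment" and workers is not None:
            doc["spec"]["replicas"] = workers
    return updated


def put_cluster_request(vcd_site, cluster_id, body, etag, access_token,
                        api_version="37.0"):
    """Build (but do not send) the PUT request, with the ETag as If-Match."""
    return urllib.request.Request(
        f"{vcd_site.rstrip('/')}/cloudapi/1.0.0/entities/{cluster_id}",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "If-Match": etag,
            "Content-Type": "application/json",
            "Accept": f"application/json;version={api_version}",  # assumed version
            "Authorization": f"Bearer {access_token}",
        },
    )
```

After editing the documents, serialize them back to YAML, convert the result to a JSON string, and place it in the copied API response before sending the PUT.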

Upgrade a Cluster

To upgrade a cluster, the provider admin needs to publish the desired Tanzu Kubernetes templates to the customer organization in the catalog used by Container Service Extension.

Collect the GET API response for the cluster to be upgraded as follows:

  • Fetch the cluster ID ("id": "urn:vcloud:entity:vmware:capvcdCluster:<ID>") from the above API call's output.
  • Copy the complete output of the API response.
  • Note down the ETag value from the API response header.
  • The customer user performing the cluster upgrade needs the information in Table 3. Modify the following values to match the target TKG version. The following table shows an upgrade for TKG version 1.5.4 from v1.20.15+vmware.1 to v1.22.9+vmware.1.
Control Plane Version | Old Values | New Values
VCDMachineTemplate
VCDMachineTemplate.spec.template.spec.template | Ubuntu 20.04 and Kubernetes v1.20.15+vmware.1 | Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1
KubeadmControlPlane
KubeadmControlPlane.spec.version | v1.20.15+vmware.1 | v1.22.9+vmware.1
KubeadmControlPlane.spec.kubeadmConfigSpec.dns | imageTag: v1.7.0_vmware.15 | v1.8.4_vmware.9
KubeadmControlPlane.spec.kubeadmConfigSpec.etcd | v3.4.13_vmware.23 | v3.5.4_vmware.2
KubeadmControlPlane.spec.kubeadmConfigSpec.imageRepository | imageRepository: projects.registry.vmware.com/tkg | imageRepository: projects.registry.vmware.com/tkg
Worker Node Version
VCDMachineTemplate
VCDMachineTemplate.spec.template.spec.template | Ubuntu 20.04 and Kubernetes v1.20.15+vmware.1 | Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1
VCDMachineTemplate.spec.template.spec | |
MachineDeployment
MachineDeployment.spec.version | v1.20.15+vmware.1 | v1.22.9+vmware.1
Table 4 – Example values to change in capiyaml payload for TKG 1.5.4 Kubernetes version 1.20.15 to 1.22.9 for CSE 4.0, 4.0.1
  • When performing the PUT API call, include the fetched ETag value in the If-Match header.
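The Table 4 changes can be applied programmatically once the capiyaml has been parsed into a list of documents. The sketch below is illustrative only. It assumes the abbreviated paths in Table 4 correspond to the full Cluster API field paths spec.kubeadmConfigSpec.clusterConfiguration.dns.imageTag, spec.kubeadmConfigSpec.clusterConfiguration.etcd.local.imageTag, and spec.template.spec.version (for MachineDeployment); verify these against your own capiyaml before use. The function names are hypothetical.

```python
import copy


def set_nested(doc: dict, path: str, value) -> None:
    """Set doc[a][b][c] = value for path "a.b.c", creating missing levels."""
    keys = path.split(".")
    node = doc
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def apply_upgrade(docs, template_name, k8s_version, etcd_tag, coredns_tag):
    """Return a copy of the parsed capiyaml documents with upgraded versions."""
    updated = copy.deepcopy(docs)
    for doc in updated:
        kind = doc.get("kind")
        if kind == "VCDMachineTemplate":
            # Applies to both the control plane and the worker template.
            set_nested(doc, "spec.template.spec.template", template_name)
        elif kind == "KubeadmControlPlane":
            set_nested(doc, "spec.version", k8s_version)
            base = "spec.kubeadmConfigSpec.clusterConfiguration"  # assumed path
            set_nested(doc, f"{base}.dns.imageTag", coredns_tag)
            set_nested(doc, f"{base}.etcd.local.imageTag", etcd_tag)
        elif kind == "MachineDeployment":
            set_nested(doc, "spec.template.spec.version", k8s_version)  # assumed path
    return updated


if __name__ == "__main__":
    # Target values for v1.22.9+vmware.1, taken from Table 3.
    docs = [{"kind": "KubeadmControlPlane", "spec": {"version": "v1.20.15+vmware.1"}}]
    print(apply_upgrade(docs, "Ubuntu 20.04 and Kubernetes v1.22.9+vmware.1",
                        "v1.22.9+vmware.1", "v3.5.4_vmware.2", "v1.8.4_vmware.9"))
```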

Delete a Cluster

  • Fetch the cluster ID ("id": "urn:vcloud:entity:vmware:capvcdCluster:<ID>") from the above API call's output.
  • Copy the complete output of the API response.
  • Note down the ETag value from the API response header.
  • Add or modify the following fields under entity.spec.vcdke to delete or forcefully delete the cluster:
    • "markForDelete": true (set this value to true to delete the cluster)
    • "forceDelete": true (set this value to true to forcefully delete the cluster)
  • When performing the PUT API call, include the fetched ETag value in the If-Match header.
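A minimal sketch of the payload change in Python, operating on the parsed GET response. Because the exact casing of the vcdke key in a real response may differ from the path as written here, the helper (a hypothetical name) looks it up case-insensitively and raises an error rather than guessing when it is absent.

```python
import copy


def mark_for_delete(cluster: dict, force: bool = False) -> dict:
    """Return a copy of the GET response body flagged for deletion."""
    updated = copy.deepcopy(cluster)
    spec = updated["entity"]["spec"]
    # Find the vcdke object regardless of how the server cases the key.
    key = next((k for k in spec if k.lower() == "vcdke"), None)
    if key is None:
        raise KeyError("entity.spec.vcdke not found in the response")
    spec[key]["markForDelete"] = True
    if force:
        spec[key]["forceDelete"] = True
    return updated


if __name__ == "__main__":
    # Stand-in for a parsed API response.
    body = {"entity": {"spec": {"vcdKe": {"markForDelete": False}}}}
    print(mark_for_delete(body, force=True))
```

The returned body is what would be sent back in the PUT call, together with the ETag noted earlier.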

Recommendation for API Usage during automation

  • DO NOT hardcode API URLs with RDE versions. ALWAYS parameterize RDE versions. For example:

POST https://{{vcd}}/cloudapi/1.0.0/entityTypes/urn:vcloud:type:vmware:capvcdCluster:1.1.0

Declare 1.1.0 as a variable. This makes it easy to upgrade API clients to future versions of CSE.

  • Ensure the API client code ignores any unknown or additional properties while unmarshaling the API response.
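Both recommendations can be sketched in a few lines of Python. RDE_VERSION is kept in one place and passed into the URL builder, and the response model keeps only the fields it knows about, so properties added by a future CSE release do not break parsing. The ClusterSummary class and its two fields are illustrative assumptions, not the full response schema.

```python
from dataclasses import dataclass, fields

# Single place to change when moving to a newer RDE version.
RDE_VERSION = "1.1.0"


def create_cluster_url(vcd_site: str, rde_version: str = RDE_VERSION) -> str:
    """Build the POST URL with the RDE version supplied as a parameter."""
    return (f"{vcd_site.rstrip('/')}/cloudapi/1.0.0/entityTypes/"
            f"urn:vcloud:type:vmware:capvcdCluster:{rde_version}")


@dataclass
class ClusterSummary:
    """Illustrative subset of a cluster response."""
    id: str
    name: str

    @classmethod
    def from_api(cls, payload: dict) -> "ClusterSummary":
        # Keep only known fields; silently ignore anything else.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})


if __name__ == "__main__":
    print(create_cluster_url("https://vcd-01a.local"))
    print(ClusterSummary.from_api(
        {"id": "urn:vcloud:entity:vmware:capvcdCluster:<ID>",
         "name": "demo", "someFutureField": 42}))
```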

Summary

To summarize, we looked at CRUD operations for Tanzu Kubernetes clusters on the VMware Cloud Director platform using VMware Cloud Director supported APIs. Please feel free to check out the following resources for Container Service Extension:

  1. Generate API token using VMware Cloud Director
  2. CSE 4.0 Official Documentation
  3. Cluster API for VMware Cloud Director Platform official Documentation