
Upgrading Kubernetes the Easy Way with Tanzu Kubernetes Grid Service for vSphere

Lifecycle management is one of the most complicated aspects of running Kubernetes. In a past article, we showed how to modify a cluster to change the type and size of its nodes. In this post, we will explain how to upgrade the Kubernetes version of a Tanzu Kubernetes cluster when a new release is available. The best part: it all takes just a few easy steps using the Tanzu Kubernetes Grid Service for vSphere.

A Content Library is associated with a vSphere Namespace. The official Kubernetes images for the TKG Service are published on a public Content Delivery Network (CDN) and are easily configured as a Subscribed Content Library within vCenter (documentation). After configuration, the sync process begins and the official Kubernetes images are downloaded into your environment, ready for use.
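Once the library has synced, you can list the Kubernetes releases now available to your clusters. A quick check, run against a live Supervisor Cluster with kubectl already logged in:

```shell
# List the Tanzu Kubernetes releases synced from the Content Library
kubectl get tanzukubernetesreleases

# Recent releases also accept the short name "tkr"
kubectl get tkr
```

The output shows each release's full version string, which is what the content library item names are based on.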

 

For this demonstration, I have already deployed a Tanzu Kubernetes cluster that has three control plane nodes and three worker nodes.
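Before upgrading, it's worth confirming the cluster's current state. A sketch of that check, where the cluster name `demo-cluster` and namespace `demo-namespace` are placeholders for your own:

```shell
# Show the Tanzu Kubernetes cluster and its current Kubernetes version
kubectl get tanzukubernetescluster demo-cluster -n demo-namespace

# From the workload cluster context, verify all six nodes and their versions
kubectl get nodes
```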

Within the documentation, there are a few ways to upgrade a Kubernetes cluster. The most common and preferred way is to change the distribution version in the cluster spec. This can be performed with kubectl patch, by editing the original cluster spec YAML, or with kubectl edit against the running cluster. It's a recommended practice to fetch the current state of the cluster before making any adjustments.
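As a sketch, the one-line kubectl patch approach might look like this; the cluster name, namespace, and target version here are illustrative placeholders:

```shell
# Merge-patch the distribution version; fullVersion is nulled so the
# shorthand version string is resolved against the content library
kubectl patch tanzukubernetescluster demo-cluster -n demo-namespace \
  --type merge \
  -p '{"spec":{"distribution":{"fullVersion":null,"version":"v1.21"}}}'
```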

A critical part of the upgrade is setting fullVersion: null under .spec.distribution whenever you specify a version number in .spec.distribution.version. This allows the shorthand version number to be used in the cluster spec rather than the full image string from the content library, and it avoids a version mismatch during discovery.
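If you edit the spec YAML directly instead of patching, the relevant block would look roughly like this (the version string is illustrative):

```yaml
spec:
  distribution:
    fullVersion: null   # must be nulled out to avoid a version mismatch
    version: v1.21      # shorthand resolved against the content library
```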

Apply the configuration change to the Supervisor Cluster, and the declarative nature of Cluster API will drive the upgrade. At this point, we're finished; the upgrade will continue and complete on its own.
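You can watch the rollout progress from the Supervisor Cluster context, assuming the Cluster API machine resources are visible in your vSphere Namespace (`demo-namespace` is a placeholder):

```shell
# Watch the cluster phase move through "updating" and back to "running"
kubectl get tanzukubernetescluster -n demo-namespace -w

# Watch the underlying machines being created and deleted
kubectl get machines -n demo-namespace -w
```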

This upgrade happens as a rolling upgrade. The existing nodes are never upgraded in place; instead, new virtual machines running the new Kubernetes version take the place of the running nodes.

The control plane nodes are upgraded first. A new virtual machine is created in vSphere from the subscribed content library. Once this virtual machine has been configured and added to the Kubernetes cluster, one of the control plane nodes running the older version is cordoned, showing a status of "SchedulingDisabled," and then removed from the Kubernetes cluster. After it's removed from Kubernetes, the virtual machine is deleted from vSphere to complete the lifecycle. This process repeats until all control plane nodes have been upgraded.

 
After the control plane has been upgraded, the worker nodes are next. As before, a new virtual machine is created in vSphere and added to the Kubernetes cluster. Before a worker node is removed from the cluster in a rolling upgrade, it is cordoned and tainted so no new workloads can be scheduled on it. The node is then drained: any running containers are evicted, and Kubernetes redeploys them to other nodes in the cluster to maintain the desired state of each application deployment. Once the node has been drained, it's removed from the Kubernetes cluster and the virtual machine is deleted from vSphere. This process again repeats until all worker nodes have been upgraded.
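The cordon-and-drain step the service performs on each worker is roughly equivalent to what you would run manually with kubectl (the node name is a placeholder):

```shell
# Mark the node unschedulable; it shows as SchedulingDisabled
kubectl cordon demo-worker-node

# Evict the workloads so Kubernetes reschedules them on other nodes
kubectl drain demo-worker-node --ignore-daemonsets --delete-emptydir-data
```

With TKG Service this is all automated; you never need to run these commands yourself during an upgrade.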
 

Interested in seeing it happen? Watch the video here:

 
For more information, check out the Tanzu Kubernetes Grid site.