Today’s Landscape
As Service Providers continue to enhance their networks with vRAN and unveil new 5G-based revenue sources for their customers, they increasingly require the agility, scale, security, and resiliency of cloud-native network functions running as containers in Kubernetes clusters. What’s more, to support these container-based network functions (CNFs) for the RAN, Service Providers must deploy thousands of RAN sites, which in turn use thousands of clusters. To handle the intensity and sheer quantity of these workloads, Service Providers require the speed and reliability of automated cluster operations, especially for Day 2 lifecycle management (LCM), i.e., change management and ongoing maintenance. Also, as new Kubernetes versions are released, it becomes increasingly important for Service Providers to upgrade their clusters accordingly to avoid unnecessary network complications.
One essential aspect of effective LCM automation is the ability to seamlessly upgrade Kubernetes clusters from one version to the next. Automating cluster upgrades has numerous benefits: consistently gaining access to new Kubernetes capabilities, maintaining the latest security measures, scaling efficiently, and keeping pace with the quickly evolving cloud-native industry.
For this reason, as part of VMware Telco Cloud Automation’s Container-as-a-Service (CaaS) Automation, VMware developed automated upgrades of Kubernetes management and workload clusters, improving overall CaaS operability and usability and paving a clear path to 5G-related services for Service Providers.
Feature Overview & Benefits
Cluster upgrades are integral to effective Day 2 LCM operations and crucial to supporting next-gen network services. For the Service Provider, Telco Cloud Automation ensures that clusters are upgraded on an automated, rolling basis that preserves Kubernetes node customization,[1] network function workloads, and network service workloads during the upgrade process. By contrast, if a Service Provider were to upgrade clusters manually, any node customizations would be lost during the upgrade process and the operator would need to reapply them for each cluster. These tasks are tedious, costly, and error-prone, and almost impossible to carry out in a large-scale deployment. In other words, manual cluster upgrades inhibit a Service Provider’s ability to scale workloads and impede operability.
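To make the idea of a rolling upgrade that preserves node customization concrete, here is a minimal Python sketch. It is only a simplified model and not Telco Cloud Automation’s implementation: the `Node` class, the `rolling_upgrade` function, the customization keys, and the version strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    k8s_version: str
    # Hypothetical customization record (for example hugepages or CPU isolation).
    customization: dict = field(default_factory=dict)


def rolling_upgrade(nodes, target_version):
    """Replace nodes one at a time, carrying each node's customization over."""
    upgraded = []
    for old in nodes:
        # In a rolling upgrade only one node is replaced at a time,
        # so the remaining nodes keep serving workloads.
        replacement = Node(
            name=old.name,
            k8s_version=target_version,
            # The customization is copied to the new node automatically
            # instead of being re-entered by an operator.
            customization=dict(old.customization),
        )
        upgraded.append(replacement)
    return upgraded


# Illustrative node pool; versions and settings are assumptions.
pool = [
    Node("worker-0", "1.23", {"hugepages": "1G", "isolcpus": "2-15"}),
    Node("worker-1", "1.23", {"hugepages": "1G", "isolcpus": "2-15"}),
]
result = rolling_upgrade(pool, "1.24")
```

In a manual upgrade, the `customization` step would be a separate operator task per cluster, which is the tedious and error-prone work described above.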
Telco Cloud Automation, through its centralized, single-pane-of-glass management, supports upgrades to management and workload clusters, as depicted in Figures 1 and 2 below. Telco Cloud Automation also supports upgrading cluster services (e.g., add-ons and operators) individually, decoupling those services from the management cluster upgrade. In other words, Service Providers can upgrade add-ons for their Kubernetes clusters on demand without upgrading the entire management cluster, so network workloads do not need to incur downtime. To improve cluster scale and performance, Telco Cloud Automation supports multiple parallel lifecycle operations on clusters and, if a Kubernetes component is deprecated, automates its migration during the upgrade. Telco Cloud Automation can also enable “Machine Health Check” support through VMware’s Tanzu Kubernetes Grid (TKG), which monitors node pools for unhealthy nodes and remediates them through node pool recreation.
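The health-check behavior described above, monitoring node pools and remediating unhealthy ones by recreation, can be pictured with a short Python sketch. This is a conceptual model under stated assumptions and not the TKG or Telco Cloud Automation API: `remediate_unhealthy_pools`, `is_healthy`, `recreate_pool`, and the pool and node names are hypothetical.

```python
from typing import Callable, Dict, List


def remediate_unhealthy_pools(
    node_pools: Dict[str, List[str]],
    is_healthy: Callable[[str], bool],
    recreate_pool: Callable[[str], None],
) -> List[str]:
    """Recreate every node pool that contains at least one unhealthy node."""
    recreated = []
    for pool_name, nodes in node_pools.items():
        unhealthy = [n for n in nodes if not is_healthy(n)]
        if unhealthy:
            # Remediation here means recreating the whole pool,
            # following the description in the text above.
            recreate_pool(pool_name)
            recreated.append(pool_name)
    return recreated


# Illustrative data: one pool contains a node assumed to be unhealthy.
pools = {"np-du": ["du-0", "du-1"], "np-cu": ["cu-0"]}
not_ready = {"du-1"}
calls = []
recreated = remediate_unhealthy_pools(
    pools,
    is_healthy=lambda node: node not in not_ready,
    recreate_pool=calls.append,
)
```

A real health check would typically also apply a timeout before declaring a node unhealthy; that detail is omitted here to keep the sketch short.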
In short, Telco Cloud Automation provides holistic support and automation for cluster upgrades. The benefits of automated cluster upgrades are clear: Telco Cloud Automation’s centralized platform enables Service Providers to optimize their cloud-native network functions and services through industry-leading CaaS automation. The modern network, grounded in cloud-native network functions and services, requires seamless automation and continuous cluster upgrades. Telco Cloud Automation provides both, bringing superior operability, ease, and speed to an otherwise complicated, time-consuming process.
[1] The benefits of automated upgrades for preserving cluster customization should not be understated. As presented in this article on Telco Cloud Automation’s “late-binding” capability, node customization allows Service Providers to tailor their Kubernetes nodes and infrastructure to specific workload requirements and to meet customer-specific SLAs.