While Communications Service Providers (CSPs), also known as Telcos, are well on their way to deploying virtualization and cloud-native systems, we are just beginning to see the benefits of the change and the potential for more. Immediate gains come from the transition to new, more flexible software systems; the larger ones will be realized when engineering methods evolve beyond the rigid structures of the past. This evolution starts with architecture and design for 5G.
Over the next few weeks, we will explore best practices in the design and architecture of modern Telco solutions, raise awareness of the challenges new deployments face, and offer answers to some of those challenges across people, processes, and technology. The insights we will share come from our own experience helping our customers get the best out of modern technologies. We are excited to help everyone continue to move forward and improve.
Change is Everywhere
Telcos have a long tradition of thoughtful planning, with processes following a waterfall model for good reasons.
- With millions of customers, Telcos need solutions at a large scale. Each customer’s needs are unique, and any changes made to the network have to be thought out and planned to address the needs of each one.
- Telcos are regulated and it is important that the target architecture meets all regulatory compliance criteria. This includes the intermediate steps during the transition.
- Telcos are distributed and the same solution needs to be reproduced in thousands of locations. The rollout may take months, so the solution must be optimized beforehand and the rollout must follow a pre-defined process.
Despite efforts to anticipate each step of a rollout, reality does not always follow our plans. Even in a CSP environment, change should be expected. When engaging in a design project, the requirements are rarely set in stone. Sometimes stakeholders do not clearly know what the requirements are, and sometimes the project we engage in is part of a bigger project, which is also changing. Change also happens in organizations that follow an agile process: at every sprint, the agile team may introduce changes.
Plans are needed, but change should be expected. The plan you create must include all technological, business, and regulatory requirements. The challenge occurs when these requirements shift. It would be unreasonable to go back to step one and start over; instead, we recommend building in flexibility to keep things moving. Let’s dig into a recent project where we were able to partner with our customers to help them innovate and design a robust solution to their problems.
Customer Story: A Moving Target
The O-RAN deployment project for one of our customers is a notable example of a complex project where we had to change and adapt. The project scope was to deploy 5G Core and O-RAN across thousands of cell sites. There were multiple requirements, and it was not clear which ones would be most challenging. We had to do a cloud-native deployment, we had to have automation, we had to iterate fast, and we had to be able to scale.
Not only were the requirements hard to grasp, but they also changed over time. Each phase had its own challenges that we helped our customer overcome. Initially, the concern was certifying all network functions on the VMware Telco Cloud Platform stack and automating the onboarding of the different variants of Distributed Units (cell sites). A related concern was defining the network architecture at scale. Collectively, we had to decide what the platform components would be, how to deploy them, and eventually how to automate that deployment with a CI/CD pipeline. Automation was crucial because this pipeline would later be used for deployment and lifecycle operations.
We needed the ability to test the required concurrency with a small number of requests before launching at full scale. All details and variants needed to be planned for, and this proved difficult. In production, the challenges are in the details. For example, there were many integration points to address and numerous configurations to prepare. We spent time creating payloads for Network Functions Lifecycle Management (LCM) using a Configuration Management Database (CMDB); we thought through all variations and details of networking; and we analyzed all possible flows and listed the corresponding firewall rules. During the development of the pipeline, there were upgrades to the solution components, and the R&D team was continuously enhancing the product features. Each enhancement triggered changes in design and changes in the way we needed to automate.
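To make the CMDB-driven payload step more concrete, here is a minimal sketch of how per-site records could be turned into LCM payloads. It is not the customer's implementation: the record fields, the variant names, the default values, and the payload keys are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellSiteRecord:
    """Hypothetical CMDB row describing one Distributed Unit (cell site)."""
    site_id: str
    variant: str      # assumed name of the DU variant
    mgmt_vlan: int
    node_pool: str


# Hypothetical per-variant defaults; real values would come from the design.
VARIANT_DEFAULTS = {
    "du-small": {"cpu": 8, "memory_gb": 32},
    "du-large": {"cpu": 16, "memory_gb": 64},
}


def build_lcm_payload(record: CellSiteRecord) -> dict:
    """Merge variant defaults with site-specific values into one payload."""
    if record.variant not in VARIANT_DEFAULTS:
        # Fail early: an unplanned variant should never reach the pipeline.
        raise ValueError(
            f"unknown variant {record.variant!r} for site {record.site_id}"
        )
    return {
        "site": record.site_id,
        "nodePool": record.node_pool,
        "network": {"mgmtVlan": record.mgmt_vlan},
        # Copy so that callers cannot mutate the shared defaults.
        "resources": dict(VARIANT_DEFAULTS[record.variant]),
    }
```

Keeping the variant defaults in one table means a new Distributed Unit variant becomes a data change rather than a code change, which helps when the variants themselves are still moving.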
Some changes in requirements were related to phases in the project; some were related to things we learned as we iterated through those phases. For example, the DevOps team started building a CI/CD pipeline for Containers as a Service (CaaS) infrastructure and Cloud-native Network Function (CNF) onboarding. Initially, the customer wanted a single pipeline that would perform multiple functions: register hosts into the Virtualized Infrastructure Manager (VIM), create node pools, and onboard CNFs. But seeing that this objective had significant challenges, the customer shifted and proposed that each operation get a separate pipeline.
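The move from one pipeline to one pipeline per operation can be sketched as follows. This is an illustrative model only, assuming each operation is a plain function over a shared context dictionary; the function names, pipeline names, and context keys are hypothetical and do not correspond to any real CI/CD tool or VMware API.

```python
from typing import Callable, Dict, List

# A step takes the current context and returns an updated copy.
Step = Callable[[dict], dict]


def register_hosts(ctx: dict) -> dict:
    """Placeholder for registering hosts into the VIM."""
    return {**ctx, "hosts_registered": True}


def create_node_pools(ctx: dict) -> dict:
    """Placeholder for node pool creation; requires registered hosts."""
    if not ctx.get("hosts_registered"):
        raise RuntimeError("hosts must be registered before creating node pools")
    return {**ctx, "node_pools_ready": True}


def onboard_cnf(ctx: dict) -> dict:
    """Placeholder for CNF onboarding; requires node pools."""
    if not ctx.get("node_pools_ready"):
        raise RuntimeError("node pools must exist before onboarding a CNF")
    return {**ctx, "cnf_onboarded": True}


# One small pipeline per operation instead of a single monolithic one.
PIPELINES: Dict[str, List[Step]] = {
    "register-hosts": [register_hosts],
    "create-node-pools": [create_node_pools],
    "onboard-cnf": [onboard_cnf],
}


def run_pipeline(name: str, ctx: dict) -> dict:
    """Run every step of the named pipeline in order."""
    for step in PIPELINES[name]:
        ctx = step(ctx)
    return ctx
```

With this split, each operation can be rerun or changed on its own, and the explicit precondition checks make the ordering between pipelines visible instead of hiding it inside one long job.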
Other requirements changed as the strategic business direction shifted. For instance, halfway through the development cycle, the customer entered into an agreement with Amazon Web Services (AWS) and decided to use that infrastructure to host as many workloads as possible. This decision might have had fundamental impacts on the project, had we not already considered the possibility of adopting VMware Cloud (VMC) on AWS.
The impacts of this shift in strategy were profound. Initially, we expected the platform to host all of the Radio Access Network (RAN) and the 5G core on-prem. Later, that changed: only the far edge would be on-prem, while the core and some RAN components, such as the Centralized Unit (CU), would be on AWS.
Those were major changes, and you might think that each of them would have sent our customer back to the drawing board to rework their plan and change their adoption strategy. However, we were able to adapt with minimal changes. Some rework was unavoidable: for instance, the original vault components could not be reused, and AWS required that we use its native AWS Secrets Manager instead. It might have been a lot worse, but we found that:
- Network architecture and automation remained largely unchanged, needing only minor adaptations
- Code could be reused, and even changes in northbound APIs were straightforward
- Networking could be adapted by introducing a stretched cluster
So many changes, yet we successfully adapted the solution.
The modularity and openness of the platform allowed us to adapt to requirements that were different and more challenging than previously thought.
It required additional work, but the initial Telco Cloud Platform (TCP) design was flexible, modular, and open enough that many of the VMware components were retained throughout those changes and simply extended using VMC on AWS and stretched clusters. This meant the initial design could be adapted, and a redesign was not necessary.
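The vault replacement mentioned earlier is a good illustration of why modularity pays off. The sketch below shows one way a pipeline could depend on a small secret-store interface rather than on a specific product, so that the backend can be swapped without touching the callers. Both store classes are in-memory stand-ins written for this example; they do not call any real vault or AWS SDK, and every name here is hypothetical.

```python
from typing import Dict, Protocol


class SecretStore(Protocol):
    """Minimal interface the pipeline code depends on."""

    def get_secret(self, name: str) -> str: ...


class OnPremVaultStore:
    """In-memory stand-in for an on-prem vault (no real vault calls)."""

    def __init__(self, secrets: Dict[str, str]) -> None:
        self._secrets = dict(secrets)

    def get_secret(self, name: str) -> str:
        return self._secrets[name]


class CloudSecretStore:
    """In-memory stand-in for a cloud secret manager (no real SDK calls).

    Assumes, for illustration only, that secrets are stored under a prefix.
    """

    def __init__(self, secrets: Dict[str, str], prefix: str) -> None:
        self._secrets = dict(secrets)
        self._prefix = prefix

    def get_secret(self, name: str) -> str:
        return self._secrets[f"{self._prefix}/{name}"]


def build_registry_login(store: SecretStore) -> Dict[str, str]:
    """Pipeline code sees only the interface, never the backend."""
    return {
        "username": "pipeline",
        "password": store.get_secret("registry-password"),
    }
```

Swapping `OnPremVaultStore` for `CloudSecretStore` changes only the line where the store is constructed; `build_registry_login` and anything built on it stay the same.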
We Love When a Plan Comes Together
Planning correctly and adapting as you go is what makes any deployment successful. Over the next few weeks, we will dive into some of the factors you need to consider when you are making your transition to 5G. You must face your challenges head-on, evolving your design with the changing requirements. The great news is that you are not in this alone; VMware Professional Services is there to help you succeed. Can’t wait until the next blog comes out? Check out our Telco ebook on modernization.