Co-Authored by VMware and Nvidia
Partnership Will Enable AI and Deep Learning Approaches to Optimize Spectral Efficiency and Network Slicing
The speed with which COVID-19 vaccines were developed is a testament to the pace of innovation in the healthcare industry. That pace can be linked directly to a thriving ecosystem of innovators and the very large number of AI-based startups in the healthcare sector. By way of contrast, the 5G / wireless industry takes approximately a decade to introduce each next-generation system. One way of accelerating the pace of innovation and post-deployment feature enhancement is being pioneered by the O-RAN Alliance. The traditional model of blackbox design, closed and proprietary interfaces, and limited options for an ecosystem to introduce new capabilities into deployed equipment is being disrupted by a move to a whitebox paradigm, open and standardized interfaces, and concepts such as the RAN Intelligent Controller (RIC). The RIC is a key technology that will enable third parties to add new capabilities to the network and provide monetization opportunities not only for the developer ecosystem but also for network operators.
The Future of Wireless
Softwarization, virtualization and disaggregation are foundational concepts of 5G and beyond-5G communication networks. Softwarization of the RAN, and its realization using a software defined radio (SDR) paradigm, is critical for supporting the three key use cases that are the hallmark of 5G: Enhanced Mobile Broadband (eMBB), Ultra Reliable Low-Latency Communication (URLLC) and Massive Machine Type Communication (mMTC). The capability, through software, to dynamically bring up and tear down network slices composed of eMBB, URLLC and mMTC flows is a key differentiator between 4G and 5G, and indeed a core value proposition of 5G. Virtualization is an enabler for the efficient sharing of hardware and software assets in support of heterogeneous workloads in Mobile Edge Computing (MEC). Disaggregation represents the dawn of a new ecosystem for the wireless industry and opens the door to new business opportunities for a broad spectrum of ISVs and a new generation of hardware developers.
The disaggregation of traditional, monolithic, blackbox-style wireless infrastructure equipment into the logical entities of the Centralized Unit (CU), Distributed Unit (DU) and Radio Unit (RU) gives traditional network operators and emerging private 5G network operators the flexibility to tailor a system architecture to their operational and business needs.
An equally important component of this new approach to wireless networking infrastructure is the standardization of interfaces, both physical and logical, between the hardware and software subsystems. Together with the development of an open software stack, these capabilities not only enable the rapid deployment of new network features via software, they also enable a new generation of ecosystem developers to write application code for deployment in a network. By virtue of these standardized interfaces and APIs, such applications can control and interact with entities running in the CU, DU and RU.
The RAN Intelligent Controller (RIC)
The O-RAN Alliance is standardizing an open, intelligent and disaggregated RAN architecture. The objective is to enable the construction of an operator-defined RAN using commercial off-the-shelf (COTS) hardware and to provide for AI/ML-based intelligent control of 5G and future-generation 6G wireless networks. A conventional RAN built from proprietary hardware, proprietary interfaces and proprietary software is replaced by a vRAN employing COTS hardware, open interfaces, and the option to run both proprietary software and applications developed by the ecosystem.
One of the most important elements of the O-RAN standard is the RAN Intelligent Controller (RIC) shown in Figure 1. The RIC consists of two main components: the Non-real-time RIC (Non-RT RIC) and the Near-real-time RIC (Near-RT RIC). The Non-RT RIC supports network functions operating at time scales longer than 1 second, while the Near-RT RIC supports functions operating at time scales between 10 milliseconds and 1 second.
As part of the Service Management and Orchestration (SMO) framework, the responsibilities of the Non-RT RIC include machine learning model life-cycle management, machine learning model selection, and the marshalling, curation and preprocessing of data gathered from the CU, DU and even RU, in preparation for model training on the training host. The Near-RT RIC introduced in the O-RAN architecture brings software-defined intelligence to the system, including advanced near-real-time analytics on data streamed from the CU and DU, AI model inference, and online retraining of machine learning models.
Together, the SMO, the Non-RT RIC and the Near-RT RIC bring machine learning techniques to all layers of the network architecture: the layer-1 PHY, layer 2, and the network level itself via AI-based Self-Organizing Network (SON) capabilities.
Figure 1. O-RAN RIC architecture comprising the Non-RT RIC, the Near-RT RIC and the interfaces between these software entities that permit control, configuration and data extraction from the CU, DU and RU. Figure is from [1].
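To make the division of labor concrete, the sketch below shows the shape of the control loop implemented by an application hosted on the Near-RT RIC (an xApp in O-RAN terminology): consume metrics streamed over E2, run a decision policy, and send control actions back, all within the 10 millisecond to 1 second budget. The RICStub class, its method names and the sample metrics are hypothetical stand-ins for illustration, not an actual SDK API.

```python
# Minimal, self-contained sketch of a Near-RT RIC xApp control loop.
# RICStub is a hypothetical stand-in for a real RIC SDK; only the loop
# structure and the near-RT timing budget reflect the O-RAN architecture.
import time

class RICStub:
    """Stand-in for an SDK that streams E2 metrics and accepts E2 control."""
    def poll_metrics(self):
        return {"cell_id": 7, "prb_utilization": 0.72}   # canned sample

    def send_control(self, cell_id, action):
        print(f"cell {cell_id}: {action}")

def policy(metrics):
    # In a real xApp, AI model inference on the streamed metrics goes here.
    return "rebalance" if metrics["prb_utilization"] > 0.9 else "no-op"

ric = RICStub()
for _ in range(5):                    # a real xApp would loop indefinitely
    m = ric.poll_metrics()
    ric.send_control(m["cell_id"], policy(m))
    time.sleep(0.1)                   # within the near-RT 10 ms - 1 s window
```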
To understand the RIC in more detail, consider an LTE example (the approach is similar for 5G NR) employing RIC-enabled AI for cell capacity management using a Long Short-Term Memory (LSTM) traffic prediction model. The objective is to predict traffic for all cells in the network and mitigate future congestion [2]. A 2-layer LSTM network, employing 12 LSTM cells per layer, is trained using UE throughput measurements and PRB (Physical Resource Block) utilization from 17 LTE eNBs in a real-world, fully operational wireless network. The inference operation predicts UE throughput and eNB downlink PRB utilization one hour into the future. Figure 2 shows the ground truth (actual) and predictions (LSTM inference) for throughput and PRB utilization for one cell of one eNB. The average prediction accuracy of 92.64% is remarkable. With the ability to forecast cell loadings up to 1 hour into the future, the eNB can take steps, for example cell splitting, to avoid coverage outages.
Figure 2. User-perceived IP throughput and PRB utilization prediction for a cell of a selected eNB in the network. Figure is from [2].
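As a concrete illustration, here is a minimal PyTorch sketch of such a predictor. The two-layer, 12-unit LSTM architecture follows [2]; the input features (per-cell throughput and PRB utilization), window length and output head are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch of the 2-layer, 12-unit LSTM traffic predictor from [2].
# Input/output features and the 24-sample window are assumptions.
import torch
import torch.nn as nn

class CellLoadLSTM(nn.Module):
    def __init__(self, n_features=2, hidden=12, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers,
                            batch_first=True)
        self.head = nn.Linear(hidden, n_features)  # predict next-hour values

    def forward(self, x):
        # x: (batch, seq_len, n_features) window of historical KPIs
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])  # decision from the last time step

model = CellLoadLSTM()
window = torch.randn(32, 24, 2)   # e.g. 24 hourly samples of (tput, PRB)
pred = model(window)              # (32, 2): next-hour throughput and PRB
```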
The role of the SMO in this example is to gather data from the O-CU/DU via the O1 interface (see Figure 1) and deliver it to the non-RT RIC. A non-RT RIC rApp in turn queries the AI server associated with the SMO. The AI server runs a training process to update the LSTM model parameters based on fresh data collected from the operating network. GPUs are the natural choice for ML training from both a programming model and a compute capability perspective. The training workload will be substantial given the scale of the wireless network: we are not interested in training a model for a single eNB with a handful of cells, but for a system that could have hundreds to thousands of base stations, many thousands of cells, and thousands to tens of thousands of UEs. A GPU-powered AI training server can be shared across many SMO hosts and so is more cost and power efficient than a CPU-based training host. In other words, there are both CAPEX and OPEX advantages for the network operator.
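A hedged sketch of the retraining step the rApp would request from the AI server follows; it reuses the CellLoadLSTM model from the previous sketch, and the loss function, optimizer and data shapes are our assumptions, not a prescribed recipe.

```python
# Standard GPU training loop in PyTorch, standing in for the AI server's
# retraining job. Reuses CellLoadLSTM from the previous sketch.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CellLoadLSTM().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def retrain(windows, targets, epochs=5):
    """windows: (N, seq_len, 2) KPI histories gathered over O1;
    targets: (N, 2) observed next-hour throughput / PRB utilization."""
    w, t = windows.to(device), targets.to(device)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(w), t)
        loss.backward()
        optimizer.step()
    return model.state_dict()  # updated parameters handed back to the rApp
```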
After the training server has updated the LSTM model, the updated model parameters are returned to the non-RT RIC rApp and the throughput/PRB prediction process continues with the updated model. Figure 3 illustrates the throughput gains. The vertical axis shows the fraction of operating hours during which user throughput falls within each of the ranges indicated on the horizontal axis. For example, we can see that without cell splitting, throughput is in the range of 5-7.5 Mbps for approximately 1% of operating hours, but with predictive cell splitting, throughput is in this same range for approximately 10% of operating hours, an improvement of a factor of 10.
Figure 3. User throughput for different cell splitting configurations. Figure is from [2].
One xApp Nvidia is researching enables intelligent, predictive multi-cell joint resource management, which has the potential to significantly improve the energy efficiency of the network. AI algorithms running at the non-RT RIC can predict the user density and traffic load in each cell within a prediction window (on a seconds-to-minutes time scale) based on the traffic history provided by the CU/DUs. Each DU scheduler then decides to switch off certain cells with low predicted traffic load to reduce energy consumption, and triggers coordinated multi-point transmission/reception (CoMP) from neighboring active cells to ensure effective coverage.
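A toy version of that switch-off decision is sketched below: cells whose predicted load falls below a threshold are powered down, but only if at least one neighboring cell stays active to cover them via CoMP. The threshold, loads and neighbor map are invented for illustration.

```python
# Toy cell switch-off decision driven by predicted per-cell load.
# All values below are illustrative, not from a real deployment.
predicted_load = {"cell_a": 0.05, "cell_b": 0.62, "cell_c": 0.03}
neighbors = {"cell_a": ["cell_b"], "cell_b": ["cell_a", "cell_c"],
             "cell_c": ["cell_b"]}
SLEEP_THRESHOLD = 0.10  # fraction of PRBs; assumed value

active = set(predicted_load)
# visit the most lightly loaded cells first
for cell, load in sorted(predicted_load.items(), key=lambda kv: kv[1]):
    if load < SLEEP_THRESHOLD:
        # keep the cell on if switching it off would strand its users
        if any(n in active for n in neighbors[cell]):
            active.discard(cell)

print("cells kept active:", sorted(active))  # CoMP covers the sleeping cells
```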
The near-RT RIC can also help achieve efficient multiplexing of eMBB and URLLC data traffic on the same frequency band. Because of their very different service requirements, eMBB and URLLC transmissions are scheduled on two different time scales: at the time-slot level for eMBB and the mini-slot level for URLLC. An AI-based xApp at the near-RT RIC could learn and predict URLLC packet arrival patterns from the traffic statistics streamed from the DU over the E2 interface (see Figure 1). This predictive knowledge is then used by the DU scheduler to optimize the resource reservation for URLLC mini-slots on top of eMBB data flows, minimizing the eMBB throughput lost to such multiplexing.
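The sketch below illustrates the flavor of that optimization under a simplifying assumption that URLLC arrivals per slot are Poisson with the predicted rate: the xApp reserves the smallest number of the seven two-symbol mini-slots in a 14-symbol slot that covers arrivals with URLLC-grade probability, leaving the rest to eMBB. The traffic model, rate and reliability target are assumptions for illustration.

```python
# Toy predictive mini-slot reservation under an assumed Poisson model.
from math import exp, factorial

def minislots_needed(rate, target=0.99999, max_slots=7):
    """Smallest reservation k with P(arrivals <= k) >= target (Poisson)."""
    cum = 0.0
    for k in range(max_slots + 1):
        cum += exp(-rate) * rate**k / factorial(k)
        if cum >= target:
            return k
    return max_slots

# rApp/xApp prediction says ~0.1 URLLC packets per slot in the next window:
reserved = minislots_needed(0.1)
print(f"reserve {reserved} of 7 mini-slots for URLLC; the rest stay with eMBB")
```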
One could also envision an xApp for massive MIMO beamforming optimization to maximize spectral efficiency. In this case, the Non-RT RIC hosts an rApp that performs long-term data analytics: its task is to collect and analyze antenna array parameters and continually update a machine learning model. The Near-RT RIC xApp then performs ML inference to configure, for example, the beam's horizontal and vertical aperture and the cell shape.
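As a deliberately simplistic illustration of the inference step, the toy sketch below picks the downtilt that maximizes a crude spectral-efficiency proxy for a predicted UE distance distribution; the gain and pathloss models are placeholders, not a real antenna model.

```python
# Toy downtilt selection against a crude spectral-efficiency proxy.
# All models and numbers are illustrative placeholders.
import math

ue_distances_m = [80, 120, 150, 300, 420]      # predicted UE locations
candidate_tilts_deg = [2, 4, 6, 8, 10]
TOWER_HEIGHT_M = 30.0

def se_proxy(tilt_deg):
    boresight = TOWER_HEIGHT_M / math.tan(math.radians(tilt_deg))
    total = 0.0
    for d in ue_distances_m:
        # Gaussian roll-off of beam gain away from the boresight distance
        gain = math.exp(-((d - boresight) / 150.0) ** 2)
        snr = 100.0 * gain / (d / 100.0) ** 3.5   # toy pathloss exponent
        total += math.log2(1.0 + snr)             # sum-rate style proxy
    return total

best = max(candidate_tilts_deg, key=se_proxy)
print(f"selected downtilt: {best} deg")
```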
Why GPUs
The signal processing requirements (MACs/second) of the 5G NR physical layer are immense. The massive parallelism of the GPU brings hardware resources to bear that can support this class of workload; in fact, a single GPU can support the baseband processing requirements of many tens of carriers. And while specialized hardware accelerators would typically have been employed in previous-generation systems, the parallel nature of the GPU is what enables the softwarization of the RAN, by essentially providing a C++ abstraction for programming advanced signal processing algorithms.
However, the value of the GPU extends beyond vRAN signal processing. In 5G and 6G systems where big data meets wireless, and where AI/ML is used to improve network performance, GPUs are the de facto standard for model training and inference. A common GPU-based hardware platform can support the tasks of training, inference and signal processing. But it is not only about GPU hardware; an equally important consideration is the software for programming GPUs and the SDKs and libraries for application development. GPUs are programmed using CUDA, the world's only commercially successful C/C++ based parallel programming framework. There is also a rich set of GPU libraries for developing, for example, data analytics pipelines using the Nvidia RAPIDS [3] software suite. The data analytics pipeline could be one of the services that the SMO/Non-RT RIC engages to update and fine-tune inference models running under the Near-RT RIC.
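For a flavor of what such a pipeline looks like, here is a short, hedged cuDF sketch. cuDF mirrors the pandas API, and the file name, column schema and aggregation below are illustrative assumptions, not a prescribed SMO data model.

```python
# Hedged sketch of a GPU data-analytics step using RAPIDS cuDF [3].
import cudf  # RAPIDS; requires an Nvidia GPU

# Per-cell KPI records gathered by the SMO over O1 (assumed schema):
# columns: cell_id, hour, tput_mbps, prb_util
kpis = cudf.read_csv("cell_kpis.csv")

# Hourly per-cell features for model retraining, computed on the GPU
features = (kpis.groupby(["cell_id", "hour"])
                .agg({"tput_mbps": "mean", "prb_util": "max"})
                .reset_index())
print(features.head())
```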
VMware and Nvidia Partnership
In late 2020, VMware released the world's first O-RAN standard compliant Near-RT RIC in its partner lab for integration and testing with select RAN and xApp vendor partners. To facilitate the development of xApps on its Near-RT RIC, VMware provides its xApp partners with a set of developer resources packaged as an SDK. Today, VMware and Nvidia are excited to announce that the Near-RT RIC SDK now enables xApp developers to leverage GPU acceleration in their applications. This is an exciting milestone for the industry as it opens the door for the broader ecosystem to build AI/ML-powered capabilities for modern RANs, including those based on Nvidia's Aerial gNB stack. Eventually, the combination of the VMware RIC and the Nvidia Aerial stack will enable the ecosystem to develop, and importantly monetize, new and innovative xApps that enhance or expand the capabilities of a deployed network.
Conclusion
Openness and intelligence are the two core pillars of the O-RAN initiative. As the 5G rollout continues and 6G research ramps up, intelligence will become all-encompassing in the deployment, optimization and operation of wireless networks.
Transitioning from the traditional blackbox approach historically employed in cellular networks to a whitebox model will open the door to a new era of rapid innovation and shorter time-to-market for new RAN features. Nvidia vRAN (Aerial) and AI technology, combined with the VMware RIC, will foster a new generation of the wireless ecosystem and open up new monetization and innovation opportunities.
VMware wants to thank our colleagues from Nvidia for their contributions to this blog post. Special thanks go to Chris Dick (Head for AI on 5G, Nvidia), Vikram Aditya (Head of 5G Aerial, Nvidia), and Soma Velayutham (GM of Telco/5G Vertical, Nvidia).
[1] L. Bonati, M. Polese, S. D’Oro, S. Basagni, T. Melodia, “Open, Programmable, and Virtualized 5G Networks: State-of-the-Art and the Road Ahead,” Computer Networks vol. 182, December 2020.
[2] S. Niknam, A. Roy, S. Dhillon, S. Singh, R. Banerji, J. H. Reed, N. Saxena, S. Yoon, “Intelligent O-RAN for Beyond 5G and 6G Wireless Networks,” https://arxiv.org/pdf/2005.08374.pdf, May 2020.
[3] Nvidia RAPIDS, https://developer.nvidia.com/rapids