How the AI Era is reimagining enterprise load balancing for app delivery, resilience, and security
The AI revolution demands a shift from hardware-defined to software- and AI-defined load balancing architectures that deliver enhanced app resiliency, intelligent autoscaling, and self-healing capabilities, powered by predictive, generative, and LLM-based AI intelligence
Just as cloud computing led to the emergence of software-defined (SD) load balancing, the artificial intelligence revolution is now driving the evolution from software-defined to AI-defined architectures. This transformation represents a significant shift in how enterprises approach their infrastructure, both to support modern AI workloads and to bring AI benefits to existing workloads.
AI workloads, including agentic workloads, present significant load balancing challenges. They operate at extreme scale: terabits per second, not the gigabits per second that traditional applications have required. As a result, organizations need load balancers with extraordinary throughput that support scale-out for elastic operations.
“When you build modern AI applications for enterprises, there has to be a very high level of performance, latency, resilience, security and elasticity,” said Chris Wolf, Global Head of AI and Advanced Services, VCF division, Broadcom. “Load balancers in the AI era must be able to manage services and fulfill enterprise requirements across multiple private AI environments.”
Additionally, enterprise AI applications are almost exclusively built on Kubernetes with a microservices architecture. That means organizations need load balancers that can auto-scale, auto-heal and operate "as code," with built-in capabilities including global server load balancing (GSLB), a web application firewall (WAF) and API security.
AI applications exchange vast amounts of sensitive data through APIs, requiring robust protection against attacks and data leakage through comprehensive web app and API security. Thresholding with anomaly detection and traffic pattern recognition should be employed to optimize resource allocation.
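As a sketch of the thresholding idea, the following Python class flags request-rate samples that stray too far from a rolling baseline. It is illustrative only; the class, window size, and deviation factor are assumptions, not a product API.

```python
from collections import deque
import statistics

class AdaptiveThreshold:
    """Rolling-window anomaly detector for a traffic metric.

    Flags samples that deviate more than `k` standard deviations
    from the recent rolling mean. Illustrative sketch; names and
    parameters are assumptions, not any vendor's API.
    """

    def __init__(self, window: int = 60, k: float = 3.0):
        self.samples = deque(maxlen=window)  # recent requests/sec samples
        self.k = k

    def observe(self, rps: float) -> bool:
        """Record a sample; return True if it looks anomalous."""
        anomalous = False
        if len(self.samples) >= 10:  # need a baseline before judging
            mean = statistics.fmean(self.samples)
            stdev = statistics.pstdev(self.samples) or 1e-9  # avoid div-by-zero
            anomalous = abs(rps - mean) > self.k * stdev
        self.samples.append(rps)
        return anomalous
```

Steady traffic keeps the detector quiet; a sudden spike beyond k standard deviations of the recent mean is flagged, which an autoscaling or security policy could then act on.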
AI-defined load balancing
It’s only fitting that load balancing in the AI era employs AI to get the job done, and it does so across three key dimensions.
First, predictive intelligence enables high resiliency by leveraging health score monitoring and dynamic thresholds that adjust in real time to accommodate traffic bursts. In this environment, static thresholds aren't feasible because traffic is too dynamic, and overprovisioning for peak load would be prohibitively expensive. Active-active high availability configurations ensure continuous operation, while auto-scaling coupled with auto-healing recognizes traffic patterns and automatically remediates most issues with little or no admin involvement.
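One step of such an auto-scale and auto-heal control loop can be sketched as below. This is a minimal illustration under assumed health-score and utilization inputs, not how any particular product implements it.

```python
def reconcile(backends, target_util=0.6, min_health=50):
    """One step of a hypothetical auto-scale + auto-heal loop.

    `backends` maps backend name -> {"health": 0-100, "util": 0.0-1.0}.
    Returns a list of (action, detail) tuples. Real controllers weigh
    far more signals; thresholds here are illustrative assumptions.
    """
    actions = []
    healthy = {n: b for n, b in backends.items() if b["health"] >= min_health}

    # Auto-heal: recycle backends whose health score has collapsed.
    for name in backends:
        if name not in healthy:
            actions.append(("restart", name))

    # Dynamic threshold: scale on average utilization of the healthy pool.
    if healthy:
        avg_util = sum(b["util"] for b in healthy.values()) / len(healthy)
        if avg_util > target_util:
            actions.append(("scale_out", 1))
        elif avg_util < target_util / 2 and len(healthy) > 1:
            actions.append(("scale_in", 1))
    return actions
```

In a real system this loop would run continuously against live health scores, so a degraded backend is restarted and the pool is resized before users notice.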
Second, generative AI can dramatically improve operational efficiency by acting as a co-pilot for operations teams. Admins can ask questions in natural language, and the AI tools return answers, analytics and contextual insights drawn from application health scores, application latency measurements, design guides, and knowledge base (KB) documentation. These tools can also surface correlated analytics and multi-factor inference directly within admins' workstreams. Infrastructure-as-code capabilities reduce manual work because configurations can be changed programmatically in software. Capacity management and performance troubleshooting assistance can flag emerging issues long before they affect users, all of which dramatically improves productivity.
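To illustrate the infrastructure-as-code point, a configuration change might be expressed programmatically along these lines. The endpoint path, payload shape, and field names below are hypothetical, not the actual API of any load balancer; consult your product's API reference before use.

```python
import json
import urllib.request

# Desired virtual-service state, kept in version control ("as code").
DESIRED = {
    "name": "ai-inference-vs",   # hypothetical service name
    "port": 443,
    "pool": ["10.0.0.11", "10.0.0.12"],
    "waf_enabled": True,
}

def build_request(api_base: str, token: str, desired: dict = DESIRED):
    """Build an idempotent PUT that declares the desired state.

    Hypothetical REST shape: re-applying the same config converges
    the load balancer to the declared state rather than mutating it.
    """
    return urllib.request.Request(
        f"{api_base}/virtualservices/{desired['name']}",
        data=json.dumps(desired).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )

def apply_config(api_base: str, token: str, desired: dict = DESIRED) -> int:
    """Send the declared state; returns the HTTP status code."""
    with urllib.request.urlopen(build_request(api_base, token, desired)) as resp:
        return resp.status
```

Because the whole configuration lives in a reviewable file and is applied idempotently, changes go through the same pull-request workflow as application code.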
Finally, AI-powered self-service capabilities create load balancing interfaces for DevOps teams that require zero training, because AI can provide intuitive assistance for engineers to follow. The result is faster deployment and configuration without sacrificing quality or security.
A solution that meets all of these AI-era requirements, such as Broadcom's VMware Avi Load Balancer, delivers big dividends. Studies have shown that enterprise IT can achieve 43% OpEx savings, 90% faster app delivery provisioning, and a 27% DevOps productivity boost with the solution. While the principles of software-defined load balancing (scale-out performance, dynamic availability, and application-level security) remain, the AI era dramatically amplifies these requirements while infusing AI intelligence. Organizations that embrace AI-defined load balancing will not only support their AI and non-AI workloads more effectively but will also benefit from the intelligence embedded within their infrastructure.
To learn more about how Broadcom can help your organization bring load balancing into the AI era, visit: https://www.vmware.com/products/cloud-infrastructure/avi-load-balancer