Streamline, Simplify and Protect all your AI workloads with VCF 9.1

5-12-26 Update – VCF 9.1 is now Generally Available!

AI has tremendous potential to transform all enterprises.

IDC Predicts AI Solutions & Services will Generate Global Impact of $22.3 Trillion by 2030¹.

With such massive potential, it’s no surprise that enterprises are eager to leverage AI to boost productivity across every aspect of their business. However, they require a comprehensive strategy to fast-track AI integration across their data center infrastructure.

With VMware Cloud Foundation Private AI Services, Broadcom is on a mission to help enterprises unlock AI and unleash productivity with lower TCO for all enterprises.

Real-World Impact: What Our Customers Are Saying

Enterprises across industries are already deploying VCF Private AI Services and unlocking the cost savings, privacy, and security for their AI workloads:

“By implementing VCF Private AI Services, we have strengthened our intelligent service capabilities,” says Tung-Liang Chen, Vice President, Chunghwa Post. “ Running AI on our own private cloud infrastructure with VCF, allows us to significantly reduce costs and improve real-time automated detection efficiency, while ensuring seamless integration with our existing systems.”

“Analyzing years of news archives in the public cloud is cost-prohibitive, with unpredictable pricing that makes AI projects difficult to plan,” said V V Jacob, Senior General Manager, Systems for Malayala Manorama Co Ltd. “By deploying VCF Private AI Services on our existing VMware Cloud Foundation infrastructure, we will run AI-powered content summarization, heading generation, and editorial assistance directly on our private cloud. We believe this will give us the privacy and security essential for protecting editorial sources while delivering the cost predictability that on-premises private cloud infrastructure provides.”

Announcement

Today we are announcing the next release of VCF Private AI Services with VCF 9.1. With this new release, we are adding several exciting capabilities for enterprises.

New Capabilities Supported now

1. Enable Privacy and Security

Broadcom is helping enterprises build and deploy private and secure AI models with integrated security capabilities provided through VCF Private Services. Let’s examine a new capability that we are releasing to improve privacy and security.

Model Context Protocol (MCP) Support with Governance: With MCP support, we provide enterprises with a secure and standardized method to integrate AI assistants with internal content repositories and external MCP tools from Oracle, Microsoft SQL Server, ServiceNow, GitHub, Slack, PostgreSQL, and more—without building and maintaining custom connectors.

2. Simplify Infrastructure Management

Support for Google Documents: VCF Private AI Services now delivers comprehensive support for Google Workspace—including Google Docs, Sheets, and Slides, without having to export the documents into PDF and upload them to the knowledge base. With the support for Google Workspace in addition to the existing support for Microsoft Word, Microsoft PowerPoint, PDF, CSV, and more, enterprises now have access to an extremely diverse set of document types and achieve high-quality results for their AI workloads.

DirectPath Enablement for GPUs: With this release, VCF Private AI Services now supports DirectPath Enablement for NVIDIA AI infrastructure. This will enable high-performance, exclusive GPU access for a single VM and the VM will be able to fully utilize GPU capabilities. With this new capability, enterprises can deploy AI projects using NVIDIA GPUs in DirectPath mode.
Support for the latest generation NVIDIA Blackwell GPUs:
- VCF now supports the latest NVIDIA Blackwell series GPUs- In addition to our existing support for NVIDIA RTX PRO 6000 Blackwell Server Edition, we are happy to announce that VCF will also support NVIDIA HGX B200 and NVIDIA RTX PRO 4500 Blackwell Server Edition.The support for these latest Blackwell GPUs on VCF defines the next chapter for enterprises in AI with unparalleled performance, efficiency, and scale.
- Future support: In a future release, VCF will support NVIDIA HGX™ B300. VCF on NVIDIA HGX B300 will effortlessly scale enterprises’ highest-performance AI workloads for future-proofing.
Support for NVIDIA HGX Platform with Blackwell GPUs and NVLink Switch: VCF now supports the NVIDIA HGX™ platform with Blackwell GPUs and NVLink Switch. With this capability, enterprises can now get all the benefits of massive-scale AI deployments with VCF Private AI Services and NVIDIA HGX platform. NVIDIA HGX platform brings together the full power of NVIDIA AI infrastructure, including NVIDIA GPUs, NVIDIA NVLink™, NVLink Switch, NVIDIA networking, and fully optimized AI software stacks to provide the highest AI application performance and drive the fastest time to insights for every data center.
High-Speed Networking with Enhanced DirectPath I/O: VCF now supports NVIDIA ConnectX-7 NICs and NVIDIA BlueField-3 with Enhanced DirectPath I/O. With this enhancement, enterprises can leverage advanced capabilities like like NVIDIA GPUDirect RDMA and GPUDirect Storage for high-speed, multi-host AI model training and data transfer, crucial for demanding Gen AI workloads.

3. Streamline Model Deployment

The new capabilities in this category help enterprises with reducing the complexity of taking models to production.

AI Metrics Observability Dashboard

As enterprise AI environments scale, limited visibility into model & agent performance and cost drivers prevent teams from identifying inefficiencies and lead to higher infrastructure spend and suboptimal application performance. To address these challenges, we are releasing the AI Metrics Observability Dashboard that will show important AI metrics. This enhanced visibility into AI metrics will enable data scientists and MLOps to identify bottlenecks, optimize resource allocation, and improve throughput and performance.

Let’s explore some of the AI metrics that we will expose:

–Model metrics – These metrics will help enterprises with monitoring productivity, speed, latency, and more, providing enterprises with detailed insight into models. We are making available metrics like Cache Utilization, Tokens generated per request, Token throughput, Time to first token (TFFT), End-to-end (E2E) request latency and more.

– GPU Utilization metrics – We have also made available GPU metrics like Utilization, Temperature, Power Usage, Memory Temperature, Memory Clock and more.

Note- These AI Metrics dashboards require enterprises to deploy Grafana.

CPU-Based Inferencing: VCF Private AI Services now support CPU based inferencing, in addition to GPU-based deployments, through integration of Model Runtime with Llama.cpp inferencing engine. Powered by Llama.cpp—a leading open-source inference engine with extensive community support—customers will also gain access to a wide range of models with day-zero support from top providers like Google, OpenAI, and more. This enhancement reduces TCO by enabling enterprises to deploy less resource-intensive environments for testing, proof-of-concept initiatives, or AI applications with minimal or no GPU requirements.

Want to know more?

Complete this form to contact us!
Visit VMware.com/AIML for more information.
Connect with us on Twitter at @VMwareVCF and on LinkedIn at VMware VCF.

1- “IDC Blog, “IDC Predicts AI Solutions & Services will Generate Global Impact of $22.3 Trillion by 2030″ (01 April 2025)”.

Discover more from VMware Cloud Foundation (VCF) Blog

Subscribe to get the latest posts sent to your email.

Real-World Impact: What Our Customers Are Saying

Announcement

New Capabilities Supported now

1. Enable Privacy and Security

2. Simplify Infrastructure Management

3. Streamline Model Deployment

Want to know more?

Discover more from VMware Cloud Foundation (VCF) Blog

Related Articles

VCF Breakroom Chats Episode 89 – Unpacking the Latest AI Advancements in the VCF 9.1 Release, Part 2

The Year the Sovereign Cloud Debate Got Specific

How to Upgrade to VMware Cloud Foundation 9.1