Authored by Paul Turner, Vice President, Product Management at VMware
Few companies have the time (or, in today’s economic environment, the money) to invest in modernization that won’t bring quick results. In short, companies are squeezed: they must modernize, and they need rapid returns. They can’t afford to get tangled up in the notorious complexity of setting up Kubernetes or artificial intelligence (AI) and machine learning (ML) environments.
Application teams need to move quickly, but traditional IT often requires ticket-based requests and can’t deliver infrastructure at the required pace. Teams also spend a significant amount of time on infrastructure operations related to networking and storage, which takes time away from development. The result is an ever-increasing time to market for new applications, and modernization ends up in a hammerlock.
The Democratization of AI
Businesses require access to an automated platform that makes workloads simpler. The environment should work with AI, ML, and deep-learning applications for performing rich analytics. It has to be agile and should deliver the infrastructure services that AI training and inferencing workloads require.
With vSphere 7, VMware redefined vSphere to use Kubernetes at its core, so that infrastructure can be provisioned from a desired-state specification of an application’s needs. This is particularly valuable for AI workloads, where specific GPU accelerators, servers and storage resources must be provisioned for training or inferencing. With vSphere 7, the dynamic agility of the infrastructure removes this complexity: it democratizes AI.
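To make the idea of a desired-state specification concrete, here is a minimal Python sketch that builds a Kubernetes Pod manifest declaring the GPU, CPU and memory a training job wants. It is an illustration under stated assumptions, not VMware's or NVIDIA's tooling: the helper name `build_training_pod_spec`, the pod name and the image URL are hypothetical, and `nvidia.com/gpu` is the resource name commonly exposed by NVIDIA's Kubernetes device plugin, which a given vSphere environment may or may not use.

```python
import json


def build_training_pod_spec(name, image, gpus=1, cpu="8", memory="32Gi"):
    """Return a Kubernetes Pod manifest (as a dict) describing desired resources.

    The manifest only states what the workload needs; deciding where and how
    to satisfy it is left to the platform that reconciles the specification.
    """
    if gpus < 1:
        raise ValueError("a GPU training pod needs at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",
                    "image": image,  # hypothetical image reference
                    "resources": {
                        "limits": {
                            # Assumed resource name (NVIDIA device plugin convention).
                            "nvidia.com/gpu": gpus,
                            "cpu": cpu,
                            "memory": memory,
                        }
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    spec = build_training_pod_spec(
        "train-demo", "registry.example.com/ai/trainer:latest"
    )
    # Print the manifest as JSON, which Kubernetes tooling accepts alongside YAML.
    print(json.dumps(spec, indent=2))
```

The point of the sketch is the shape of the request: the data scientist states the resources the job needs, and the platform is responsible for finding hardware that satisfies them.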
VMware and NVIDIA have extended their partnership to deliver a powerful AI-Ready Enterprise platform. The newly available NVIDIA AI Enterprise software suite (read the announcement) is an end-to-end, cloud-native suite of AI tools and frameworks, optimized and certified by NVIDIA to run exclusively on vSphere. Together, VMware and NVIDIA deliver a turnkey solution to rapidly deploy, manage and scale AI workloads, facilitating the adoption of AI/ML in the enterprise. Data scientists can select certified, container-based enterprise AI models from the NVIDIA AI Enterprise catalog and, when they deploy a model with Kubernetes, vSphere provisions the resources it requires (GPUs, compute, memory, storage). To further ease on-premises deployments, leading system manufacturers are offering NVIDIA-Certified Systems optimized for AI workloads on vSphere with NVIDIA AI Enterprise.
Another key aspect of democratizing AI is enabling better shared use of accelerated computing hardware and increasing its utilization, which is the power of virtualization. At the ground level, that means AI is no longer a special case within IT. Historically, an administrator had to build a unique silo of hardware for the data scientist, a time suck for everyone involved. With vSphere, AI is now part of a managed setup within IT.
Managing private silos of hardware will no longer burn up data scientists’ valuable (and expensive) time as they create their trained AI/ML models. Data scientists can be far more productive if they don’t have to manage their machines or install and configure the required software on them. Instead, everything just works.
You also don’t have to settle for a least-common-denominator feature set for the GPUs in your virtualized infrastructure. NVIDIA and VMware have worked closely to bring all the value of AI-tuned hardware (such as the NVIDIA A100 and A30 GPUs) to virtualization. vSphere 7 uses vSphere Distributed Resource Scheduler to automatically place workloads on GPU-enhanced servers across the cluster. This placement optimizes resource consumption and avoids performance bottlenecks.
vSphere now also supports NVIDIA Multi-Instance GPU (MIG) technology, which partitions a GPU to further increase utilization while strictly isolating the VMs that share it from each other on the GPU hardware. MIG is a new backing mechanism for the familiar NVIDIA vGPU profiles that you are used to working with on vSphere. It lets you deliver a defined quality-of-service level to the VM that consumes a given share of the GPU. This can be particularly useful for inference in AI/ML, as well as for some more compact training jobs. vSphere can also live-migrate vGPU-powered VMs, which is particularly useful for infrastructure maintenance and non-disruptive operations during upgrades.
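As a rough illustration of how MIG partitioning bounds what can share one GPU, the following Python sketch checks whether a set of requested MIG profiles fits on a single A100 40GB. It rests on several assumptions: the profile names and slice counts follow NVIDIA's published MIG profile table for the A100 40GB as we understand it (seven compute slices, eight memory slices), the function name `fits_on_one_gpu` is hypothetical, and the check is deliberately simplified. Real MIG placement has additional rules about where instances may sit on the GPU, and the vGPU profile names used on vSphere are not modeled here.

```python
# Assumed slice usage per MIG profile on an A100 40GB:
# profile name -> (compute slices, memory slices).
A100_40GB_PROFILES = {
    "1g.5gb": (1, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (3, 4),
    "4g.20gb": (4, 4),
    "7g.40gb": (7, 8),
}

TOTAL_COMPUTE_SLICES = 7
TOTAL_MEMORY_SLICES = 8


def fits_on_one_gpu(requested_profiles):
    """Return True if the requested profiles fit within one GPU's slices.

    This is a capacity check only: it sums compute and memory slices and
    ignores the hardware's placement constraints.
    """
    compute_used = 0
    memory_used = 0
    for profile in requested_profiles:
        if profile not in A100_40GB_PROFILES:
            raise ValueError(f"unknown MIG profile: {profile}")
        compute, memory = A100_40GB_PROFILES[profile]
        compute_used += compute
        memory_used += memory
    return (
        compute_used <= TOTAL_COMPUTE_SLICES
        and memory_used <= TOTAL_MEMORY_SLICES
    )


if __name__ == "__main__":
    # Seven small inference VMs sharing one GPU.
    print(fits_on_one_gpu(["1g.5gb"] * 7))
    # Two medium instances plus one small one exceed the memory slices.
    print(fits_on_one_gpu(["3g.20gb", "3g.20gb", "1g.5gb"]))
```

Because each instance owns its slices outright, a check like this is what makes a per-VM quality-of-service level possible: a VM's share of the GPU is fixed by its profile rather than by what its neighbors are doing.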
Democratizing AI and ML gives everyone a future-ready opportunity. One use case is manufacturing, where companies take advantage of data gathered from cameras. By deploying an ML model that processes images from cameras in their factories, these customers can address many of their operational challenges, supporting worker safety and using predictive maintenance to boost uptime.
Delivering an AI-Ready Enterprise Platform to Advance Digital Business
vSphere 7 is helping to mainstream AI in the enterprise. For businesses, AI and ML are now a real possibility, right away. The AI-Ready Enterprise platform from VMware and NVIDIA addresses the complexity associated with AI and ML initiatives, giving businesses the confidence to modernize their infrastructure for AI and to use AI to transform their business.
You can learn more about AI/ML at the upcoming VMworld 2021 (Oct. 5-7, 2021) and register for a free general pass at: http://www.vmworld.com. Featured AI/ML sessions at VMworld will include:
- Making AI Possible for Every Business [MCL3037S]
- Architect the Enterprise Data Center for AI with VMware and NVIDIA [VI1501]
- AI-Ready Enterprise Platform on vSphere with Tanzu [MCL2373]
- Expand the Impact of AI in Financial Services [VI2078]
- Virtualized AI and HPC in Action at the University of Pisa [VI2263]
- The Present and Future of Enterprise AI [VI3068S]
Learn More
- NVIDIA Product Page: NVIDIA AI Enterprise
- Joint Solution Brief: AI-Ready Enterprise Platform
- Alliance Page: VMware and NVIDIA