Your company’s data scientists, machine learning practitioners or developers have asked you to provide them with a GPU-capable machine setup for their work. They want to run workloads that need GPU compute power, and they describe these workloads as machine learning “training”, “inference” or “development”. We will explain what these terms mean later in this series of articles. This opening article gives an overview of the options available to you for providing the required infrastructure on VMware vSphere.
The reason your end-users need GPU capability is simply faster time to results. Machine learning models involve very large matrix multiplications and GPUs are designed to compute these operations much faster than CPUs.
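To give a rough sense of the scale involved, the sketch below counts the arithmetic in a single dense matrix product using the common 2·m·k·n convention (one multiply and one add per term). The function name and the layer sizes are illustrative assumptions, not figures taken from any particular model or workload.

```python
def matmul_flops(m: int, k: int, n: int) -> int:
    """Approximate floating-point operations for an (m x k) @ (k x n) product.

    Each of the m*n output elements is a dot product of length k, counted
    here as k multiplies plus k adds (the usual 2*m*k*n convention).
    """
    return 2 * m * k * n


# Hypothetical sizes: a batch of 64 inputs through one 4096 x 4096 dense layer.
batch, d_in, d_out = 64, 4096, 4096
flops = matmul_flops(batch, d_in, d_out)
print(f"{flops:,} floating-point operations for one layer, one forward pass")
```

A model stacks many such layers and repeats the pass over many batches, and the individual multiply-adds are largely independent of one another, which is the kind of work a GPU's many parallel arithmetic units are built for.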
Your company is probably already using virtual machines on vSphere for developers/testers and data management people, but a key question in your mind now is:
Can GPUs be used in vSphere for applications other than VDI?
The short answer is a resounding ‘yes’. We call this use case “GPU Compute” in vSphere. In its simplest form, VMware vSphere allows your end users to consume GPUs in VMs in the same way they do in any GPU-enabled public cloud instance or on bare metal. In addition, through collaboration with our technology partners, vSphere allows multiple flexible consumption and GPU utilization models that can increase the ROI on ownership of this infrastructure, while providing your end-users exactly what they need.
This article will help you navigate the process of fulfilling that original end-user request. You’ll understand what to ask the end-users and your hardware and software vendors. It also presents the alternatives for your consideration, because different implementations suit different usage scenarios.
What about performance?
Generally, a GPU within a vSphere virtual machine can deliver near bare-metal performance, though the exact performance depends on the technology used. We will touch upon the performance characteristics of each technology in the subsequent parts of this series. For an initial look at some performance numbers, see this post from VMware’s performance engineering team.
Different Methods of GPU Usage with Virtual Machines
One of your earliest decisions as a system administrator is exactly how the GPUs will be used in your environment. As mentioned, there are different ways of consuming GPUs through virtual machines. The approach you choose will largely depend on the types of users and applications that will be making use of the GPUs. The options are shown in Table 1.
Table 1: GPU configurations and their respective use cases
The types of technology that apply to these three different situations are shown in the lower part of Figure 1.
Figure 1: A decision tree for different GPU use cases
As you can see, some use cases are enabled by technology from third-party VMware partners. Each technology has its own pros and cons and offers a different level of flexibility and end-user experience, while building on native vSphere capabilities. VMware is committed to continuing its work with OEMs and hardware and software vendors in the hardware acceleration ecosystem. The goal is to allow customers to extract maximum value from their modern infrastructure while easing its management and consumption.
In the following parts of this series, we will detail the steps required and the technologies available to dedicate one full GPU to a vSphere VM and to share a GPU across multiple VMs. Part 2 of this series, on using DirectPath I/O for GPUs, is here. Part 3 of the series, on installing NVIDIA GRID for GPUs on vSphere, is here.