Performance
Boost Throughput by Scaling VMs while Keeping GPUs to a Minimum
This article describes performance tests we conducted to explore the advantages of virtualizing NVIDIA GPUs with VMware vSphere for generative AI workloads. We tested the LLAMA2-7b and LLAMA2-13b models (7 billion and 13 billion parameters, respectively) using various virtual GPU (vGPU) and Multi-Instance GPU (MIG) configurations on vSphere 8.0 U3 with NVIDIA vGPU driver v17.