
Extreme Performance Series: Time Sliced vGPU vs MIG vGPU

The Extreme Performance Series 2022 video blogs cover the highlights of recent performance work on VMware technology.

In this video, Todd Muirhead talks with Lan Vu about how NVIDIA vGPU allows vSphere to share GPUs across multiple VMs using either time-sliced vGPU or Multi-Instance GPU (MIG) vGPU profiles. The differences in performance across a variety of workloads are discussed, with recommendations on how to choose the right profile for a given workload to maximize the benefits of vGPU and MIG vGPU.

3 comments have been added so far

  1. There is one thing I don’t understand about your 3 test cases (light / moderate / heavy load).

    The light scenario is run on a GPU with 5GB of memory, with a batch size of 2.
    The moderate scenario is run on a GPU with 20GB of memory (x4 compared to the light scenario), so the batch size could be x4, right? In practice, machine learning engineers often want to use as much memory as possible with their network architecture.
    Similarly, the heavy scenario is run on a GPU with 40GB of memory (x8 compared to the light scenario), so the batch size could be x8.

    Do you have additional results with these heavier test cases?

  2. Thanks Nicolas for listening to our talk and for your question. These 3 test cases are used to illustrate the cases of small, moderate, and big ML models. Instead of using three totally different workloads with different ML model sizes, we ran the same MaskRCNN workload with different batch sizes to represent light / moderate / heavy loads, so that comparing their GPU resource usage patterns is easier. In real use cases, a light load is when the ML model has fewer neural network layers and a smaller batch size, and a heavy load is when the ML model is very big, with a few dozen layers in its neural network and a bigger batch size. We published a white paper with more results for each test case at this link

    1. I understand. My point was simply that what you call the “heavy” load is not heavy enough. Indeed, since a batch size of 2 fits in a 5GB vGPU, at minimum a batch size of 16 would have fit in the 40GB vGPU.

      And with that batch size of 16, the performance difference between MIG and non-MIG might have been bigger.
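The scaling argument in the comments above can be sketched in a few lines. This is an illustrative model only: it assumes per-sample GPU memory use is roughly constant, which ignores the fixed memory consumed by model weights, the CUDA context, and framework overhead, so real maximum batch sizes would be lower. The numbers come from the thread (a batch of 2 on a 5GB profile).

```python
def max_batch_size(vgpu_memory_gb: float,
                   baseline_memory_gb: float = 5.0,
                   baseline_batch: int = 2) -> int:
    """Scale the batch size linearly with vGPU profile memory.

    A deliberately simple model: if a batch of `baseline_batch` fits in
    `baseline_memory_gb`, assume a proportionally larger batch fits in a
    larger profile. Real limits are lower because weights and framework
    overhead take a fixed share of memory.
    """
    return int(vgpu_memory_gb / baseline_memory_gb * baseline_batch)

# The three profiles discussed in the thread:
for profile_gb in (5, 20, 40):
    print(f"{profile_gb}GB profile -> batch size {max_batch_size(profile_gb)}")
# 5GB -> 2, 20GB -> 8, 40GB -> 16
```

Under this linear assumption, the 40GB profile would indeed accommodate a batch of 16, which is the basis of the commenter's suggestion for a heavier test case.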
