vSGA, or Virtual Shared Graphics Acceleration, allows multiple VMware vSphere virtual machines to share hardware GPUs. In previous blog articles, we advocated the use of NVIDIA GRID vGPU technology, which remains a good solution for many use cases. In this blog, we compare the performance of vGPU technology with that of vSGA, limiting our testing to a workload generated by VMware Horizon 7 VDI desktops. Based on our measurements (we present some of that data in this blog), vSGA provides performance very close to vGPU across a variety of software applications, including Microsoft Office, Adobe Acrobat, CAD viewers, YouTube video, and viewing or working with WebGL-based images.
All modern PC operating systems expect the device on which they are running to have a GPU to provide a reasonably acceptable level of user experience. VDI desktops, which run on servers in the cloud, need to support 3D graphics to provide a good user experience. One approach to support 3D graphics for VMware Horizon VDI desktops is to use the NVIDIA GRID vGPU solution. An alternate approach is to use the VMware vSGA stack, shown in Figure 1, to support 3D graphics for Horizon VDI desktops.
Based on our experiments, we recommend that VDI desktops use VMware vSGA to support 3D graphics for many applications, including (but not limited to):
- Microsoft Office applications
- Adobe Acrobat
- YouTube/instructional/training video playback
- CAD viewers
- WebGL
We still recommend NVIDIA GRID vGPU on vSphere for compute-intensive applications such as 3D modeling/CAD design, high-performance computing (HPC), data science, and AI. See NVIDIA and VMware Enterprise GPU virtualization for details.
In this blog, we compare the quality of user experience in VDI with three different configurations:
- A CPU-only desktop
- VDI using vSGA
- VDI using GRID vGPU
Our results show:
- vSGA offers a significant improvement in user experience over a CPU-only VDI desktop.
- vSGA performance, measured in terms of user experience, is very close to that of GRID vGPU.
Testbed configuration
Figure 2 and Table 1 show our testbed configuration for the experiments we conducted to compare VMware vSGA vs. NVIDIA GRID vGPU solutions.
Parameter | Value/Configuration |
vCPUs | 2 |
Memory | 8 GB |
Disk | 64 GB |
OS | Windows 10 Enterprise |
Applications Installed | Office 2013, Chrome Browser, Adobe Reader |
VDI Protocol | Blast |
VRAM | 96 MB |
vSGA (3D Memory) | 512 MB |
vGPU Profile | M60-1b |
VMware Horizon | Version 7.6 |
VDI desktop resolution | 1600×1200 |
Table 1: VDI Desktop VM Configuration.
Description of experiments
We ran three different tests: two used PowerPoint, and the third used a web page with content from YouTube. We ran every test on all three VDI desktop configurations: CPU only, vSGA, and vGPU.
In the first experiment, we ran all three configurations, and we recorded the contents of the screen (that is, took screenshots) of the VDI desktop and used the screenshots to demonstrate the improvement in user experience, including:
- Improvement in frames per second (FPS)
- Smoothness and focus obtained with a vSGA-enabled VDI desktop compared to a CPU-only VDI desktop
The screenshots were made on the VDI desktop, so the remoting protocol was not used in any way and did not have any impact on the quality of the user experience we recorded.
In a second experiment, we installed VMware Horizon in the VDI desktop. We created a second Windows 10 VM, as shown in Figure 3, on the same server and used this VM as a VMware Horizon 7 client. We connected to the VDI desktop from this Horizon client running in a VM and measured the FPS, smoothness, and amount of distortion in the image for the three tests described above. The goal of this experiment was to quantify the improvement in user experience from using vSGA for VDI and to demonstrate that the user experience obtained using vSGA is very close to that obtained using vGPU.
Finally, in a third experiment, we compared the performance of a VDI desktop using WebGL benchmarks for all three configurations. The results are presented in the next section.
Results
Experiment 1
Figure 4 shows a side-by-side comparison of an animation running in a VDI desktop featuring a CPU only with that of a VDI desktop featuring vSGA technology.
The CPU-only version finishes displaying the animation sooner because it renders fewer frames, which also makes the animation less smooth. The vSGA version displays almost all the frames in the animation, which is the expected behavior.
To quantify this FPS and smoothness advantage due to vSGA, we ran a second test, in which we connected to the VDI desktop from a client in a VM, and we recorded the frames as seen on the Horizon client.
Figure 5 compares the user experience, measured in terms of FPS (the blue bars) and smoothness (orange bars), when a remoting protocol is interposed between the observer and the VDI desktop. The purpose is to show that the improved user experience persists even with the remoting protocol in the path.
To measure FPS, we first converted the screenshots to grayscale images, applied a Laplacian filter (which reduces the amount of data to process while maintaining the structural aspects), and then computed the structural similarity index (SSIM) for every pair of successive screenshots. (Based on our testing, this method of identifying distinct frames provides greater fidelity than simply computing the SSIM of the raw screenshots.) If the SSIM value was less than a certain threshold, we tested the two images to determine if one was a blurred version of the other. If it was not, we counted them as distinct frames. The pair-wise SSIM values for a sequence of screenshots constituted a time series, from which we computed a smoothness metric. The normalized smoothness metric is shown in Figure 5, above.
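The frame-counting steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling we used: the function names, the 3x3 Laplacian kernel, the single-window (global) SSIM, and the default threshold of 0.9 are all hypothetical choices, and the blur check and the smoothness metric are left out because their details are not given here.

```python
from statistics import fmean


def to_gray(rgb_rows):
    """Convert rows of (r, g, b) pixels to luma using ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb_rows]


def laplacian(gray):
    """Apply a 4-neighbour Laplacian kernel to the interior pixels.

    Assumption: this kernel stands in for the filtering step in the text.
    The result is two rows and two columns smaller than the input.
    """
    h, w = len(gray), len(gray[0])
    return [[gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
             - 4 * gray[y][x]
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]


def global_ssim(a, b, dynamic_range=255.0):
    """SSIM over one window covering the whole image.

    Simplification: standard SSIM averages over many local windows.
    """
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = fmean(xs), fmean(ys)
    vx = fmean((x - mx) * (x - mx) for x in xs)
    vy = fmean((y - my) * (y - my) for y in ys)
    cov = fmean((x - mx) * (y - my) for x, y in zip(xs, ys))
    c1 = (0.01 * dynamic_range) ** 2  # stabilising constants from the SSIM definition
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


def count_distinct_frames(frames, threshold=0.9):
    """Return (distinct_frame_count, pairwise_ssim_series).

    A screenshot counts as a new frame when the SSIM between its edge map
    and the previous one falls below the (hypothetical) threshold.
    The first screenshot is counted as one frame by convention.
    """
    edges = [laplacian(to_gray(f)) for f in frames]
    scores = [global_ssim(p, q) for p, q in zip(edges, edges[1:])]
    distinct = (1 + sum(s < threshold for s in scores)) if frames else 0
    return distinct, scores
```

Dividing the distinct-frame count by the capture duration in seconds gives an FPS estimate, and the returned list of pair-wise SSIM values is the time series from which a smoothness metric could be derived.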
Clearly, the CPU-only VDI desktop shows markedly lower smoothness for the animation. The smoothness and FPS obtained using the vSGA stack are very close to those obtained using the vGPU stack, which demonstrates that the user experience with vSGA is close to that with vGPU and significantly better than that of a CPU-only solution.
For a second comparison of user experience, we ran a simple video embedded in a PowerPoint slide. We recorded the screenshots as seen on the VDI desktop with the PowerPoint slide as shown, using a CPU-only solution and a vSGA graphics stack. A side-by-side comparison is shown in Figure 6.
Experiment 2
In a second experiment, we recorded the playback of this video embedded in a PowerPoint slide from a Horizon client. We analyzed the screenshots to compute the FPS for this embedded video playback on the three configurations: CPU-only, vSGA, and vGPU. We also computed a measure of the number of pixels impacted by artifacts for the CPU-only, vSGA, and vGPU configurations. We normalized this artifact measure using the vGPU configuration as the baseline. This data is shown in Figure 7.
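As a rough illustration of how an artifact measure of this kind could be computed and normalized, consider the sketch below. It is an assumption-laden example, not our measurement code: the function names, the idea of comparing each captured frame against a clean reference frame, and the tolerance of 16 gray levels are all hypothetical.

```python
def artifact_pixels(captured, reference, tolerance=16):
    """Count pixels whose gray value deviates from the reference frame
    by more than `tolerance` (a hypothetical cutoff)."""
    return sum(abs(c - r) > tolerance
               for cap_row, ref_row in zip(captured, reference)
               for c, r in zip(cap_row, ref_row))


def normalize_to_baseline(measures, baseline="vGPU"):
    """Divide every configuration's measure by the baseline configuration's,
    so the baseline itself maps to 1.0."""
    base = measures[baseline]
    if base == 0:
        raise ValueError("baseline measure is zero; cannot normalize")
    return {name: value / base for name, value in measures.items()}
```

With this normalization, a value above 1.0 means a configuration shows more artifact-affected pixels than the vGPU configuration, and a value below 1.0 means fewer.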
For a third comparison of user experience, we captured screenshots, shown in Figures 8 and 9, on the VDI desktop while playing a YouTube video. Figure 8 shows a side-by-side comparison of two screenshots: one from a CPU-only configuration, and the other from a vSGA configuration. The figure shows the CPU-only version is so badly blurred that the letters are illegible. The vSGA configuration shows no such artifacts.
Experiment 3
In a third set of experiments, we ran some WebGL benchmarks using all three configurations: CPU-only, vSGA, and vGPU. The data obtained by running these benchmarks is shown in Table 2.
Test / Benchmark | vSGA | CPU-only | vGPU (M60-1b) |
WebGL Aquarium | 40 fps | 4 fps | 60 fps |
WebGL Unity3D | 42,371 | 23,020 | 56,307 |
WebGL Bmark | 1174 | 720 | 2079 |
Table 2: Comparison of WebGL benchmark performance for all three configurations.
From the benchmark results, we can see that vSGA performance approaches vGPU performance (roughly 56% to 75% of the vGPU result, depending on the benchmark) and is significantly higher than the performance of the CPU-only configuration. The gap is widest for the WebGL Aquarium benchmark, where vSGA delivers 40 fps compared to 4 fps for the CPU-only configuration, a tenfold improvement.
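The relative performance implied by Table 2 can be worked out directly from the reported numbers. The short Python sketch below does that arithmetic; the dictionary layout and the function name are illustrative, and it assumes that higher is better for all three benchmarks.

```python
# Values copied from Table 2.
TABLE_2 = {
    "WebGL Aquarium": {"vSGA": 40, "CPU-only": 4, "vGPU": 60},
    "WebGL Unity3D": {"vSGA": 42371, "CPU-only": 23020, "vGPU": 56307},
    "WebGL Bmark": {"vSGA": 1174, "CPU-only": 720, "vGPU": 2079},
}


def relative_performance(results):
    """For each benchmark, express the vSGA result as a fraction of the vGPU
    result and as a multiple of the CPU-only result."""
    return {
        bench: {
            "vsga_vs_vgpu": row["vSGA"] / row["vGPU"],
            "vsga_vs_cpu": row["vSGA"] / row["CPU-only"],
        }
        for bench, row in results.items()
    }


for bench, ratios in relative_performance(TABLE_2).items():
    print(f"{bench}: {ratios['vsga_vs_vgpu']:.0%} of vGPU, "
          f"{ratios['vsga_vs_cpu']:.1f}x CPU-only")
# WebGL Aquarium: 67% of vGPU, 10.0x CPU-only
# WebGL Unity3D: 75% of vGPU, 1.8x CPU-only
# WebGL Bmark: 56% of vGPU, 1.6x CPU-only
```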
Key takeaways
- VDI desktops that run today’s operating systems need GPU support to deliver a reasonably acceptable level of user experience. The VMware vSGA stack offers a user experience that is superior to that of a CPU-only desktop and close to that available using NVIDIA GRID vGPU.
- For a typical VDI environment in which good graphics performance is desirable, including slideshow animation and video stream playback, we recommend enabling vSGA with hardware acceleration.
- Both the vSGA and vGPU stacks support vMotion. However, vGPU only allows vMotion of a VM if the source and destination server have identical GPUs and matching drivers. vSGA has no such limitation; it supports vMotion between different generations of cards or different host driver versions.
- We have implemented a mechanism to measure the FPS, smoothness, and level of distortion in the contents displayed on the desktop. The mechanism is independent of the applications that run in the desktop, requires no access to the VDI desktop, and does not require prior knowledge of the applications being monitored. We compare the measurements made on the VDI client with reference measurements made, for example, by running the same applications on a local laptop. This comparison gives a quantitative measure of the user experience at the VDI client relative to that on the local laptop.
Future work
In the future, we plan to:
- Compare the performance of the three stacks using video playback of instructional/training videos.
- Design and implement a metric to measure how much “out of focus” a rendered image is. We plan to use such a metric to quantify the improvement in performance of vSGA compared to a CPU-only desktop for screenshots like those in Figures 8 and 9.