Maximizing Deep Learning Inference Performance on Intel Architecture in VMware vSphere with Intel Distribution of OpenVINO Toolkit
In collaboration with Dell, Intel, and dividiti, we recently submitted benchmarking results to the MLPerf™ Inference v0.7 round, measured on the VMware vSphere® 7.0 platform running on Dell® PowerEdge® servers powered by 2nd Generation Intel® Xeon® Scalable processors. In this blog, we summarize those results and share best practices for deploying and scaling machine learning/deep learning (ML/DL) inference workloads on vSphere. We also show that CPU-based inference on vSphere incurs only a small virtualization overhead relative to the same workload running on comparable non-virtualized servers. In fact, some of our MLPerf experiments achieved higher inference throughput on vSphere than on bare metal, thanks to the oversubscription of CPU cores.