We did some work comparing Spark machine learning performance on an 8-server on-premises cluster vs. a similarly configured VMware Cloud on AWS cluster.
Here is what the VMware Cloud on AWS cluster looked like:
Three standard analytic programs from the Spark machine learning library (MLlib), K-means clustering, Logistic Regression classification, and Random Forest decision trees, were driven using spark-perf. In addition, a new, VMware-developed benchmark, IoT Analytics Benchmark, which models real-time machine learning on Internet-of-Things data streams, was used in the comparison. The benchmark is available from GitHub.
As seen in the charts below, performance was very similar on-premises and on VMware Cloud on AWS.