New White Paper: High-Performance Virtualized Spark Clusters on Kubernetes for Deep Learning

By Dave Jaffe, VMware Performance Engineering

A new white paper is available showing the advantages of running virtualized Spark Deep Learning workloads on Kubernetes. Recent versions of Spark include support for Kubernetes. For Spark on Kubernetes, the Kubernetes scheduler provides the cluster manager capability provided by Yet Another Resource Negotiator (YARN) in typical Spark on Read more...

New white paper: Big Data performance on VMware Cloud on AWS: Spark machine learning and IoT analytics performance on-premises and in the cloud

By Dave Jaffe

A new white paper is available comparing Spark machine learning performance on an 8-server on-premises cluster vs. a similarly configured VMware Cloud on AWS cluster. Here is what the VMware Cloud on AWS cluster looked like: Three standard analytic programs from the Spark machine learning library (MLlib), K-means clustering, Logistic Regression classification, Read more...

New White Paper: Fast Virtualized Hadoop and Spark on All-Flash Disks – Best Practices for Optimizing Virtualized Big Data Applications on VMware vSphere 6.5

A new white paper is available showing how to best deploy and configure vSphere 6.5 for Big Data applications such as Hadoop and Spark running on a cluster with fast processors, large memory, and all-flash storage (Non-Volatile Memory Express storage and solid state disks). Hardware, software, and vSphere configuration parameters are documented, as well as Read more...

Introducing TPCx-HS Version 2 – An Industry Standard Benchmark for Apache Spark and Hadoop clusters deployed on premise or in the cloud

Since its release in August 2014, the TPCx-HS Hadoop benchmark has helped drive competition in the Big Data marketplace, generating 23 publications spanning 5 Hadoop distributions, 3 hardware vendors, 2 OS distributions and 1 virtualization platform. By all measures, it has proven to be a successful industry standard benchmark for Hadoop systems. However, the Big Data Read more...