Home > Blogs > VMware VROOM! Blog > Tag Archives: benchmarks

Tag Archives: benchmarks

New Scheduler Option for vSphere 6.7 U2

Along with the recent release of VMware vSphere 6.7 U2, we published a new whitepaper that shows the performance of a new scheduler option that was included in the 6.7 U2 update.  We referred to this new scheduler option internally as the “sibling” scheduler, but the official name is the side-channel aware scheduler version 2, or SCAv2.  The whitepaper includes full details about SCAv1 and SCAv2, the L1TF security vulnerability that made them necessary, and the performance implications with several different workload types.  This blog is a brief overview of the key points, but we recommend that you check out the full document.

In August of 2018, a security vulnerability known as L1TF, affecting systems using Intel processors, was revealed, and patches and remediations were also made available. Intel provided micro-code updates for its processors, operating system patches were made available, and VMware provided an update for vSphere. The full details of the vCenter and ESXi patches are in a VMware security advisory that links to individual KB articles.

Continue reading

IoT Analytics Benchmark adds neural network–based deep learning with Keras and BigDL

The IoT Analytics Benchmark released last year dealt with an important Internet of Things use case—monitoring factory sensor data for impending failure conditions. This year, we are tackling an equally important use case—image classification. Whether used in facial recognition, license plate readers, inspection systems, or autonomous vehicles, neural network–based deep learning is making image detection and classification a viable technology.

As in the classic machine learning used in the original IoT Analytics Benchmark code (which used the Spark Machine Learning Library), the new deep learning code first trains a model using pre-labeled images and then deploys that model to infer the classification of new images. For IoT this inference step is the most important. Thus, the new programs, designated as IoT Analytics Benchmark DL, use previously trained models (included in the kit) to demonstrate inferencing that can be performed at the edge (on small gateway systems) or in scaled-out Spark clusters.

Continue reading

Introducing TPCx-HS Version 2 – An Industry Standard Benchmark for Apache Spark and Hadoop clusters deployed on premise or in the cloud

Since its release on August 2014, the TPCx-HS Hadoop benchmark has helped drive competition in the Big Data marketplace, generating 23 publications spanning 5 Hadoop distributions, 3 hardware vendors, 2 OS distributions and 1 virtualization platform. By all measures, it has proven to be a successful industry standard benchmark for Hadoop systems. However, the Big Data landscape has rapidly changed over the last 30 months. Key technologies have matured while new ones have risen to prominence in an effort to keep pace with the exponential expansion of datasets. One such technology is Apache Spark.

spark-logo-trademarkAccording to a Big Data survey published by the Taneja Group, more than half of the respondents reported actively using Spark, with a notable increase in usage over the 12 months following the survey. Clearly, Spark is an important component of any Big Data pipeline today. Interestingly, but not surprisingly, there is also a significant trend towards deploying Spark in the cloud. What is driving this adoption of Spark? Predominantly, performance.

Today, with the widespread adoption of Spark and its integration into many commercial Big Data platform offerings, I believe there needs to be a straightforward, industry standard way in which Spark performance and price/performance could be objectively measured and verified. Just like TPCx-HS Version 1 for Hadoop, the workload needs to be well understood and the metrics easily relatable to the end user.

Continuing on the Transaction Processing Performance Council’s commitment to bringing relevant benchmarks to the industry, it is my pleasure to announce TPCx-HS Version 2 for Spark and Hadoop. In keeping with important industry trends, not only does TPCx-HS support traditional on premise deployments, but also cloud.

I envision that TPCx-HS will continue to be a useful benchmark standard for customers as they evaluate Big Data deployments in terms of performance and price/performance, and for vendors in demonstrating the competitiveness of their products.

 

Tariq Magdon-Ismail

(Chair, TPCx-HS Benchmark Committee)

 

Additional Information:  TPC Press Release

Weathervane, a benchmarking tool for virtualized infrastructure and the cloud, is now open source.

Weathervane is a performance benchmarking tool developed at VMware.  It lets you assess the performance of your virtualized or cloud environment by driving a load against a realistic application and capturing relevant performance metrics.  You might use it to compare the performance characteristics of two different environments, or to understand the performance impact of some change in an existing environment.

Weathervane is very flexible, allowing you to configure almost every aspect of a test, and yet is easy to use thanks to tools that help prepare your test environment and a powerful run harness that automates almost every aspect of your performance tests.  You can typically go from a fresh start to running performance tests with a large multi-tier application in a single day.

Weathervane supports a number of advanced capabilities, such as deploying multiple independent application instances, deploying application services in containers, driving variable loads, and allowing run-time configuration changes for measuring elasticity-related performance metrics.

Weathervane has been used extensively within VMware, and is now open source and available on GitHub at https://github.com/vmware/weathervane.

The rest of this blog gives an overview of the primary features of Weathervane.

Continue reading

Machine Learning on vSphere 6 with Nvidia GPUs – Episode 2

by Hari Sivaraman, Uday Kurkure, and Lan Vu

In a previous blog [1], we looked at how machine learning workloads (MNIST and CIFAR-10) using TensorFlow running in vSphere 6 VMs in an NVIDIA GRID configuration reduced the training time from hours to minutes when compared to the same system running no virtual GPUs.

Here, we extend our study to multiple workloads—3D CAD and machine learning—run at the same time vs. run independently on a same vSphere server.

This is episode 2 of a series of blogs on machine learning with vSphere. Also see:

Continue reading

New White Paper: Best Practices for Optimizing Big Data Performance on vSphere 6

A new white paper is available showing how to best deploy and configure vSphere for Big Data applications such as Hadoop and Spark. Hardware, software, and vSphere configuration parameters are documented, as well as tuning parameters for the operating system, Hadoop, and Spark.

The best practices were tested on a Dell 12-server cluster, with Hadoop installed on vSphere as well as on bare metal. Workloads for both Hadoop (TeraSort and TestDFSIO) and Spark (Support Vector Machines and Logistic Regression) were run on the cluster. The virtualized cluster outperformed the bare metal cluster by 5-10% for all MapReduce and Spark workloads with the exception of one Spark workload, which ran at parity. All workloads showed excellent scaling from 5 to 10 worker servers and from smaller to larger dataset sizes.

Continue reading