VMware Storage Policy Based Management (SPBM) is a storage policy framework that helps administrators match VM workload requirements against storage capabilities. SPBM runs as an independent service in the vCenter Server. We recently released a white paper that covers SPBM performance in two sections.
A new white paper is available showing how to best deploy and configure vSphere 6.5 for Big Data applications such as Hadoop and Spark running on a cluster with fast processors, large memory, and all-flash storage (Non-Volatile Memory Express storage and solid state disks). Hardware, software, and vSphere configuration parameters are documented, as well as tuning parameters for the operating system, Hadoop, and Spark.
The best practices were tested on a 13-server cluster, with Hadoop installed on vSphere as well as on bare metal. Workloads for both Hadoop (TeraSort and TestDFSIO) and Spark Machine Learning Library routines (K-means clustering, Logistic Regression classification, and Random Forest decision trees) were run on the cluster. Configurations with 1, 2, and 4 VMs per host were tested as well as bare metal. Among the virtualized configurations, 4 VMs per host ran fastest due to the best utilization of storage as well as the highest percentage of data transfer within a server. The 4 VMs per host configuration also ran faster than bare metal on all Hadoop and Spark tests but one.
I’m excited to announce that the “Extreme Performance Series” is back for its 5th year, and with 7 additional sessions, it’s our largest year ever! These sessions are created and presented by VMware’s best and most distinguished performance engineers, principals, architects and gurus. You do not want to miss these advanced sessions.
VMmark 3.0, VMware’s multi-host virtualization benchmark, is generally available here. VMmark3 is a free cluster-level benchmark that measures the performance, scalability, and power of virtualization platforms.
VMmark3 reuses much of the technology and design of previous VMmark generations. It continues to use a unique tile-based, heterogeneous-workload application design. It also deploys the platform-level workloads found in VMmark2, such as vMotion, Storage vMotion, and Clone & Deploy. In addition to incorporating new and updated application workloads and infrastructure operations, VMmark3 introduces a new, fully automated provisioning service that greatly reduces deployment complexity and time.
New this year for VMworld 2017 in Las Vegas, we will be offering a pre-VMworld bootcamp focused on vSphere platform performance. Specific SQL and Oracle bootcamps will still be offered, but we have had many requests for a workload-agnostic program. This bootcamp will enable you to confidently support all your virtual workloads and give you an opportunity to interact directly with VMware Performance Engineering.
Since its release in August 2014, the TPCx-HS Hadoop benchmark has helped drive competition in the Big Data marketplace, generating 23 publications spanning 5 Hadoop distributions, 3 hardware vendors, 2 OS distributions and 1 virtualization platform. By all measures, it has proven to be a successful industry-standard benchmark for Hadoop systems. However, the Big Data landscape has rapidly changed over the last 30 months. Key technologies have matured while new ones have risen to prominence in an effort to keep pace with the exponential expansion of datasets. One such technology is Apache Spark.
According to a Big Data survey published by the Taneja Group, more than half of the respondents reported actively using Spark, with a notable increase in usage over the 12 months following the survey. Clearly, Spark is an important component of any Big Data pipeline today. Interestingly, but not surprisingly, there is also a significant trend towards deploying Spark in the cloud. What is driving this adoption of Spark? Predominantly, performance.
Today, with the widespread adoption of Spark and its integration into many commercial Big Data platform offerings, I believe there needs to be a straightforward, industry-standard way in which Spark performance and price/performance can be objectively measured and verified. Just like TPCx-HS Version 1 for Hadoop, the workload needs to be well understood and the metrics easily relatable to the end user.
Continuing the Transaction Processing Performance Council’s commitment to bringing relevant benchmarks to the industry, it is my pleasure to announce TPCx-HS Version 2 for Spark and Hadoop. In keeping with important industry trends, TPCx-HS supports not only traditional on-premises deployments but also cloud deployments.
I envision that TPCx-HS will continue to be a useful benchmark standard for customers as they evaluate Big Data deployments in terms of performance and price/performance, and for vendors in demonstrating the competitiveness of their products.
(Chair, TPCx-HS Benchmark Committee)
Additional Information: TPC Press Release
We have just published a new whitepaper on the performance of Oracle databases on vSphere 6.5 monster virtual machines. We took a look at the performance of the largest virtual machines possible on the previous four generations of four-socket Intel-based servers. The results show how performance of these large virtual machines continues to scale with the increases and improvements in server hardware.
In addition to vSphere 6.5 and the four-socket Intel-based servers used in the testing, an IBM FlashSystem A9000 high-performance all-flash array was used. This array provided extremely low latency, which enabled the database virtual machines to reach the high levels of performance achieved in these tests.
Please read the full paper, Oracle Monster Virtual Machine Performance on VMware vSphere 6.5, for details on hardware, software, test setup, results, and more cool graphs. The paper also covers the performance gain from Hyper-Threading, the performance effect of NUMA, and best practices for Oracle monster virtual machines. These best practices are focused on monster virtual machines, so we also recommend checking out the full Oracle Databases on VMware Best Practices Guide.
Some similar tests with Microsoft SQL Server monster virtual machines were also recently completed on vSphere 6.5 by my colleague David Morse. Please see his blog post and whitepaper for the full details.
This work on Oracle is in some ways a follow-up to Project Capstone from 2015 and the resulting whitepaper, Peeking at the Future with Giant Monster Virtual Machines. That project dealt with monster VM performance from a slightly different angle and might be of interest to readers of this paper and its results.
This week SPEC published a new SPEC Cloud™ IaaS 2016 result for a private cloud configuration built using VMware vSphere 6.5, VMware Integrated OpenStack 3.1 (VIO 3.1), and Dell PowerEdge servers. Working with VMware, Dell has pushed their lead in cloud performance even further. This time, the primary metric produced was a Scalability score of 78.5 @ 72 Application Instances (468 VMs). The Elasticity score was 87.4%.
VMware and Dell are active participants in SPEC and have contributed to the development of its industry-standard benchmarks, including SPEC Cloud IaaS 2016. Both organizations strongly support SPEC’s mission to provide a set of fair and realistic metrics on which to differentiate modern systems and technologies.
Weathervane is a performance benchmarking tool developed at VMware. It lets you assess the performance of your virtualized or cloud environment by driving a load against a realistic application and capturing relevant performance metrics. You might use it to compare the performance characteristics of two different environments, or to understand the performance impact of some change in an existing environment.
Weathervane is very flexible, allowing you to configure almost every aspect of a test, and yet is easy to use thanks to tools that help prepare your test environment and a powerful run harness that automates almost every aspect of your performance tests. You can typically go from a fresh start to running performance tests with a large multi-tier application in a single day.
Weathervane supports a number of advanced capabilities, such as deploying multiple independent application instances, deploying application services in containers, driving variable loads, and allowing run-time configuration changes for measuring elasticity-related performance metrics.
Weathervane has been used extensively within VMware, and is now open source and available on GitHub at https://github.com/vmware/weathervane.
The rest of this blog gives an overview of the primary features of Weathervane.