With Storage I/O Control (SIOC), vSphere 6.5 administrators can adjust the storage performance of VMs so that VMs with critical workloads will get the I/Os per second (IOPS) they need. Admins assign shares (the proportion of IOPS allocated to the VM), limits (the upper bound of VM IOPS), and reservations (the lower bound of VM IOPS) to the VMs whose IOPS need to be controlled. After shares, limits, and reservations have been set, SIOC is automatically triggered to meet the desired policies for the VMs.
A recently published paper shows the performance of SIOC meets expectations and successfully controls the number of IOPS for VM workloads.
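To make the interaction of shares, limits, and reservations more concrete, here is a minimal Python sketch of one possible allocation model. It is an illustration only and not SIOC's actual scheduling algorithm: the `VMPolicy` class, the `allocate_iops` function, and the policy "reservations first, then split the remainder by shares, capped at limits" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMPolicy:
    name: str
    shares: int                   # relative weight; assumed to be > 0
    reservation: float = 0.0      # lower bound, in IOPS
    limit: float = float("inf")   # upper bound, in IOPS


def allocate_iops(policies, capacity):
    """Toy model: honor reservations, then divide what is left by shares.

    `capacity` is the total IOPS the datastore can deliver (hypothetical input).
    """
    total_reserved = sum(p.reservation for p in policies)
    if total_reserved > capacity:
        raise ValueError("reservations exceed datastore capacity")

    alloc = {p.name: p.reservation for p in policies}
    remaining = capacity - total_reserved
    active = [p for p in policies if alloc[p.name] < p.limit]

    # Repeat the proportional split until no VM hits its limit,
    # so IOPS freed by a capped VM go to the others.
    while remaining > 1e-9 and active:
        total_shares = sum(p.shares for p in active)
        handed_out = 0.0
        still_active = []
        for p in active:
            offer = remaining * p.shares / total_shares
            room = p.limit - alloc[p.name]
            grant = min(offer, room)
            alloc[p.name] += grant
            handed_out += grant
            if offer < room:
                still_active.append(p)
        remaining -= handed_out
        if len(still_active) == len(active):
            break  # nobody was capped, everything has been handed out
        active = still_active
    return alloc


if __name__ == "__main__":
    vms = [
        VMPolicy("critical", shares=2000, reservation=1000),
        VMPolicy("normal", shares=1000),
        VMPolicy("capped", shares=1000, limit=500),
    ]
    for name, iops in allocate_iops(vms, capacity=10_000).items():
        print(f"{name}: {iops:.0f} IOPS")
```

In this toy model the capped VM never exceeds its limit, and the IOPS it cannot use are redistributed to the other VMs in proportion to their shares.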
VMware IOInsight is a tool to help people understand a VM’s storage I/O behavior. By understanding their VM’s I/O characteristics, customers can make better decisions about storage capacity planning and performance tuning. IOInsight ships as a virtual appliance that can be deployed in any vSphere environment and includes an intuitive web-based UI that allows users to choose VMDKs to monitor and view results.
Where does IOInsight help?
Customers may better tune and size their storage.
When contacting VMware Support for any vSphere storage issues, including a report from IOInsight can help VMware Support better understand the issues and can potentially lead to faster resolutions.
VMware Engineering can optimize products with a better understanding of various customers’ application behavior.
IOInsight captures I/O traces from ESXi and generates various aggregated metrics that represent the I/O behavior. The IOInsight report contains only these aggregated metrics and no sensitive information about the application itself. In addition to the built-in metrics computed by IOInsight, users can also write new analyzer plugins for IOInsight and visualize the results. A comprehensive SDK and development guide is included in the download bundle.
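As a rough idea of what "aggregated metrics" computed from an I/O trace can look like, the Python sketch below summarizes a list of trace records. The `IORecord` format and the `summarize` function are hypothetical and chosen for illustration; they are not IOInsight's trace format or its plugin SDK.

```python
from collections import Counter
from typing import NamedTuple


class IORecord(NamedTuple):
    is_read: bool   # True for a read, False for a write
    offset: int     # starting byte offset on the virtual disk
    length: int     # I/O size in bytes


def summarize(trace):
    """Reduce a trace to aggregate metrics; no payload data is kept."""
    if not trace:
        raise ValueError("empty trace")
    reads = sum(1 for r in trace if r.is_read)
    # An I/O counts as sequential if it starts where the previous one ended.
    sequential = sum(
        1
        for prev, cur in zip(trace, trace[1:])
        if cur.offset == prev.offset + prev.length
    )
    return {
        "io_count": len(trace),
        "read_fraction": reads / len(trace),
        "size_histogram": dict(Counter(r.length for r in trace)),
        "sequential_fraction": sequential / max(len(trace) - 1, 1),
    }


if __name__ == "__main__":
    sample = [
        IORecord(True, 0, 4096),
        IORecord(True, 4096, 4096),
        IORecord(False, 65536, 8192),
        IORecord(True, 8192, 4096),
    ]
    print(summarize(sample))
```

The output holds only counts and ratios, which is the property that keeps such a report free of application data.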
We compared the I/O performance of vSphere 6.0 U2 over 16Gb and 32Gb Emulex FC HBAs connected via a Brocade G620 FC switch to an EMC VNX7500 storage array.
Iometer, a common microbenchmark, was used to generate the workload for various block sizes. For single-VM experiments, we measured sequential read and sequential write throughput. For multi-VM experiments, we measured random read IOPS and throughput.
Our experiments showed that vSphere 6 can achieve near line rate with 32Gb FC.
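For readers who want to relate IOPS numbers to line rate, the short Python sketch below converts IOPS and block size into throughput and compares it with the link's nominal rate. The nominal per-direction figures used here (1600 MB/s for 16Gb FC and 3200 MB/s for 32Gb FC) are commonly quoted values and an assumption of this example, not measurements from our experiments; the function names are also ours.

```python
# Nominal per-direction data rates in MB/s (assumed, commonly quoted values).
NOMINAL_MB_PER_S = {"16Gb FC": 1600, "32Gb FC": 3200}


def throughput_mb_per_s(iops, block_size_kib):
    """Throughput in MB/s (1 MB = 10**6 bytes) for a given IOPS and block size."""
    return iops * block_size_kib * 1024 / 1e6


def line_rate_fraction(iops, block_size_kib, link):
    """Fraction of the link's nominal rate that the workload consumes."""
    return throughput_mb_per_s(iops, block_size_kib) / NOMINAL_MB_PER_S[link]


def iops_to_saturate(block_size_kib, link):
    """IOPS needed at this block size to reach the nominal rate."""
    return NOMINAL_MB_PER_S[link] * 1e6 / (block_size_kib * 1024)


if __name__ == "__main__":
    for size_kib in (4, 64, 256):
        needed = iops_to_saturate(size_kib, "32Gb FC")
        print(f"{size_kib} KiB blocks: {needed:,.0f} IOPS to fill 32Gb FC")
```

The calculation shows why larger block sizes approach line rate with far fewer IOPS than small ones.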
Virtual SAN is a VMware storage solution that is tightly integrated with vSphere—making storage setup and maintenance in a vSphere virtualized environment fast and flexible. Virtual SAN 6.2 adds several features and improvements, including additional data integrity with software checksum, space efficiency features of RAID-5 and RAID-6, deduplication and compression, and an in-memory client read cache.
We ran several tests to compare the performance of Virtual SAN 6.1 and 6.2 to make sure they were on par with each other.
VMware vSphere Fault Tolerance (FT) provides continuous availability to virtual machines that require a high amount of uptime. If the virtual machine fails, another virtual machine is ready to take over the job. vSphere achieves FT by maintaining primary and secondary virtual machines using a new technology named Fast Checkpointing. This technology is similar to Storage vMotion, which copies the virtual machine state (storage, memory, and networking) to the secondary ESXi host. Fast Checkpointing keeps the primary and secondary virtual machines in sync.
Previous performance studies have shown that virtualized servers can run a variety of applications at performance near, or in some cases even above, that of software running natively (on bare metal). In a new white paper, we raise the bar higher and test “monster” vSphere virtual machines loaded with CPU and running the most taxing databases and transaction processing applications.
The benchmark workload, which we call Order-Entry, is based on an industry-standard online transaction processing (OLTP) benchmark called TPC-C. Both rigorous and demanding, the Order-Entry workload pushes virtual machine performance.
Note: The Order-Entry benchmark is derived from the TPC-C workload, but is not compliant with the TPC-C specification, and its results are not comparable to TPC-C results.
The white paper quantifies the:
Performance differential between ESXi 6.0 and native
Performance differential between ESXi 6.0 and ESXi 5.1
Performance gains due to enhancements built into ESXi 6.0
vSphere APIs for I/O Filtering (VAIO) is a framework that enables third-party software developers to implement data services, such as caching and replication, for vSphere. Figure 1 below shows the general architecture of VAIO. Once I/O filter libraries are installed on a virtual disk (VMDK), every I/O request generated from the guest operating system to the VMDK will first be intercepted by the VAIO framework at the file device layer. The VAIO framework then hands over the I/O request to the user space I/O filter libraries, where a series of third-party data service operations can be performed against the I/O. After processing the I/O, user space I/O filter libraries return the I/O back to the VAIO framework, which continues the rest of the issuing path. Similarly, upon completion, the I/O will first be processed by the user space I/O filter libraries before continuing its original completion path.
There have been questions around the overhead of the VAIO framework due to its extra user-to-kernel communication. In this blog post, we evaluate the performance of vSphere APIs for I/O Filtering using a null I/O filter and demonstrate how VAIO scales with respect to the number of virtual machines and outstanding I/Os (OIOs). The null I/O filter accepts each I/O request and immediately returns it.
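The issue and completion paths described above can be modeled in a few lines of Python. This is a conceptual sketch only: the class and method names (`IORequest`, `NullFilter`, `FilterFramework`, `on_issue`, `on_complete`) are invented for illustration and are not part of the VAIO SDK, and running filters in reverse order on completion is an assumption of the model.

```python
from dataclasses import dataclass, field


@dataclass
class IORequest:
    offset: int
    data: bytes
    log: list = field(default_factory=list)  # records the path taken


class NullFilter:
    """Accepts each I/O and immediately returns it unchanged."""
    name = "null"

    def on_issue(self, io):
        io.log.append(f"{self.name}:issue")
        return io

    def on_complete(self, io):
        io.log.append(f"{self.name}:complete")
        return io


class FilterFramework:
    """Stands in for the framework that intercepts I/O and calls the filters."""

    def __init__(self, filters, backend):
        self.filters = list(filters)
        self.backend = backend  # callable modeling the rest of the I/O path

    def submit(self, io):
        for f in self.filters:              # issue path
            io = f.on_issue(io)
        io = self.backend(io)               # rest of the issuing path
        for f in reversed(self.filters):    # completion path (order assumed)
            io = f.on_complete(io)
        return io


def fake_backend(io):
    io.log.append("backend")
    return io


if __name__ == "__main__":
    framework = FilterFramework([NullFilter()], fake_backend)
    done = framework.submit(IORequest(offset=0, data=b"x" * 512))
    print(done.log)
```

Because the null filter does no work of its own, any cost measured with it in place comes from the extra hand-offs between the framework and the filter, which is what the evaluation aims to isolate.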
The networking stack of vSphere is, by default, tuned to balance the tradeoffs between CPU cost and latency to provide good performance across a wide variety of applications. However, there are some cases where using a tunable provides better performance. An example is Web-farm workloads, or any circumstance where a high consolidation ratio (lots of VMs on a single ESXi host) is preferred over extremely low end-to-end latency. VMware vSphere 6.0 introduces the Dynamic Host-Wide Performance Tuning feature (also known as dense mode), which provides a single configuration option to dynamically optimize individual ESXi hosts for high consolidation scenarios under certain use cases. Later in this blog, we define those use cases. Right now, we take a look at how dense mode works from an internal viewpoint.
VMware Virtual SAN 6.1 introduced the concept of a stretched cluster, which allows the Virtual SAN customer to configure two geographically separate sites while synchronously replicating data between them. A technical white paper about the Virtual SAN stretched cluster performance has now been published. This paper provides guidelines on how to get the best performance for applications deployed on a Virtual SAN stretched cluster environment.
The chart below, borrowed from the white paper, compares the performance of the Virtual SAN 6.1 stretched cluster deployment against a regular Virtual SAN cluster without any fault domains. A nine-node Virtual SAN stretched cluster is tested with two different inter-site latency configurations: 1ms and 5ms. The DVD Store benchmark runs on four virtual machines on each host of the nine-node stretched cluster. The DVD Store performance metrics (cumulative orders per minute in the cluster, read/write IOPS, and average latency) are compared with a similar workload on the regular Virtual SAN cluster. Orders per minute (OPM) is lower by 3% and 6% for the 1ms and 5ms inter-site latency stretched clusters, respectively, compared to the regular Virtual SAN cluster.
Figure 1a. DVD Store orders per minute in the cluster and guest IOPS comparison
Guest read/write IOPS and latency were also monitored. The read/write mix for the DVD Store workload is roughly 1/3 read and 2/3 write. Write latency clearly increases as inter-site latency rises, while read latency is only marginally affected. As a result, the average latency increases from 2.4ms to 2.7ms and 5.1ms for the 1ms and 5ms inter-site latency configurations, respectively.
Figure 1b. DVD Store latency comparison
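To put the latency figures next to the OPM figures, the relative change in average latency can be worked out from the numbers quoted above. The short Python sketch below does that arithmetic; it assumes that 2.4ms is the average latency of the regular Virtual SAN cluster, and the function name is ours.

```python
def relative_increase(baseline, value):
    """Fractional increase of `value` over `baseline`."""
    return (value - baseline) / baseline


BASELINE_MS = 2.4  # assumed to be the regular (non-stretched) cluster

if __name__ == "__main__":
    for inter_site_ms, avg_ms in [(1, 2.7), (5, 5.1)]:
        pct = relative_increase(BASELINE_MS, avg_ms) * 100
        print(f"{inter_site_ms}ms inter-site latency: "
              f"average latency {avg_ms}ms, {pct:.1f}% above baseline")
```

Under that assumption, average latency rises by about 12.5% and 112.5% for the two configurations, while OPM drops by only 3% and 6%.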
These results demonstrate that the inter-site latency in a Virtual SAN stretched cluster deployment has a marginal performance impact on a commercial workload like DVD Store. More results are available in the white paper.
A technical white paper about Virtual SAN performance has been published. This paper provides guidelines on how to get the best performance for applications deployed on a Virtual SAN cluster.
We used Iometer to generate several workloads that simulate the I/O patterns encountered in Virtual SAN production environments. These are shown in the following table.
Type of I/O workload | Size (1KiB = 1024 bytes) | Mix ratio | Shows / Simulates
(missing) | (missing) | | Maximum random read IOPS that a storage solution can deliver
(missing) | (missing) | 70% / 30% | Typical commercial applications deployed in a VSAN cluster
(missing) | (missing) | | Video streaming from storage
(missing) | (missing) | | Copying bulk data to storage
Sequential Mixed R/W | (missing) | 70% / 30% | Simultaneous read/write copy from/to storage
In addition to these workloads, we studied Virtual SAN caching tier designs and the effect of Virtual SAN configuration parameters on the Virtual SAN test bed.
Virtual SAN 6.0 can be configured in two ways: Hybrid and All-Flash. Hybrid uses a combination of hard disks (HDDs) to provide storage and a flash tier (SSDs) to provide caching. The All-Flash solution uses all SSDs for storage and caching.
Tests show that the Hybrid Virtual SAN cluster performs extremely well when the working set is fully cached for random access workloads, and also for all sequential access workloads. The All-Flash Virtual SAN cluster, which performs well for random access workloads with large working sets, may be deployed in cases where the working set is too large to fit in a cache. All workloads scale linearly in both types of Virtual SAN clusters—as more hosts and more disk groups per host are added, Virtual SAN sees a corresponding increase in its ability to handle larger workloads. Virtual SAN offers an excellent way to scale up the cluster as performance requirements increase.