Poor storage performance is generally the result of high I/O latency, but what causes that high latency, and how do you address it? There are a lot of things that can cause poor storage performance:
– Undersized storage arrays/devices unable to provide the needed performance
– I/O Stack Queue congestion
– I/O Bandwidth saturation, Link/Pipe Saturation
– Host CPU Saturation
– Guest Level Driver and Queuing Interactions
– Incorrectly Tuned Applications
– Undersized storage arrays (Did I say that twice!)
As I mentioned in the previous post, the key storage performance indicators to look out for are 1) high device latency (DAVG consistently greater than 20 to 30 ms) and 2) high kernel latency (KAVG greater than 2 ms). Once you have identified that you have high latency, you can proceed to understanding why the latency is high and what is causing the poor storage performance. In this post, we will look at the top reason for high device latency.
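If you collect esxtop data in batch mode (esxtop -b), a small script can flag the samples that exceed those thresholds. The sketch below is a minimal Python example; the column-name substrings it searches for are assumptions on my part, so check them against the headers in your own export before relying on it.

```python
# Minimal sketch: scan an esxtop batch-mode CSV export and flag columns whose
# device latency (DAVG) or kernel latency (KAVG) exceeds the guidance
# thresholds from this series (~20-30 ms DAVG, ~2 ms KAVG).
# The header substrings below are assumptions -- verify them against the
# actual column names in your esxtop export, which vary by version.
import csv
import sys

DAVG_LIMIT_MS = 25.0   # middle of the 20-30 ms device-latency guidance
KAVG_LIMIT_MS = 2.0    # kernel-latency guidance

def flag_high_latency(csv_path):
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Assumed header substrings -- adjust to match your export.
        davg_cols = [i for i, h in enumerate(header) if "Average Device MilliSec/Command" in h]
        kavg_cols = [i for i, h in enumerate(header) if "Average Kernel MilliSec/Command" in h]
        for row in reader:
            for i in davg_cols:
                if row[i] and float(row[i]) > DAVG_LIMIT_MS:
                    print(f"High DAVG {row[i]} ms in: {header[i]}")
            for i in kavg_cols:
                if row[i] and float(row[i]) > KAVG_LIMIT_MS:
                    print(f"High KAVG {row[i]} ms in: {header[i]}")

if __name__ == "__main__":
    flag_high_latency(sys.argv[1])
```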
The top reason for high device latency is simply not having enough storage hardware to meet your application's needs (yes, I have now said it a third time); that is a surefire way to have storage performance issues. It may seem basic, but too often administrators size their storage only on the capacity they need to support their environment and not on the performance (IOPS, latency, throughput) they need. When sizing your environment, you really should consult your application and storage vendors' best practices and sizing guidelines to understand what storage performance your application will need and what your storage hardware can deliver.
How you configure your storage hardware (the type of drives you use, the RAID configuration, the number of disk spindles in the array, and so on) will affect the maximum storage performance your hardware is able to deliver. Your storage vendor can provide the most accurate model and advice for the particular storage product you own, but if you need some rough guidance, you can use the chart below.
The slide shows the general IOPS and read/write throughput you can expect per spindle depending on the RAID configuration and drive type in your array. I'm also frequently asked what the typical I/O profile for a VM is. The answer varies greatly depending on the applications running in your environment, but a "typical" VM I/O workload would roughly be an 8 KB I/O size, 80% random, 80% read. Storage-intensive applications like databases, mail servers, and media streaming have their own I/O profiles that may differ greatly from this "typical" profile.
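To turn per-spindle numbers like those in the chart into a rough sizing estimate, you can account for the RAID write penalty and divide. The sketch below is a back-of-the-envelope calculation only; the write-penalty values and the 175 IOPS-per-spindle figure are illustrative assumptions, so use your storage vendor's numbers for any real sizing exercise.

```python
# Rough sizing sketch: translate a front-end IOPS requirement into an
# approximate spindle count, using the RAID write penalty (RAID 5 ~4 back-end
# I/Os per write, RAID 10 ~2) and a per-spindle IOPS figure from the chart.
# All numbers below are illustrative assumptions -- use your vendor's figures.
import math

def spindles_needed(front_end_iops, read_pct, write_penalty, iops_per_spindle):
    """Return the approximate number of spindles for a given workload."""
    reads = front_end_iops * read_pct
    writes = front_end_iops * (1.0 - read_pct)
    back_end_iops = reads + writes * write_penalty
    return math.ceil(back_end_iops / iops_per_spindle)

# Example: 5,000 front-end IOPS with the "typical" VM profile (80% read),
# on RAID 5 (write penalty ~4) at ~175 IOPS per 15K spindle (assumed).
print(spindles_needed(5000, 0.80, 4, 175))   # roughly 46 spindles
```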
One good way to make sure your storage can handle the demands of your datacenter is to benchmark it. There are several free and open source tools, like IOmeter, that can be used to stress test and benchmark your storage. If you haven't already looked at the I/O Analyzer tool delivered as a VMware Fling, you might want to take a peek at it. I/O Analyzer is a virtual appliance that provides a simple and standardized approach to storage performance analysis in VMware vSphere virtualized environments (http://labs.vmware.com/flings/io-analyzer).
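If you just want a quick sanity check before running a full IOmeter or I/O Analyzer pass, even a small script that issues random 8 KB reads can expose a badly undersized datastore. The sketch below is a toy example, not a replacement for those tools; the test file path is a placeholder, and unless the file is far larger than host RAM (or opened with direct I/O), the OS page cache will make the numbers look much better than the array really is.

```python
# Toy benchmark sketch: issue random 8 KB reads against an existing test file
# for a fixed duration and report rough IOPS and average latency.
# TEST_FILE is a placeholder assumption -- point it at a large file on the
# datastore you want to exercise.
import os
import random
import time

BLOCK_SIZE = 8 * 1024          # 8 KB, the "typical" VM I/O size
RUN_SECONDS = 10
TEST_FILE = "/path/to/large/test/file"   # placeholder path

def random_read_benchmark(path):
    fd = os.open(path, os.O_RDONLY)
    blocks = os.fstat(fd).st_size // BLOCK_SIZE
    ios = 0
    start = time.time()
    while time.time() - start < RUN_SECONDS:
        offset = random.randrange(blocks) * BLOCK_SIZE
        os.pread(fd, BLOCK_SIZE, offset)   # one random 8 KB read
        ios += 1
    elapsed = time.time() - start
    os.close(fd)
    print(f"{ios / elapsed:.0f} IOPS, {1000 * elapsed / ios:.2f} ms average latency")

if __name__ == "__main__":
    random_read_benchmark(TEST_FILE)
```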
Also, when sizing your storage, make sure your storage workloads are balanced "appropriately" across the paths in the environment, across the controllers and storage processors in the array, and across the appropriate number of spindles in the array. I'll talk a bit more about what "appropriately" balanced means later in this series, as it varies depending on your storage array and your particular goals and needs.
Simply sizing your storage correctly for the expected workload, in terms of both capacity and performance, will go a long way toward making sure you don't run into storage performance problems and that your device latency (DAVG) stays below that 20-30 ms guidance. There are other things to consider, which we will see in future posts, but sizing your storage is key.
PS. This week I’m at the HP Discover Event presenting with Emulex, a VMware Partner, Session TB#3258: The benefits and right practices of 10GbE networking with VMware vSphere 5 @ 2:45 June 6th. If you are at the show come by and say Hi. I’ll also be at the VMware Booth in the main expo hall most days.
Continue to Troubleshooting Storage Performance – Part 3:
/vsphere/2012/06/troubleshooting-storage-performance-in-vsphere-part-3-ssd-performance.html
Previous Troubleshooting Storage Performance post:
http://blogs.vmware.com/vsphere/2012/05/troubleshooting-storage-performance-in-vsphere-part-1-the-basics-.html
iolubik
Good post, but…
Have the numbers been verified? How come in the RAID performance picture the 15K drives get 175 IOPS max while the drives themselves do 200 IOPS? 25 IOPS lost to the RAID controller in RAID 0?
How come a SAS 10K drive has the same latency as a 7,200 RPM drive?
– Shouldn't FC 10K and SAS 10K have the same latency?
Joseph Dieckhans
Good observations!! The data points came from several public sources online; a quick web search will turn up several other articles with basically the same numbers. For the disk types and IOPS & latency chart, notice that the use case for each drive changes, which also changes its I/O profile (block size, read vs. write, sequential vs. random). The IOPS and latency figures were roughly based on the use cases these drives are typically seen in, so the seemingly inconsistent values come from the different I/O profiles being used. But again, all of these numbers are just ballpark estimates to give a general idea of what to expect in terms of storage performance. Storage arrays obviously have a lot of dynamics that affect performance (cache, paths, front-end and back-end processor speeds, disk types, number of spindles, …), so it is highly recommended to work directly with your storage vendor to get the best estimate for the particular storage array that you have.