Tag Archives: storage

Is vSphere Replication storage agnostic even when using SRM?

In short: Yes, it sure is!

In this post I’ll show 6 VMs being protected with vSphere Replication. Two VMs each will reside on Fibre Channel datastores (EMC CX4), iSCSI datastores (FalconStor NSS Gateway), and an NFS datastore (EMC VNX5500). I’ll replicate them onto different datastores, fail them over, reprotect, and fail back.

Interesting Storage Stuff at VMworld 2012 Barcelona

Last month, I put together a short article on what I thought would be interesting to check out from a VMworld 2012 San Francisco perspective. In the run-up to VMworld 2012 Barcelona, I thought I would do a similar post.

Disclaimer – Once again, the vSphere storage blog has to remain storage vendor neutral to retain any credibility. VMware doesn’t favour any one storage partner over another. I’m not personally endorsing any of these vendors’ products either. What I’m posting here are just a few vendors/products that are interesting to me from a storage perspective.

Pure Storage – Lots of new and very cool vSphere integration features, including a new web client management plugin for vSphere 5.1. They also have new protocols and VAAI features. Read more about it here.

Tintri – They won the Best of VMworld 2012 Gold award in Hardware for Virtualization in San Francisco. New snapshot and cloning functionality, new VAAI NAS primitives, and some cool upcoming replication technology. More detail here.

Violin Memory – The flash storage backing the new Monster VM and the 1,000,000 IOPS from a single VM that Steve Herrod mentioned in the keynotes. Fantastic technology, and well worth a look. Lots of vSphere integration features in the works too. More here.

Nimbus Data – Recently announced their new Gemini all-flash storage array. They make their own flash drives and claim to support them for 10 years. I met with them at VMworld 2012 in San Francisco and was very impressed by their technology. You can read more about them here.

As you can tell, flash is hot right now. Three of the four vendors mentioned here are all-flash array vendors. I suspect we are going to see more and more all-flash arrays in the not too distant future. VMworld is a great opportunity to catch up on what is happening in the world of storage, and these folks love talking about their technologies. Go check them out in Barcelona.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

VAAI Offloads and KAVG Latency

Now this is an interesting one, and something that I had not noticed before. Nor, I suspect, have many other people. One of our customers was observing high KAVG (kernel latency) during Storage vMotion operations (which were leveraging the VAAI XCOPY primitive). With VAAI disabled, KAVG showed the expected low value. I’d previously seen high KAVG values when queue full conditions were also being observed, but that was not the case here. Below are two esxtop outputs showing the symptoms. The first of these screenshots is with VAAI enabled. Notice the high KAVG/cmd value.

[Image: esxtop counters with VAAI enabled]

Now we have the same esxtop counters with VAAI disabled. In this case, SCSI Commands, Reads and Writes are very much higher since the I/O is not being offloaded to the array, but we do see an expected small value for KAVG/cmd.

[Image: esxtop counters with VAAI disabled]

So what’s the cause? Well, eventually the explanation was found in the following KB article – Abnormal DAVG and KAVG values observed during VAAI operations. (Nice catch Henrik!) Reproducing verbatim the contents of the KB article, “when VAAI commands are issued via VAAI Filter, there are actually 2 commands sent. These are top-layer commands which are issued and are never sent to the actual device (they stay within the ESX kernel). These commands are intercepted by the VAAI filter and the VAAI plugin, and are replaced by the vendor-specific commands, which are issued to the device.

This is why esxtop shows device statistics for the top-level commands only, and as a result the values for DAVG and KAVG seem unusual when compared to results obtained when VAAI is not enabled.
 
In this instance (and only for this instance), the DAVG and KAVG observed in esxtop should not be interpreted as a performance issue, absent of any other symptoms.”

So there you go. There is always something to learn in this job.

vSphere 5.1 New Storage Features

vSphere 5.1 is upon us. The following is a list of the major storage enhancements introduced with the vSphere 5.1 release.

VMFS File Sharing Limits

In previous versions of vSphere, the maximum number of hosts which could share a read-only file on a VMFS volume was 8. The primary use case for multiple hosts sharing read-only files is of course linked clones, where linked clones located on separate hosts all share the same base disk image. In vSphere 5.1, with the introduction of a new locking mechanism, the number of hosts which can share a read-only file on a VMFS volume has been increased to 32. This makes VMFS as scalable as NFS for VDI deployments and vCloud Director deployments which use linked clones.

Space Efficient Sparse Virtual Disks

A new Space Efficient Sparse Virtual Disk aims to address certain limitations with Virtual Disks. The first of these is the ability to reclaim stale or stranded data in the Guest OS filesystem/database. SE Sparse Disks introduce an automated mechanism for reclaiming stranded space. The other feature is a dynamic block allocation unit size. SE Sparse Disks have a new configurable block allocation size which can be tuned to the recommendations of the storage array vendor, or indeed the applications running inside of the Guest OS. VMware View is the only product that will use the new SE Sparse Disk in vSphere 5.1.

Advanced VMkernel Settings for Disk Storage

As regular readers will know by now, many of these blog posts are a result of internal discussions held between myself and other VMware folks (or indeed storage partners). This one is no different. I was recently involved in a discussion about how VMs did sequential I/O, which led me to point out a number of VMkernel parameters related to performance vs fairness for VM I/O. In fact, I have seen other postings about these parameters, but I realised that I never did post anything myself. 

A word of caution! These parameters have already been fine tuned by VMware. There should be no need to modify these parameters. If you do, you risk impacting your own environment. As mentioned, this is all about performance vs fairness. Tuning these values can give you some very fast VMs but can also give you some very slow ones. You've been warned.

Disk.SchedNumReqOutstanding
This is the maximum number of I/Os one VM can issue all the way down to the LUN when there is more than one VM pushing I/O to the same LUN. The default was 16 prior to ESX 3.5. It was bumped to 32 in ESX 3.5, and remains at 32 today.

Disk.SchedQuantum
The maximum number of consecutive “sequential” I/Os allowed from one VM before we force a switch to another VM (unless this is the only VM on the LUN). Disk.SchedQuantum is set to a default value of 8.
But how do we figure out if the next I/O is sequential or not? That's a good question.

Disk.SectorMaxDiff
As mentioned, we need a figure of ‘proximity’ to see if the next I/O of a VM is ‘sequential’. If it is, then we give the VM the benefit of getting the next I/O slot as it will likely be served faster by the storage. If it is outside this proximity, then the I/O goes to the next VM for fairness. This value is the maximum distance in disk sectors when considering if two I/Os are “sequential”. Disk.SectorMaxDiff defaults to 2000 sectors.
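
As a rough illustration of how Disk.SchedQuantum and Disk.SectorMaxDiff interact, here is a minimal Python sketch. It is a toy model and not the actual VMkernel scheduler: the function names are hypothetical, and it assumes that proximity is measured as an absolute distance and that a distance equal to the limit still counts as sequential.

```python
SCHED_QUANTUM = 8        # Disk.SchedQuantum default, from the text
SECTOR_MAX_DIFF = 2000   # Disk.SectorMaxDiff default, from the text


def is_sequential(prev_sector, next_sector, max_diff=SECTOR_MAX_DIFF):
    """Treat two I/Os as 'sequential' if they are within max_diff sectors.

    Assumption: distance is absolute, and exactly max_diff still qualifies.
    """
    return abs(next_sector - prev_sector) <= max_diff


def keeps_slot(prev_sector, next_sector, consecutive, other_vms_waiting):
    """Decide whether the same VM gets the next I/O slot (toy model).

    consecutive is the number of sequential I/Os this VM has already
    issued in a row.
    """
    if not other_vms_waiting:
        return True   # only VM on the LUN: there is nothing to switch to
    if consecutive >= SCHED_QUANTUM:
        return False  # quantum used up: force a switch for fairness
    return is_sequential(prev_sector, next_sector)
```

For example, `keeps_slot(1000, 1100, 3, True)` keeps the slot with the same VM, while the same request after 8 consecutive sequential I/Os hands the slot to the next VM.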

Disk.SchedQControlVMSwitches
This value is used to determine when to throttle down the number of I/Os sent by one VM to the queue. It refers to the number of times we switch between VMs to handle I/O – if we switch this many times, then we reduce the maximum number of commands that can be queued. The default is 6 switches.

Disk.SchedQControlSeqReqs
This is used to determine when to throttle back up to the full queue depth. It refers to the number of times we issue I/Os from the same VM before we go back to using the full LUN queue depth. The default is 128. In other words, if the same VM issues 128 I/Os without any other VM wishing to issue I/Os in the same timeframe, we throttle the number of I/Os per VM back to its maximum.
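
The throttle-down and throttle-up behaviour described for these two settings can be sketched as a small state machine. This is only a toy model under stated assumptions: the class and method names are hypothetical, the LUN queue depth of 64 is an illustrative value that does not come from the text, and resetting the switch counter on throttle-up is a guess rather than documented behaviour.

```python
class LunThrottle:
    """Toy model of per-VM queue throttling on a shared LUN."""

    def __init__(self, lun_queue_depth=64, sched_num_req_outstanding=32,
                 qcontrol_vm_switches=6, qcontrol_seq_reqs=128):
        self.lun_queue_depth = lun_queue_depth            # assumed value
        self.sched_num_req_outstanding = sched_num_req_outstanding
        self.qcontrol_vm_switches = qcontrol_vm_switches  # default 6
        self.qcontrol_seq_reqs = qcontrol_seq_reqs        # default 128
        self.limit = lun_queue_depth  # current per-VM limit
        self.switches = 0             # VM switches seen so far
        self.same_vm_ios = 0          # consecutive I/Os from one VM
        self.last_vm = None

    def on_io(self, vm):
        """Record one I/O from vm and return the current per-VM limit."""
        if self.last_vm is not None and vm != self.last_vm:
            self.switches += 1
            self.same_vm_ios = 1
            if self.switches >= self.qcontrol_vm_switches:
                # Several VMs are competing: throttle down.
                self.limit = self.sched_num_req_outstanding
        else:
            self.same_vm_ios += 1
            if self.same_vm_ios >= self.qcontrol_seq_reqs:
                # One VM has the LUN to itself: throttle back up.
                self.limit = self.lun_queue_depth
                self.switches = 0
        self.last_vm = vm
        return self.limit
```

In this model, alternating I/O between two VMs drops the limit to 32 on the sixth switch, and 128 uninterrupted I/Os from a single VM restore it to the full queue depth.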

While researching for this post, I came across a bunch of other advanced disk parameters in my notes which I thought you might like to know about.

Disk.PathEvalTime
Amount of time to wait before checking status of failed path. The default is 300 seconds (5 minutes). This means that if you have a preferred path (fixed path policy) and you have failed over to an alternate path, every 300 seconds the VMkernel will issue a TUR (Test Unit Ready) SCSI command to see if the preferred path has come back online. When it does, I/O will be moved back to the preferred path.
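
To get a feel for what the 300 second interval means for failback, the following Python sketch computes when the preferred path would be picked up again. It assumes that checks happen at fixed multiples of the interval counted from the moment of failover, which is a simplification; the function name is hypothetical.

```python
import math

PATH_EVAL_TIME = 300  # Disk.PathEvalTime default in seconds, from the text


def failback_time(path_recovered_at, eval_interval=PATH_EVAL_TIME):
    """Return the time (seconds after failover) of the first path check
    that finds the preferred path healthy again.

    Assumption: checks run at eval_interval, 2 * eval_interval, and so on.
    """
    checks = max(1, math.ceil(path_recovered_at / eval_interval))
    return checks * eval_interval
```

So a preferred path that recovers 10 seconds after failover is not used again until the check at 300 seconds, and one that recovers at 301 seconds waits until 600 seconds.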

Disk.SupportSparseLUN
Wow – this setting brings me back. Let's say that the SAN administrator presented LUN 0,1,2 & 4,5,6 to your ESXi host. If Disk.SupportSparseLUN is turned off, when we found the gap in LUNs, we wouldn't find any LUNs beyond this point. Having Disk.SupportSparseLUN enabled (which it is by default) means that we can traverse these gaps in LUNs. I'm pretty sure this is only relevant to the SCSI Bus Walking discovery method – see the next advanced setting.

Disk.UseReportLUN
The storage stack uses the SCSI REPORT LUNS command to detect LUNs on a target. The SCSI REPORT LUNS command requests a target to return a logical unit inventory (LUN list) to the initiator rather than querying each LUN individually, i.e. SCSI Bus Walking. The option is enabled by default. Believe me, you do not want to use SCSI bus walking unless you get a kick out of having a really slow ESXi boot time.
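
To make the difference between the discovery methods concrete, here is a small Python model of bus walking with and without sparse LUN support, next to a REPORT LUNS style inventory. The function names and the upper bound of 255 on LUN IDs are hypothetical choices for the sketch, not values taken from ESXi.

```python
def bus_walk(presented_luns, support_sparse_lun=True, max_lun=255):
    """Probe LUN IDs one by one, as in SCSI bus walking (toy model)."""
    found = []
    for lun_id in range(max_lun + 1):
        if lun_id in presented_luns:
            found.append(lun_id)
        elif not support_sparse_lun:
            break  # without sparse LUN support the first gap ends the scan
    return found


def report_luns(presented_luns):
    """Model of REPORT LUNS: the target returns its inventory in one reply."""
    return sorted(presented_luns)


presented = {0, 1, 2, 4, 5, 6}
without_sparse = bus_walk(presented, support_sparse_lun=False)  # [0, 1, 2]
with_sparse = bus_walk(presented, support_sparse_lun=True)      # [0, 1, 2, 4, 5, 6]
```

Note that the bus walk with sparse support has to probe every possible LUN ID to be sure nothing was missed, which is one way to see why it makes for a slow boot compared with a single inventory request.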

Disk.UseDeviceReset & Disk.UseLUNReset
These two parameters, taken together, determine the type of SCSI reset. The following table shows the available types:

[Table image: SCSI reset type for each combination of Disk.UseDeviceReset and Disk.UseLUNReset]
*The default is LUN Reset.

VAAI Offload Failures & the role of the VMKernel Data Mover

Before VMware introduced VAAI (vSphere Storage APIs for Array Integration), migrations of Virtual Machines (and their associated disks) between datastores were done by the VMkernel Data Mover (DM).

The Data Mover aims to have a continuous queue of outstanding IO requests to achieve maximum throughput. Incoming I/O requests to the Data Mover are divided up into smaller chunks. Asynchronous I/Os are then simultaneously issued for each chunk until the DM queue depth is filled. When a request completes, the next request is issued. This could be for writing the data that was just read, or to handle the next chunk.

Take the example of a clone of a 64GB VMDK (Virtual Machine Disk file). The DM is asked to move the data in 32MB transfers. Each 32MB transfer is handled as a single delivery, but the DM divides it into much smaller 64KB I/Os and issues them in parallel, 32 threads at a time. To transfer this 32MB, the DM issues a total of 512 I/Os of size 64KB.

By comparison, a similar 32MB transfer via VAAI issues a total of 8 I/Os of size 4MB (XCOPY uses 4MB transfer sizes). The advantages of VAAI in terms of ESXi resources are immediately apparent.
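
The I/O counts above are easy to check with a few lines of Python. The per-transfer figures come straight from the text; the totals for the whole 64GB VMDK are my own extrapolation and assume every 32MB transfer behaves identically.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def io_count(transfer_bytes, io_bytes):
    """Number of I/Os of size io_bytes needed to move transfer_bytes."""
    return -(-transfer_bytes // io_bytes)  # ceiling division


dm_ios_per_transfer = io_count(32 * MB, 64 * KB)    # 512
vaai_ios_per_transfer = io_count(32 * MB, 4 * MB)   # 8
transfers_per_vmdk = io_count(64 * GB, 32 * MB)     # 2048

dm_total = transfers_per_vmdk * dm_ios_per_transfer      # 1048576
vaai_total = transfers_per_vmdk * vaai_ios_per_transfer  # 16384
```

On these assumptions the Data Mover issues 64 times as many I/Os as the offloaded copy for the same amount of data.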

The decision to transfer using the DM or offloading to the array with VAAI is taken upfront by looking at storage array Hardware Acceleration state. If we decide to transfer using VAAI and then encounter a failure with the offload, the VMkernel will try to complete the transfer using the VMkernel DM. It should be noted that the operation is not restarted; rather it picks up from where the previous transfer left off as we do not want to abandon what could possibly be very many GB worth of copied data because of a single transient transfer error.
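
The "pick up where we left off" behaviour can be sketched as follows. This is a simplified model, not VMkernel code: `OffloadError`, `copy_chunks`, and the two callback parameters are hypothetical names, and the sketch deliberately leaves out any later retry of the offload.

```python
class OffloadError(Exception):
    """Hypothetical error raised when an offloaded copy fails."""


def copy_chunks(num_chunks, offload_copy, data_mover_copy, offload_ok=True):
    """Copy chunks 0..num_chunks-1 and return which engine copied each one."""
    engines = []
    use_offload = offload_ok  # decision taken up front from array state
    for chunk in range(num_chunks):
        if use_offload:
            try:
                offload_copy(chunk)
                engines.append("vaai")
                continue
            except OffloadError:
                # Fall back for this chunk and the rest; earlier chunks
                # are kept, so the operation is not restarted.
                use_offload = False
        data_mover_copy(chunk)
        engines.append("dm")
    return engines
```

If the offload fails on the third of five chunks, the result is `["vaai", "vaai", "dm", "dm", "dm"]`: the first two chunks are not copied again.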

If the error is transient, we want the VMkernel to check if it is ok to start offloading once again. In vSphere 4.1, the frequency at which an ESXi host checks to see if Hardware Acceleration is supported on the storage array is defined via the following parameter:

 # esxcfg-advcfg -g /DataMover/HardwareAcceleratedMoveFrequency
Value of HardwareAcceleratedMoveFrequency is 16384

This parameter dictates how often we will retry an offload primitive once a failure is encountered. This can be read as 16384 * 32MB I/Os, so basically we will check once every 512GB of data move requests. This means that if, at initial deployment, an array does not support the offload primitives, but at a later date the firmware on the array gets upgraded and the offload primitives are now supported, nothing will need to be done on the ESXi side. The host will automatically start to use the offload primitive.
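
The 512GB figure follows directly from the parameter value and the 32MB transfer size, as this short Python check shows. The helper `offload_retries` is a hypothetical name, and it assumes the retry counter simply tracks the volume of data move requests.

```python
MB = 1024 ** 2
GB = 1024 ** 3

MOVE_FREQUENCY = 16384    # HardwareAcceleratedMoveFrequency, from the output above
TRANSFER_SIZE = 32 * MB   # size of one data move request, from the text

retry_interval_bytes = MOVE_FREQUENCY * TRANSFER_SIZE
retry_interval_gb = retry_interval_bytes // GB  # 512


def offload_retries(total_bytes_moved):
    """How many times the offload would be retried for a given data volume."""
    return total_bytes_moved // retry_interval_bytes
```

For instance, moving 2TB of data after an offload failure would give the host four opportunities to try the offload primitive again.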

HardwareAcceleratedMoveFrequency only exists in vSphere 4.1. In vSphere 5.0 and later, we replaced it with the periodic VAAI state evaluation every 5 minutes.

ESXi host connected to multiple storage arrays – is it supported?

The primary aim of this post is to state categorically that VMware supports multiple storage arrays presenting targets and LUNs to a single ESXi host. This statement also includes arrays from multiple vendors. We run with this configuration all the time in our labs, and I know very many of our customers who also have multiple arrays presenting devices to their ESX/ESXi hosts. The issue is that we do not appear to call this out in any of our documentation, although many of our guides and KB articles allude to it.

Some caution must be shown however.

Troubleshooting Storage Performance in vSphere – Storage Queues

Storage queues: what are they, and do I need to change them?

We have all had to wait in a line or two in our lives. Whether it is the dreaded TSA checkpoint line at the airport or the equally dreaded DMV registration line, waiting in line is just a fact of life. This is true in the storage world too; storage I/Os have plenty of lines that they have to wait in. In this article, we examine the various queues in the virtualized storage stack and discuss the when, how, and why of modifying them.

Low Level VAAI Behaviour

We’re getting a lot of queries lately around how exactly VAAI behaves at the lower level. One assumes more and more VMware customers are seeing the benefit of offloading certain storage intensive tasks to the array. Recently the questions I have been getting are even more in-depth. I’ve been back over my VAAI notes gathered since 4.1, and have put together the following article. Hope you find it useful.

Troubleshooting Storage Performance in vSphere (Part 3) – SSD Performance

While presenting the storage performance talks, I frequently get asked about Solid State Device (SSD) performance in a virtualized environment. Well obviously, SSDs or EFDs (Enterprise Flash Disks) are great for performance, especially if you have storage intensive workloads. As seen in the previous post in this series, SSDs can provide significantly more IOPS and significantly lower latencies. But the two big questions are “how much of a gain might I expect” and “how much SSD storage do I need to achieve that gain” when using SSDs in a virtualized environment.
