
Monthly Archives: September 2011

VAAI Thin Provisioning Block Reclaim/UNMAP Issue

A performance issue has recently been discovered in the VAAI Thin Provisioning Block Reclaim mechanism introduced in vSphere 5.0. This feature allows an ESXi host to tell the array that a block on a Thin Provisioned datastore is no longer needed and can be reclaimed. This could be the result of a file delete operation, a Storage vMotion, a Snapshot Consolidate operation, etc. I blogged about the reclaim feature some time ago.

What we have found is that the Block Reclaim operation can take an unexpectedly long time to complete, degrading performance.

Unfortunately, in light of this performance issue, VMware are advising that the UNMAP feature be disabled in ESXi 5.0 until further notice.

A Knowledge Base article, KB 2007427, has now been published; it contains a further description of the issue and details on how to disable the UNMAP feature.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

Auto Deploy Plug-in Error with the vCenter Server Appliance (VCSA)

Kyle Gleed, Sr. Technical Marketing Manager, VMware

I love the vCenter Server Appliance (VCSA) and have been using it exclusively since early in the 5.0 beta.  Unfortunately, I’ve been a bit negligent about upgrading the VCSA server in my lab (still running a very old pre-release build) so I recently bit the bullet and updated to the official 5.0 GA release.  Everything went well with the exception of an error I ran into when trying to register the Auto Deploy plug-in.  When I tried to register using the vSphere client I ran into a connection error:


I spent more time than I'd like to admit going over my setup, checking firewalls, routing, and DNS, and couldn't come up with anything.  So I eventually went as far as installing a second instance of the VCSA in a completely different environment and voilà – same problem :(

Long story short, I finally tracked the error down to a service that wasn't running on the VCSA.  When you first start the appliance it starts all the services except for the Auto Deploy service:


I suspect this may be intentional (and possibly even documented somewhere, although I didn't find it in any of my searches) as not everyone running the appliance will use Auto Deploy, but for me it definitely wasn't expected.  What I learned is that this service needs to be running and listening on port 6502 in order to enable the Auto Deploy plug-in.  The fix was simply a matter of clicking the “Start ESXi Services” button, and the Auto Deploy service started right up.
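If you want to verify that something is listening on port 6502 before retrying the plug-in registration, a quick TCP probe does the trick. Here is a minimal Python sketch; the hostname `vcsa.lab.local` is a placeholder for your own appliance:

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, DNS failure, and timeout
        return False

if __name__ == "__main__":
    # "vcsa.lab.local" is a hypothetical hostname -- substitute your VCSA
    if is_port_open("vcsa.lab.local", 6502):
        print("Auto Deploy service is listening on 6502")
    else:
        print("Port 6502 closed -- start the Auto Deploy service on the VCSA")
```

The same check works from any machine that can reach the appliance, which makes it handy for ruling out firewalls before digging into the service itself.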


I then went back to the vSphere client and enabled the plug-in without issue:


The good news to all of this, besides fixing my issue, is that during my troubleshooting efforts I came across this very handy KB article on how to troubleshoot Auto Deploy.


#VMworld 2011 – Copenhagen – #VSP1700, #GD21, #GD47

Although it feels like VMworld 2011 in Las Vegas has only just finished, the European conference is almost upon us. I thought I'd put together a short note on what I'm involved in at VMworld 2011 in Copenhagen.

#VSP1700 – vSphere 5.0 Storage Features co-presented with Paudie O'Riordan, Storage Escalation Engineer with EMEA GSS. This seemed to go quite well at VMworld Las Vegas, and covers pretty much everything new in the storage space that came out in vSphere 5.0. Currently scheduled for Thursday at 12:30pm.

#GD21 – Group Discussion on Storage Best Practices, again with Paudie. I really like these, and the ones in Las Vegas seemed to go very well, with a lot of audience participation/interaction. There is no set agenda for these group discussions; it is basically a round-table discussion about all things storage. It is PowerPoint-free (for the most part). And of course you'll have Paudie telling you about the horror stories he's seen in support! Not to be missed. :-) Currently scheduled for Tuesday at 9:00am.

#GD47 – Group Discussion on the vSphere Storage Appliance (VSA), with Tushar Shanbhag, Product Manager for the VSA. We didn't have this in Las Vegas, so it's really cool that it's in Copenhagen. I've done a lot of postings on the VSA in the past, and it is now getting some real traction in our customer base. This is an opportunity not only to learn about the VSA, but also to influence the direction of the product. Tushar is the guy who puts together the VSA roadmap, so you might also learn about some cool new VSA features planned for forthcoming releases. Currently scheduled for Wednesday at 4:30pm.

Experts 1:1 – Spend 15 minutes with me talking about all things storage. No fixed agenda, so you can bring along any topic of your choosing. Very informal. Currently scheduled for Tuesday at 10:30am.

You can check out the complete VMworld 2011 Copenhagen catalog here – https://vmworldeurope2011.wingateweb.com/scheduler/newCatalog.do. Hope to see you there.


vSphere 5.0 CLI Reference Poster

Kyle Gleed, Sr. Technical Marketing Manager, VMware

During VMworld 2011 in Las Vegas we handed out some cool PowerCLI and vCLI reference posters.  They were a hot commodity and just in case you missed out you can get the PDF version from the link below.  You can also get the PowerCLI poster here.

Download VMware Management with vCLI 5.0





Storage I/O Control Enhancements in vSphere 5.0

Storage I/O Control (SIOC) was introduced in vSphere 4.1 and allows for cluster-wide control of disk resources. The primary aim is to prevent a single VM on a single ESX host from hogging all the I/O bandwidth to a shared datastore. An example could be a low-priority VM running a data-mining application and impacting the performance of other, more important business VMs sharing the same datastore.


Configuring Storage I/O Control

Let's have a brief overview of how to configure SIOC. SIOC is enabled very simply via the properties of the datastore. This is a datastore built on a LUN from an EMC VNX 5500:


The Advanced button allows you to modify the latency threshold figure. SIOC doesn't do anything until this threshold is exceeded. By default in vSphere 5.0, the latency threshold is 30ms, but this can be changed if you want a lower or higher latency threshold value:


Through SIOC, Virtual Machines can now be assigned a priority when contention arises on a particular datastore. Priority of Virtual Machines is established using the concept of Shares. The more shares a VM has, the more bandwidth it gets to a datastore when contention arises. Although we had a disk shares mechanism in the past, it was only respected by VMs on the same ESX host, so it wasn't much use on shared storage accessed by multiple ESX hosts. Storage I/O Control enables the honoring of share values across all ESX hosts accessing the same datastore.

The shares mechanism is triggered when the latency to a particular datastore rises above the pre-defined latency threshold seen earlier. Note that the latency is calculated cluster-wide. Storage I/O Control also allows one to tune & place a maximum on the number of IOPS that a particular VM can generate to a shared datastore. The Shares and IOPS values are configured on a per-VM basis. Edit the Settings of the VM, select the Resource tab, and the Disk setting will allow you to set the Shares value for when contention arises (set to Normal/1000 by default) and limit the IOPS that the VM can generate on the datastore (set to Unlimited by default):

More information on Storage I/O Control can be found in this whitepaper.
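To make the shares mechanism concrete, here is a toy Python model of the proportional-share idea. This is not VMware's implementation, just an illustration: below the latency threshold SIOC stays out of the way, and above it, a datastore-wide queue depth is divided among VMs in proportion to their shares. All VM names and numbers are made up.

```python
LATENCY_THRESHOLD_MS = 30  # vSphere 5.0 default latency threshold

def allocate_queue_slots(total_queue_depth, vm_shares, observed_latency_ms):
    """Split a datastore-wide queue depth across VMs in proportion to shares.

    Returns {vm_name: slots}. Below the latency threshold SIOC does not
    throttle, so every VM keeps the full queue depth available to it.
    """
    if observed_latency_ms <= LATENCY_THRESHOLD_MS:
        return {vm: total_queue_depth for vm in vm_shares}
    total_shares = sum(vm_shares.values())
    return {vm: total_queue_depth * s // total_shares
            for vm, s in vm_shares.items()}

# Hypothetical VMs: the data-mining VM has a quarter of the DB VM's shares
shares = {"critical-db": 2000, "web": 1000, "data-mining": 500}
print(allocate_queue_slots(64, shares, observed_latency_ms=45))
# During contention, critical-db gets 4x the queue slots of data-mining
```

With a 64-deep queue and 45ms of observed latency, this sketch hands critical-db 36 slots, web 18, and data-mining 9 — the 4:2:1 ratio of their shares.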

SIOC Enhancements in vSphere 5.0

In vSphere 4.1, SIOC was supported for block storage devices (FC, iSCSI, FCoE) only. In vSphere 5.0, we have introduced support for NAS devices. This means that we now have a mechanism which will prevent a single VM/ESXi on an NFS datastore from hogging all the bandwidth to that datastore. Once again, you just select the properties of the NFS datastore to enable SIOC. Here is a screen-shot showing the SIOC properties for an NFS datastore presented to the ESXi hosts from a NetApp FAS3170A array:

SIOC also has a new use case in vSphere 5.0, and that is of course Storage DRS. SIOC is used to initially gather information about datastore capabilities and is also used for gathering I/O Metrics from the datastores in an SDRS datastore cluster.

Common question

One question about Storage I/O Control which often arises is the following; If you have two hosts with equal shares, they will have equal queue lengths, so why do you observe different throughput in terms of Bytes/s or IOPS?

The reason for this is differences in per-I/O cost and scheduling decisions made within the array. The array may process requests in the order it thinks is most efficient to maximize aggregate throughput, causing VMs with equal priority to display slightly different throughputs.
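A quick worked example of per-I/O cost, sketched in Python with made-up numbers: two VMs granted the same number of I/Os per second can still show very different throughput in Bytes/s if their I/O sizes differ.

```python
def throughput_mb_per_s(iops, io_size_kb):
    """Convert an IOPS rate and I/O size into MB/s (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024

# Both hypothetical VMs are served the same 5000 IOPS by the array...
vm_a = throughput_mb_per_s(iops=5000, io_size_kb=4)   # small random I/O
vm_b = throughput_mb_per_s(iops=5000, io_size_kb=64)  # large sequential I/O

print(vm_a, vm_b)  # ~19.5 MB/s vs 312.5 MB/s with identical shares
```

Equal shares equalize the queue depth each host gets, not the Bytes/s each workload achieves; the workload's own I/O profile and the array's internal scheduling decide the rest.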

External Workloads

SIOC can only work when it has insight into all the workloads on a datastore. If there are external workloads into which SIOC has no visibility, an alarm 'Non-VI workload detected on the datastore' will be triggered. For SIOC to perform optimally, you will need to address the reason for this external workload and prevent it from recurring. This KB offers some very good advice on the subject – http://kb.vmware.com/kb/1020651.

Best Practice/Recommendation

Storage I/O Control is a really great feature for avoiding contention and thus poor performance on shared storage. It gives you a way of prioritizing which VMs are critical and which are not so critical from an I/O perspective. I highly recommend enabling it in your environments if you can. I believe that Storage I/O Control is essential to provide better performance for I/O intensive and latency-sensitive applications such as database workloads, Exchange servers, etc.


SRM Product Management Survey

The SRM Product Management Team has put together a customer survey to better understand current SRM deployments and to help drive future product direction.

We would really appreciate it if you could take the time to complete the survey. It should only take about 10 minutes, and it would really help us drive SRM to better suit your needs.

Take part at the link below to help us make the best product possible!

SRM Product Management Survey

Storage DRS Affinity & Anti-Affinity Rules

By now you should be well aware that one of the major storage and resource management enhancements in vSphere 5.0 is Storage DRS. What was one of the motivations behind developing this feature? For some time we have had the Distributed Resource Scheduler (DRS) feature in vSphere, which managed the initial placement and load balancing of virtual machines based on CPU and memory utilization. However, nothing in DRS prevented VMs from being placed on the same datastore, even if that datastore was nearing capacity or VM performance was degrading. Storage DRS addresses this by selecting the best datastore for initial placement, and also uses Storage vMotion to migrate virtual machines between datastores when capacity or I/O latency is an issue.

In previous postings I already discussed initial placement and load balancing based on datastore capacity and I/O latency. However, there is another cool feature of Storage DRS that I haven't yet discussed: the affinity and anti-affinity rules. These rules are conceptually very similar to the affinity and anti-affinity rules that you might find in DRS. The rules basically work by keeping VMs together on the same datastore or apart on different datastores, in much the same way that the rules in DRS kept VMs together on the same host or apart on separate hosts. In DRS, you might have separated your primary and secondary DNS servers using anti-affinity rules. That way, if one ESX host failed and brought down one of the DNS servers, the other DNS server stays running on another host in the cluster. However, there was nothing to stop both the primary and secondary DNS servers residing on the same datastore, and if that datastore failed, so did both servers. Now, with Storage DRS anti-affinity rules, you can keep these DNS servers (or any other primary/secondary servers) on different datastores.

However, there is another significant feature of Storage DRS affinity & anti-affinity rules: the ability to automatically keep Virtual Machine Disks (VMDKs) together on the same datastore or apart on different datastores. By default, VMDKs are placed together on the same datastore. So why might I want to place VMDKs on different datastores? Well, one example is that some of our customers use in-Guest mirroring and RAID volumes in the Guest OS. In this case, you would want to make sure that the primary volume and its replica are kept on different datastores. If both sides of the mirror were on the same datastore, and that datastore failed, you would lose both sides of the mirror.
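To illustrate the constraint (this is a sketch of the idea, not how SDRS is actually implemented), here is a small Python model of a VMDK anti-affinity rule restricting initial placement: pick the datastore with the most free space that does not already hold another disk from the same rule. All VMDK and datastore names are hypothetical.

```python
def place_vmdk(vmdk, datastores, rule_members, current_placement):
    """Choose a datastore for vmdk under an anti-affinity rule.

    datastores:        {datastore_name: free_gb}
    rule_members:      set of VMDKs that must be kept apart
    current_placement: {vmdk_name: datastore_name} for already-placed disks
    Returns the chosen datastore name, or None if the rule cannot be met.
    """
    # Datastores already holding another member of this rule are off-limits
    forbidden = {current_placement[m] for m in rule_members
                 if m != vmdk and m in current_placement}
    candidates = {ds: free for ds, free in datastores.items()
                  if ds not in forbidden}
    if not candidates:
        return None
    # Among the allowed datastores, greedily take the most free space
    return max(candidates, key=candidates.get)

# The primary volume is on ds1, so its in-Guest mirror must land elsewhere
placement = {"dns1.vmdk": "ds1"}
choice = place_vmdk("dns1-mirror.vmdk",
                    {"ds1": 500, "ds2": 200, "ds3": 100},
                    rule_members={"dns1.vmdk", "dns1-mirror.vmdk"},
                    current_placement=placement)
print(choice)  # ds2: the roomiest datastore other than ds1
```

The real feature also factors in I/O latency and ongoing load balancing, but the placement constraint itself works just like this: rule members never share a datastore.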


This is yet another reason why Storage DRS is one of the most highly regarded features in vSphere 5.0.


How often does ESXi write to the boot disk?

Kyle Gleed, Sr. Technical Marketing Manager, VMware

A couple of weeks back I wrote about booting from USB/SD and made the comment that ESXi will routinely save the host configuration to the boot device once every 10 minutes.  This generated a lot of questions, and it quickly became evident that people wanted to understand more about how a running ESXi host interacts with the boot device, especially when booting off of USB/SD.  As such, I did some research, and here's a bit more detail on how frequently ESXi writes to the boot device.

Outside of the initial installation, when the binaries are first laid out on the boot device, there are four scenarios where the ESXi configuration will be saved to the boot device:

1.  At shutdown.  Each time a host shutdown is initiated, the /sbin/backup.sh script is invoked from /etc/inittab.  This backup ensures any outstanding host configuration changes are flushed to disk and available on reboot.  Note that a backup only occurs when changes are found.  If there are no changes, there is no backup.

2.  At a fixed rate of once an hour.  Once an hour the /sbin/auto-backup.sh script is invoked from the root user’s crontab.  This script first checks for any outstanding changes and, if any are detected, invokes the /sbin/backup.sh script to perform a backup.  Again, a backup only occurs when changes are found.  If there are no changes, there is no backup.

3.  On demand by ‘hostd’.  The ESXi ‘hostd’ will initiate an immediate backup following certain types of configuration changes, such as changing the root password or changing settings on the management network.  One thing to note about the ‘hostd’-initiated backups is that, in an effort to avoid performing too many backups in short succession, they are throttled to no more than 6 backups an hour.  If you make multiple changes in a short 20-minute window, causing ‘hostd’ to reach its limit of 6, ‘hostd’ will stop performing backups and the outstanding configuration changes will then be saved with the next hourly backup.

4.  By VMware HA.  VMware HA will also trigger a backup when HA is (re)configured and any time there is a change in the cluster membership.  Like the on-demand backups initiated by ‘hostd’, the VMware HA-initiated backups are also throttled so they occur no more than once every 10 minutes.


Trying to narrow down a specific number of times the ESXi host will write to the boot device is difficult, as there are many factors to consider.  In a worst-case scenario, ESXi has the potential to write its configuration at a rate of once every 10 minutes.  However, the reality is that in most cases it will write far less.  In a stable environment where relatively few changes are being made, you could go several hours, if not days, without any I/O to the boot device (assuming your logs are on a separate device).  However, any time you make configuration changes, these updates will get saved either immediately via ‘hostd’ or at the next scheduled hourly backup.  Also, any time there is a change in the HA configuration (a host placed in/out of maintenance mode, for example) or a change in cluster membership, additional backups will occur.
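The throttling behaviour of the 'hostd' and HA triggers described above can be modelled with a small Python sketch. The throttle numbers (6 per hour, one per 10 minutes) come from the post; the class and method names are my own, purely for illustration.

```python
class BackupThrottle:
    """Toy model of ESXi config-backup throttling (illustrative only)."""

    def __init__(self):
        self.hostd_times = []  # minutes at which hostd backups ran
        self.last_ha = None    # minute of the last HA-triggered backup

    def hostd_backup(self, minute):
        # hostd backups: at most 6 within any trailing hour
        self.hostd_times = [t for t in self.hostd_times if minute - t < 60]
        if len(self.hostd_times) >= 6:
            return False  # throttled; the change waits for the hourly cron
        self.hostd_times.append(minute)
        return True

    def ha_backup(self, minute):
        # HA backups: no more than one every 10 minutes
        if self.last_ha is not None and minute - self.last_ha < 10:
            return False
        self.last_ha = minute
        return True

throttle = BackupThrottle()
# Eight config changes in rapid succession, one every 2 minutes
results = [throttle.hostd_backup(m) for m in range(0, 16, 2)]
print(results)  # first 6 succeed, the last 2 are deferred to the hourly run
```

This also makes the "worst case" in the paragraph above easy to see: even a constant stream of HA events can only produce one backup per 10-minute window.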

vSphere Storage Appliance (VSA) Resilience – Network Outage Scenario #2: Front End

In a previous post, I demonstrated what would happen in a vSphere Storage Appliance (VSA) Cluster if we lost the back-end (cluster communication/replication) network. In this post, I want to look at what would happen in the event of losing the front-end (management/NFS) network.

To cause this failure, I'm going to do the same steps that I carried out in the previous blog, namely downing the uplinks on one of the ESXi hosts/VSA Cluster nodes, but this time I will be doing it to the vmnics associated with the front-end network. For an overview of the VSA cluster networking, please check my previous post which explains it in detail.

The configuration is exactly the same as before, with a 3 node VSA cluster presenting 3 distinct NFS datastores. Once again I will have a single VM, running on host1, but using the NFS datastore exported from the appliance running on host3. As before, I will cause the outage on the ESXi host (host3) which hosts the appliance/datastore on which the WinXP VM resides. The ESXi host (host1) where WinXP is running will not be affected.

To begin, let's start by downing the first interface on host3:

Once again, while nothing externally visible happens when one of the teamed uplinks is downed, internally this causes the three port groups associated with the vSwitch (VSA-Front End, VM Network, and Management Network) to utilize the same active uplink: whichever port group was using the failed NIC fails over to its previously configured standby uplink. Until the failed NIC is restored to health, the network traffic for the three port groups will share the same single active uplink. Let's now bring down the second interface on host3.

As I am sure you know by now, the VSA installer places all ESXi hosts that are VSA cluster members into a vSphere HA cluster. This provides a way for Virtual Machines to be automatically restarted if the ESXi host on which they were running goes down. Now, since I've just downed the uplinks for the management interfaces on one of the ESXi hosts, the vSphere HA agent that runs on that ESXi host will not be able to communicate with the other hosts in the vSphere HA cluster or with the vCenter server. Therefore, the first thing you see when the front-end network is lost are complaints from vSphere HA that it cannot communicate with the HA agent on that ESXi host (I've expanded the vSphere HA State detail bubbles in the screen-shots below to show more verbose messages):


This is shortly followed by a Host Failed event/status from vSphere HA:

Since our VM (WinXP) is not running on the ESXi host (host3) which failed, vSphere HA will not attempt to restart that VM on another host in the cluster.

Eventually, since vCenter communication to the vpx agent is also via the front-end network, this ESXi and the VSA appliance running on that host become disconnected from vCenter:

Now, because cluster communication is all done over the back-end network, and this network is unaffected by the outage, the VSA Cluster will not take any corrective action in this case. It continues to export all NFS datastores from all appliances. Therefore there is no need for another appliance to take over the presentation of the datastore from the appliance that is running on the ESXi host that has the front-end network outage. Let's look now at the datastore from the VSA Manager UI:


From a VSA cluster perspective, the datastores are online. All appliances also remain online:

But because the front-end network is now down, the datastore can no longer be presented to the ESXi hosts in the cluster. This is because the front-end network is used for NFS exporting by the appliances. What this means is that all of the ESXi hosts in the datacenter have lost access to the datastore (inactive):

And when we look at the VM, we also see that the datastore is shown as inactive in the Summary tab:

Basically, what is happening in this network outage is that the VSA Cluster remains intact and functional (replication & heartbeating continues), but the front-end network outage is preventing it from presenting the NFS datastore(s) to the ESXi hosts.

The Virtual Machine will remain in this state indefinitely, until the datastore comes back online. This is a unique feature of Virtual Machines: if the underlying disk 'goes away', they will retry I/O indefinitely until the disk becomes available again. This has saved the skin of many an admin who inadvertently removed the incorrect datastore from a host. When they realise their mistake, they re-present the datastore and the VMs suffer no outage, picking up where they left off. One caveat, though – just because the Guest OS can survive this sort of outage, there is no guarantee that the application running inside the Guest OS will.

And that is basically it – a front-end network outage on an ESXi host in the cluster will mean that that datastore becomes unavailable for the duration of the outage. The VMs will retry I/Os for the duration of the outage, and when the network issue is addressed & the datastore comes back online, the VMs will resume from the point where the outage occurred. The point is that the cluster framework is unaffected.

If we bring the uplinks for the front end network back up…

…the VM resumes from where it was before the outage:

There is no synchronization needed, since the datastores remained mirrored via the back-end network.

This leads me on to a question for you folks out there thinking about deploying the VSA – do you think this behaviour is optimal? Or would you prefer to see the behaviour similar to the back-end network outage, i.e. failing over the NFS datastore to an alternate appliance? Please use the comments field to provide feedback. It is something that is being debated internally, and we'd love to hear your thoughts.


Using the Compatibility Guide to check Guest OS support on ESXi

I recently learned about an issue with the VMware Compatibility Guide that is creating some confusion.  The problem (demonstrated in the screenshot below) is that when using the compatibility guide to search for Guest OS support on ESXi, there is no option for selecting versions prior to 5.0.


What happened is that prior to vSphere 5.0, the Guest OS support for both ESX and ESXi was listed under the single product name “ESX”.  However, with 5.0 a new product name was added, "ESXi".  An unintended side effect of this is that when selecting ESXi you only see ESXi 5.0 and not the earlier versions.

To help clear up any confusion, the Guest OS support for a given release is the same for both ESX and ESXi.  For example, to list the supported Guest OS for ESXi 4.1, you query for ESX 4.1.  Starting with 5.0, which is ESXi only, you will need to use the product name “ESXi”.

Hopefully we'll also see an update to the website to help address this.