VMware


August 08, 2008

Top Tips for Deploying VI, part 2

 1. If you have an active/passive FC storage array (most mid-range arrays fall into this bucket), be careful about setup. Firstly, be sure to have redundant paths from FC switches to your arrays’ storage processors. Secondly, be sure to use “MRU” (the default) for the path-selection policy and not “fixed”.

The best way to explain the first issue is with a picture.  What’s wrong with the following configuration?

Vitips21

Although you might believe that you have full redundancy between the hosts and the switches, and specifically that you can survive one HBA failure on each host, the reality is that you don’t have enough redundancy.  Here’s one failure scenario that won’t be handled properly:

Vitips22

The reason is that, with active/passive storage arrays, a given LUN can only be presented on one storage processor at a given time. The LUN can shift from one storage processor to another, but such a shift takes many seconds (potentially up to 30 seconds). If both HBAs have failed (as in the above diagram), then the ESX hosts won’t be able to access the same LUN at the same time. Host 1 attempts to access the LUN on storage processor 1; host 2 attempts to access the same LUN on storage processor 2; and you end up with a ping-pong effect, or “path thrashing”, as the active/passive array shifts the LUN back and forth between the two storage processors. Performance of VMs on both hosts will be erratic and degraded.

The solution is simple: create redundant connections from the FC switches to the array storage processors, as shown below.

Vitips23

There is a second noteworthy issue with active/passive arrays related to this same path thrashing effect: make sure that you use the “MRU” path selection policy (the default) rather than the “fixed” path selection policy. If you use “fixed”, you may make the mistake of forcing the use of a particular storage processor for one host but a different storage processor for another host, and thus end up in a similar LUN ping-pong or path thrashing situation.
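To make the thrashing condition concrete, here is a minimal Python sketch. It assumes you have already collected, by whatever means, which storage processor each host's active path to each LUN points at; the host names, LUN names, SP labels, and the `find_thrashing_risks` helper are all hypothetical and not part of any VMware tool. It simply flags any LUN that different hosts are reaching through different storage processors.

```python
from collections import defaultdict

# Hypothetical inventory: (host, LUN) -> storage processor targeted by
# that host's active path. All names here are made up for illustration.
active_paths = {
    ("esx1", "lun7"): "SP-A",
    ("esx2", "lun7"): "SP-B",   # different SP than esx1: thrashing risk
    ("esx1", "lun8"): "SP-A",
    ("esx2", "lun8"): "SP-A",
}

def find_thrashing_risks(paths):
    """Return {lun: sorted SP list} for LUNs reached via more than one SP."""
    sps_by_lun = defaultdict(set)
    for (_host, lun), sp in paths.items():
        sps_by_lun[lun].add(sp)
    return {lun: sorted(sps) for lun, sps in sps_by_lun.items() if len(sps) > 1}

print(find_thrashing_risks(active_paths))  # {'lun7': ['SP-A', 'SP-B']}
```

An empty result means every LUN is being accessed through a single storage processor by all hosts, which is what you want on an active/passive array.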

For more details about path thrashing, see the SAN Configuration Guide.

2. When configuring your VI environment for VMotion, make sure that your physical network switches are configured properly; in particular, make sure that each port has the right network (e.g. VLAN) visibility. 

VMotion requires that the destination ESX host have similar network connectivity to the source ESX host (so that, for example, the VM can continue to access its assigned VLAN after the VMotion). VirtualCenter checks for correct virtual switch configuration on the source and destination ESX hosts; however, VirtualCenter does not check for correct configuration of the physical network switches. In a larger VI deployment where many network switch ports are involved, a single misconfigured physical switch port can be hard to detect. The symptom will be as follows: when a VM that relies on a particular VLAN ID is migrated with VMotion to the ESX host with the misconfigured switch port, the VM loses all network connectivity. Solution: when adding new ESX hosts to a network, take the time to double-check your network switch port configurations to make absolutely sure that all the VLANs are correctly configured.
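The double-check can be automated once you have the data in hand. The following Python sketch assumes you have exported, from your switch configuration, the set of VLAN IDs trunked to each host's physical port; the host names, VLAN IDs, and the `missing_vlans` helper are hypothetical. It reports any host whose switch port is missing a VLAN that the VMs in the cluster rely on.

```python
# Hypothetical data: VLAN IDs used by VM port groups in the cluster, and
# the VLANs actually trunked to each host's physical switch port.
required_vlans = {10, 20, 30}
switch_port_vlans = {
    "esx1": {10, 20, 30},
    "esx2": {10, 20, 30},
    "esx3": {10, 30},        # VLAN 20 was never added to this port
}

def missing_vlans(required, ports):
    """Return {host: sorted missing VLAN IDs} for hosts lacking a required VLAN."""
    return {
        host: sorted(required - vlans)
        for host, vlans in ports.items()
        if required - vlans
    }

print(missing_vlans(required_vlans, switch_port_vlans))  # {'esx3': [20]}
```

In this made-up example, a VM on VLAN 20 would lose connectivity only when it lands on esx3, which is exactly the kind of intermittent symptom described above.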

3. When using VMware HA, take note of how memory reservations are specified and used to reserve cluster failover capacity.  Using more consistent reservations or disabling admission control are both appropriate workarounds if the calculations are overly conservative in your environment.

How VMware HA works: If a VMware ESX host fails, VMware HA will restart the VMs affected by that failure on alternate hosts in the cluster.  In order to do so, HA must reserve failover capacity within the cluster.  HA currently achieves this by implementing an “admission control” policy that prevents (or warns against) the powering on of VMs that would encroach upon the failover capacity being reserved.  In some cases, however, the admission control calculations may be too conservative.

Example scenario: Suppose you have 19 VMs, each with a 300 MB memory reservation. To power on all of these VMs, you need 5.7 GB of RAM (= 19 x 0.3) in total within the cluster, after allocating space for potential host failures, and not accounting for memory sharing in ESX. Since all reservations are equivalent, HA defines an average VM to require 300 MB of memory.

Now, let's suppose you power on a 20th VM with a 2 GB memory reservation. Instead of calculating memory requirements as 7.7 GB (= 19 x 0.3 + 1 x 2), HA takes a more conservative approach and redefines the average VM to be the biggest reservation observed. With the higher reservation specified, HA will cautiously assume that every VM needs 2 GB of memory, and will ask for 40 GB (= 20 x 2) of RAM to be set aside for total runtime and failover capacity within the cluster. These calculations are intended to be conservative to ensure that sufficient spare capacity is available, without fragmentation across hosts within a cluster.
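The arithmetic in this example can be reproduced with a short Python sketch. It models only the simplified rule described here (every VM is treated as needing the largest reservation observed) and is not HA's actual implementation; the function names are hypothetical, and 2 GB is treated as 2000 MB to match the figures in the text.

```python
def summed_reservations_mb(reservations_mb):
    """Memory needed if each VM is counted at its own reservation."""
    return sum(reservations_mb)

def conservative_requirement_mb(reservations_mb):
    """Memory needed if every VM is assumed to need the largest reservation."""
    if not reservations_mb:
        return 0
    return len(reservations_mb) * max(reservations_mb)

vms = [300] * 19                         # 19 VMs with 300 MB reservations
print(summed_reservations_mb(vms))       # 5700  (5.7 GB)
print(conservative_requirement_mb(vms))  # 5700  (same, all reservations equal)

vms.append(2000)                         # 20th VM with a 2 GB reservation
print(summed_reservations_mb(vms))       # 7700  (7.7 GB)
print(conservative_requirement_mb(vms))  # 40000 (40 GB)
```

A single large reservation moves the conservative figure from 5.7 GB to 40 GB, which is why one outlier VM can skew the whole cluster's failover calculation.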

In many cases (such as clusters with widely varying sizes of hosts and VMs), however, these calculations can be more conservative than desirable, and can lead to “insufficient failover capacity” warnings when powering on more VMs.

Two potential approaches are recommended if you are observing these warnings, or want to avoid them within a heterogeneous cluster configuration:

Approach 1: Either lower the reservations on your most demanding VM’s, or remove the reservations skewing the calculations and rely upon “shares” instead.  See the resource management guide for differences between reservations and shares.

Approach 2:  Alternatively, configure HA to disable strict admission control.  Host failures will still be detected and acted upon, but VMware HA will not prevent the starting of new VMs due to insufficient failover capacity.

4. When sizing your LUNs, a medium-sized LUN (~500GB) seems best for most situations.   

Small LUNs (and VMFS volumes) can result in SAN management complexity (too many LUNs to manage). Very large LUNs can result in performance issues, too coarse a granularity for troubleshooting and performance tuning, and poor failure/error isolation. The chart below summarizes some of the considerations. Details are provided on page 72 of the VI 3 SAN Design Guide.

| Consideration | Smaller LUN / VMFS volume (100GB) | Medium-sized LUN / VMFS volume (500GB) | Larger LUN / VMFS volume (3TB) |
|---|---|---|---|
| VMFS: Metadata overhead | Some overhead (0.5%) | Negligible overhead (<0.1%) | Negligible overhead (<0.1%) |
| Impact of a failure or error, difficulty of troubleshooting | Affects a few VM's | Affects 20-30 VM's | Affects many VM's |
| Ease of SAN mgmt | Hard (many LUN's to manage) | Medium | Easy (just 1 LUN to manage) |
| Ease of tuning performance (**) | High (tunable per the few VM's on a LUN) | Medium (tunable for 20-30 VM's at a time) | Low (one setting for many, many VM's) |
| Flexibility in specifying value-added services (***) | High (different LUNs can have different policies or settings) | Medium (tunable for 20-30 VM's at a time) | Low (many VMs share the same policies or settings) |

(*) File creation in VMFS grabs a SCSI lock on the LUN. Excessive concurrent file creation in VMFS can cause lock contention, which can hurt performance. This can be apparent if multiple users are concurrently creating VMs (and therefore VMFS files), or when a VCB-based backup process is concurrently backing up multiple VMs (and is therefore concurrently creating multiple VMFS REDO files).
(**) e.g. RAID-level, array caches, queue depths, path selection/path dedication
(***) e.g. Backup, other data protection features such as replication, mirroring, etc., capacity optimization features such as de-dupe, thin-provisioning, etc., security and encryption features
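As a rough way to see the trade-off between LUN count and failure impact, the following Python sketch estimates how many VMs fit on a LUN and how many LUNs a deployment would need. The 20 GB average VM footprint, the 100-VM deployment, the use of 3000 GB for a 3 TB LUN, and the `lun_plan` helper are all assumptions for illustration; the estimate ignores VMFS overhead, snapshots, and swap files.

```python
import math

def lun_plan(total_vms, avg_vm_gb, lun_gb):
    """Return (vms_per_lun, luns_needed) for a given LUN size in GB."""
    vms_per_lun = lun_gb // avg_vm_gb
    if vms_per_lun < 1:
        raise ValueError("LUN is smaller than a single VM")
    return vms_per_lun, math.ceil(total_vms / vms_per_lun)

# Assumed workload: 100 VMs averaging 20 GB each (illustrative numbers only).
for lun_gb in (100, 500, 3000):
    per_lun, luns = lun_plan(total_vms=100, avg_vm_gb=20, lun_gb=lun_gb)
    print(f"{lun_gb} GB LUN: up to {per_lun} VMs per LUN, {luns} LUN(s) needed")
```

Under these assumptions, a 500 GB LUN holds about 25 VMs, in line with the 20-30 VMs per medium-sized LUN mentioned in the chart, while the smallest size needs 20 LUNs and the largest puts every VM on a single LUN.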

See also Top Tips for Deploying VI, part 1

--The VI Team

August 01, 2008

Interesting items in Update 2 for VMware Infrastructure 3.5

VMware Infrastructure 3.5 U2 now available

 

After all the dramatic news from VMware over the last month or so, it may feel like the availability of VMware Infrastructure 3.5 Update 2 is not particularly newsworthy, but there are a few things quietly being delivered that merit a good deal of attention.

 

Enhanced VMotion Compatibility (EVC)

 

Previous releases of VMware Infrastructure restricted VMotion between processors belonging to different generations even if they were from the same manufacturer. These restrictions were put in place to ensure that a consistent CPU feature set was always exposed to software.

 

KB articles 1991 (Intel) and 1992 (AMD) describe the current compatibility groups for VMotion. KB article 1993 describes methods for masking select features to relax the compatibility requirements.

 

 

I think Enhanced VMotion is something VMware users should care about because it radically simplifies the process of determining VMotion compatibility.

 

VMware worked closely with AMD and Intel on the specification for AMD-V Extended Migration and Intel FlexMigration technologies which are used to make newer generation CPUs backward compatible with older CPU generations. 

 

With EVC, it is now much easier to add newer generation hardware into your existing VMware infrastructure while maintaining VMotion compatibility between the new and the older hardware. This makes adding new ESX hosts and retiring older hosts easier since you no longer need to worry much about CPU VMotion compatibility. Best of all, it’s really simple. No complicated compatibility matrices, and no CPU masks! Woohoo!
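The idea of exposing a consistent CPU feature set can be illustrated with a small Python sketch. This is a conceptual model only: the feature names and host names are hypothetical, and computing the baseline as a plain intersection is an assumption made for illustration, since this post does not describe how EVC selects its baseline.

```python
# Hypothetical CPU feature sets; real processors expose many more flags.
host_features = {
    "old-host": {"sse2", "sse3"},
    "new-host": {"sse2", "sse3", "ssse3", "sse4.1"},
}

def common_baseline(features_by_host):
    """Features present on every host (assumed baseline for this sketch)."""
    sets = list(features_by_host.values())
    return set.intersection(*sets) if sets else set()

def can_migrate(vm_visible_features, destination_features):
    """A VM can move only if the destination offers every feature it has seen."""
    return vm_visible_features <= destination_features

# Without a baseline, a VM started on the new host sees all of its features
# and cannot move to the old host.
assert not can_migrate(host_features["new-host"], host_features["old-host"])

# With a common baseline, the VM sees only the shared features and can move
# to either host.
baseline = common_baseline(host_features)
assert all(can_migrate(baseline, feats) for feats in host_features.values())
print(sorted(baseline))  # ['sse2', 'sse3']
```

The cost of this approach is that VMs do not see the newer features of the newer hosts, which is the price paid for being able to move freely across CPU generations.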

 

Processors included in the new enhanced VMotion compatibility for Intel are:

 

  • Quad-Core Intel® Xeon® processor 7300
  • Dual-Core and Quad-Core Intel® Xeon® processor 5100/5200/5300/5400 series, based on the Intel® Core™ microarchitecture
  • Future Xeon processors based on Enhanced Intel® Core™ Microarchitecture.

For those familiar with Intel code names, these are Intel Core 2 (Merom) based processors and Intel Core 2 Duo (Penryn) based processors. All future processors from Intel with Intel VT FlexMigration will be VMotion compatible as well.

 

Processors included in the new enhanced VMotion compatibility for AMD are:

 

  • First-Generation AMD Opteron™ Rev. E
  • Second-Generation AMD Opteron
  • Third-Generation AMD Opteron as well as future AMD Opteron™ processors.

* This is documented on pg 239 of the Basic Systems Administration Guide for VMware ESX 3.5 U2. We're still trying to get the processor marketing names into this doc. Please comment on this blog if you use marketing names to identify your processor...and think this is a worthwhile exercise.

 

Although EVC makes great strides in enabling VMotion across multiple CPU generations from the same vendor, it is not possible to VMotion from AMD processors to Intel processors or vice versa. Will it ever happen? Anyone want to place bets?

 

 

VSS Quiescing for Windows Applications:

 

VCB is possibly the least understood component of VMware Infrastructure, so one might wonder why I’m making a big deal about VCB 1.5.


VCB 1.5 now uses new components inside the updated VMware Tools package to provide application-level quiescing (for Windows 2003 VMs) in addition to filesystem-level quiescing. This means that, if you are using backup products integrated with VCB to back up virtual machines, the snapshots of those virtual machines will now be application-consistent, provided the applications running inside support VSS. Significant performance optimizations also make the backup process much faster with this version of VCB.

 

 

Monitoring and availability enhancements

 

With Update 2, VMware ESXi (yes, the free one) has been enriched to provide better hardware health information for QLogic, Emulex, and LSI components in your server, as well as additional asset information used by HP Insight Manager. Overall, this augments the manageability of ESXi as we continue to deepen the information it can gather about the underlying hardware. Some of our larger customers are now standardizing on ESXi because they love its small footprint and low maintenance (they’re standardizing on ESXi as the hypervisor, with VMware Infrastructure still providing VMotion, DRS, HA, etc.). Now the enhanced manageability has taken away all barriers for them!

 

With VMware Infrastructure 3.5, we introduced, as an experimental feature, virtual machine failure monitoring with VMware HA. This feature uses VMware Tools to monitor the operating system inside the virtual machine and can be configured to restart the VM in the event of failures or crashes of the OS. With Update 2, we now fully support this feature!


Why am I picking out this one as a notable feature? Well, we got a lot of bad press about releasing experimental features with 3.5 and one by one, we’re moving to full support on them. Slowly but surely.

 

 

VMware Infrastructure 3.5 Update 2 has many more interesting features, such as live cloning for virtual machines, guided consolidation enhancements, and the much-awaited support for Windows 2008 editions. Read the release notes to find out more!