
Monthly Archives: November 2011

New Fling to help with migrating to ESXi

Kyle Gleed, Sr. Technical Marketing Manager, VMware

For folks who are still running the classic ESX hypervisor there is a new fling available to help smooth your migration to ESXi.  The ESX System Analyzer scans existing ESX hosts looking for issues that could affect your transition to ESXi.  It produces a nice report that you can use to assess your readiness and identify any potential risks that you should be aware of before moving to ESXi.  Check it out, I think you'll like it. 

The ESX System Analyzer is a tool designed to help administrators plan a migration from ESX to ESXi. It analyzes the ESX hosts in your environment and, for each host, collects information on factors that pertain to the migration process:

  • Hardware compatibility with ESXi
  • VMs registered on the ESX host, as well as VMs located on the host’s local disk
  • Modifications to the Service Console
    • RPMs which have been added or removed
    • Files which have been added
    • Users and cronjobs which have been added

This tool also provides summary information for the whole existing environment:

  • Version of VMware Tools and Virtual Hardware for all VMs
  • Version of Filesystem for all datastores

By having this information, administrators can determine what tasks need to be done prior to the migration. Examples include:

  • Relocate VMs from local datastores to shared datastores
  • Make note of what agent software has been added to the host and obtain the equivalent agentless version
  • Replace cronjobs with equivalent remote scripts written with PowerCLI or vCLI
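On the last point, a job that used to run from the Service Console cron can usually be replaced by a script run remotely from a vCLI or vMA system. A minimal sketch, assuming vCLI 5.0 syntax and a placeholder hostname:

  # Run from a vCLI/vMA machine instead of on the host itself; prompts for credentials if not supplied
  esxcli --server esxi01.example.com --username root storage filesystem list

PowerCLI offers equivalent cmdlets if a Windows scripting host is preferred.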

 

Protecting your ESXi Images using VIB Acceptance Levels

Kyle Gleed, Sr. Technical Marketing Manager, VMware

Duncan Epping recently posted some great info on creating custom VIB files (How to create your own .vib files and Some more nuggets about handling vib files).  With custom VIBs making their way into the community, this got me thinking that a quick refresher on VIB security would be helpful.  A while back I posted a VIB overview blog in which I discussed how signature files are used not only to help identify whether a VIB is officially supported, but also to protect against any malicious tampering of the VIB's contents.  While custom VIBs definitely have their place (see KB 2007381), do exercise caution when adding them to your ESXi image profiles.  Here's a quick recap of the section on VIB security:

The signature file is an electronic signature used to verify the level of trust associated with the VIB.  The acceptance level not only helps protect the integrity of the VIB, but it also identifies who created the VIB and the amount of testing and verification that has been done. There are four acceptance levels:

  • VMwareCertified:  VIBs created and tested by VMware.  VMware Certified VIBs undergo thorough testing by VMware.
  • VMwareAccepted:  VIBs created by VMware partners and approved by VMware.  VMware relies on the partners to perform the testing, but VMware verifies the results.
  • PartnerSupported:  VIBs created and tested by a trusted VMware partner.  The partner performs all testing.  VMware does not verify the results.
  • CommunitySupported:  VIBs created by individuals or partners outside of the VMware partner program.  These VIBs do not undergo any VMware or trusted partner testing and are not supported by VMware or its partners. 

All VMware and partner supported VIBs must be signed by a VMware trusted authority. This helps ensure the security of the VIB by preventing any unauthorized tampering with its contents.  Community supported VIBs do not need to be signed, but they are still required to have an empty signature file.  Be careful when using CommunitySupported VIBs, as their contents are not tested, monitored, or controlled.

Coinciding with the VIB acceptance levels, ESXi Image Profiles also have an acceptance level.  When the image is created it is assigned one of the four acceptance levels.  Any VIBs added to the image must be at the same acceptance level or higher.  This helps ensure that non-supported VIBs don’t get mixed in with supported VIBs when creating and maintaining ESXi images.
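For reference, the acceptance level of the host image profile and of each installed VIB can be checked from the ESXi 5.0 shell; a quick sketch (the PartnerSupported value is just an example):

  # Show the acceptance level currently enforced for the image profile
  esxcli software acceptance get
  # Change the enforced level, for example to allow partner-supported VIBs
  esxcli software acceptance set --level=PartnerSupported
  # List installed VIBs; the output includes each VIB's acceptance level
  esxcli software vib list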

VDS Best Practices – Rack Server Deployment with Eight 1 Gigabit adapters (Part 3 of 6)

Rack Server in Example Deployment

After looking at the major components in the example deployment and the key virtual and physical switch parameters, let's take a look at the different types of servers that customers can have in their environment. Customers deploy ESXi hosts on either rack servers or blade servers. This section discusses deployments where the ESXi host is running on a rack server. Two types of rack server configurations are described in the following sections:

  • Rack Server with Eight 1 Gigabit Ethernet network adapters
  • Rack Server with Two 10 Gigabit Ethernet network adapters

For each of the above two configurations, the different VDS design approaches will be discussed.

Rack Server with Eight 1 Gigabit Ethernet network adapters

In a rack server deployment with eight 1 Gigabit Ethernet network adapters per host, customers can either use the traditional static design approach of allocating network adapters to each traffic type or make use of advanced VDS features such as Network I/O Control (NIOC) and Load Based Teaming (LBT). The NIOC and LBT features help provide a dynamic design that utilizes I/O resources efficiently. In this section both the traditional and the new design approaches are described along with their pros and cons.

 

Design Option 1 – Static configuration

This design option follows the traditional approach of statically allocating network resources to the different virtual infrastructure traffic types. As shown in Figure 1, each host has eight Ethernet network adapters; four of them are connected to the first access layer switch and the other four are connected to the second access layer switch to avoid a single point of failure. Let's take a detailed look at how the VDS parameters are configured.


Figure 1 Rack Server with eight 1 Gigabit Ethernet network adapters

dvuplink configuration

To support the maximum of eight 1 Gigabit Ethernet network adapters per host, the dvuplink port group is configured with eight dvuplinks (dvuplink1 through dvuplink8). On each host, dvuplink1 is associated with vmnic0, dvuplink2 with vmnic1, and so on. It is a recommended practice to rename the dvuplinks to something meaningful and easy to track. For example, dvuplink1, which gets associated with a vmnic on the motherboard, can be renamed "LOM-uplink1".

If the hosts have some Ethernet network adapters as LAN On Motherboard (LOM) ports and some on expansion cards, then for better resiliency VMware recommends selecting one network adapter from the LOM and one from an expansion card when configuring NIC teaming. To configure this teaming on a VDS, administrators have to pay attention to the dvuplink-to-vmnic association along with the dvportgroup configuration where NIC teaming is enabled. In the NIC teaming configuration of a dvportgroup, administrators choose which dvuplinks are part of a team. If the dvuplinks are named appropriately according to the host vmnic association, administrators can select "LOM-uplink1" and "Expansion-uplink1" while configuring the teaming option for a dvportgroup.
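Before renaming the dvuplinks, it helps to confirm which vmnics are LOM ports and which sit on expansion cards. A minimal sketch from the ESXi 5.0 shell:

  # Lists each vmnic with its PCI address, driver, link state and description,
  # which helps tell LOM ports apart from expansion-card ports
  esxcli network nic list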

 dvportgroups configuration

As described in Table 1, there are five different portgroups configured for the five different traffic types. Customers can create up to 5000 unique portgroups per VDS. In this example deployment, the decision to create different portgroups is based on the number of traffic types.

According to Table 1, dvportgroup PG-A is created for the management traffic type. There are other dvportgroups defined for the other traffic types. Following are the key configurations of dvportgroup PG-A:

  • Teaming Option: Explicit failover order provides a deterministic way of directing traffic to a particular uplink. By selecting dvuplink1 as the active uplink and dvuplink2 as the standby uplink, management traffic is carried over dvuplink1 unless there is a failure on dvuplink1. Note that all other dvuplinks are configured as unused. It is also recommended to set the failback option to "No" to avoid flapping of traffic between the two network adapters. The failback option determines how a physical adapter is returned to active duty after recovering from a failure. If failback is set to No, a failed adapter is left inactive even after recovery, until another currently active adapter fails and requires its replacement.
  • VMware recommends isolating all traffic types from each other by defining a separate VLAN for each dvportgroup.
  • There are several other parameters that are part of the dvportgroup configuration. Customers can choose to configure these parameters based on their environment needs. For example, customers can configure PVLANs to provide isolation when there are limited VLANs available in the environment.

As you follow the dvportgroup configuration in Table 1, you can see that each traffic type is carried over a specific dvuplink, except the virtual machine traffic type, which has two active uplinks: dvuplink7 and dvuplink8. These two links are utilized through the Load Based Teaming (LBT) algorithm. As mentioned earlier, the LBT algorithm is much more efficient at utilizing link bandwidth than the standard hashing algorithms.

Table 1 Static Design configuration

Traffic Type    | Port Group | Teaming Option    | Active Uplink       | Standby Uplink | Unused Uplink
Management      | PG-A       | Explicit Failover | dvuplink1           | dvuplink2      | 3,4,5,6,7,8
vMotion         | PG-B       | Explicit Failover | dvuplink3           | dvuplink4      | 1,2,5,6,7,8
FT              | PG-C       | Explicit Failover | dvuplink4           | dvuplink3      | 1,2,5,6,7,8
iSCSI           | PG-D       | Explicit Failover | dvuplink5           | dvuplink6      | 1,2,3,4,7,8
Virtual Machine | PG-E       | LBT               | dvuplink7/dvuplink8 | None           | 1,2,3,4,5,6

 

Physical switch configuration

The external physical switches, to which the rack servers' network adapters are connected, are configured as trunks with all the appropriate VLANs enabled. As described in the physical network switch parameters section, the following switch configurations are performed based on the VDS setup described in Table 1.

  • Enable STP on the ESXi host-facing trunk ports, along with "portfast" mode and "BPDU guard".
  • The teaming configuration on VDS is static, and thus no link aggregation is configured on the physical switches.
  • Because of the mesh topology deployment shown in Figure 1, the link state tracking feature is not required on the physical switches.

In this design approach, resiliency for the infrastructure traffic is achieved through Active-Standby uplinks, and security is accomplished by providing separate physical paths for the different traffic types. However, with this design the I/O resources are underutilized because the dvuplink2 and dvuplink6 standby links are not used to send or receive traffic. Also, there is no flexibility to allocate more bandwidth to a traffic type when it needs it.

There is another variation of the static design approach that addresses some customers' need for higher bandwidth for the storage and vMotion traffic types. In the static design described earlier, iSCSI and vMotion traffic is limited to 1 Gigabit. If customers want to support higher bandwidth for iSCSI, they can make use of an iSCSI multipathing solution. Also, with the release of vSphere 5, vMotion traffic can be carried over multiple Ethernet network adapters through the support of multi-NIC vMotion, thus providing higher bandwidth to the vMotion process.

For more details on how to set up iSCSI multipathing, please refer to the vSphere Storage guide at https://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html. The configuration of multi-NIC vMotion is quite similar to the iSCSI multipath setup: administrators create two separate vmkernel interfaces and bind each one to a separate dvportgroup. The two separate dvportgroup configurations provide connectivity to two different Ethernet network adapters or dvuplinks.
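For reference, once the two iSCSI vmkernel interfaces and their dvportgroups are in place, the port binding is done against the software iSCSI adapter. A hedged sketch, assuming the software iSCSI adapter is vmhba33 and the interfaces are vmk1 and vmk2 (ESXi 5.0 syntax):

  # Bind each iSCSI vmkernel interface to the software iSCSI adapter
  esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
  esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2
  # Verify the bindings
  esxcli iscsi networkportal list --adapter=vmhba33

Multi-NIC vMotion needs no equivalent binding step; vMotion is simply enabled on each of the two vmkernel interfaces.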

Table 2 Static design configuration with iSCSI multipathing and multi-NIC vMotion

Traffic Type    | Port Group | Teaming Option    | Active Uplink       | Standby Uplink | Unused Uplink
Management      | PG-A       | Explicit Failover | dvuplink1           | dvuplink2      | 3,4,5,6,7,8
vMotion         | PG-B1      | None              | dvuplink3           | dvuplink4      | 1,2,5,6,7,8
vMotion         | PG-B2      | None              | dvuplink4           | dvuplink3      | 1,2,5,6,7,8
FT              | PG-C       | Explicit Failover | dvuplink2           | dvuplink1      | 3,4,5,6,7,8
iSCSI           | PG-D1      | None              | dvuplink5           | None           | 1,2,3,4,6,7,8
iSCSI           | PG-D2      | None              | dvuplink6           | None           | 1,2,3,4,5,7,8
Virtual Machine | PG-E       | LBT               | dvuplink7/dvuplink8 | None           | 1,2,3,4,5,6

 

As shown in Table 2, there are two entries each for the vMotion and iSCSI traffic types, listing the additional dvportgroup configuration required to support the multi-NIC vMotion and iSCSI multipathing processes. For multi-NIC vMotion, dvportgroups PG-B1 and PG-B2 are configured with dvuplink3 and dvuplink4 as their active links, respectively. For iSCSI multipathing, dvportgroups PG-D1 and PG-D2 are connected to dvuplink5 and dvuplink6 as their active links, respectively. Load balancing across the multiple dvuplinks is performed by the multipathing logic in the iSCSI process and by the ESXi platform in the vMotion process. It is not required to configure teaming policies for these dvportgroups.

The dvportgroup configuration approach for the FT, Management, and Virtual Machine traffic types, as well as the physical switch configuration, remains the same as described in design option 1 in the previous section.

This static design approach improves on the first design by using advanced capabilities such as iSCSI multipathing and multi-NIC vMotion. But at the same time this option has the same challenges related to underutilized resources and inflexibility in allocating additional resources on the fly to different traffic types.

Design Option 2 – Dynamic configuration with NIOC and LBT

After looking at the traditional design approach with static uplink configurations, let’s take a look at the VMware recommended design option that takes advantage of the advanced VDS features such as NIOC and LBT.

In this design the connectivity to the physical network infrastructure remains the same as described in the static design option, but instead of allocating specific dvuplinks to individual traffic types, the ESXi platform utilizes those dvuplinks dynamically. To illustrate this dynamic design, each virtual infrastructure traffic type's bandwidth utilization is estimated. In a real deployment, customers should first monitor the virtual infrastructure traffic over a period of time to gauge the bandwidth utilization, and then come up with bandwidth numbers for each traffic type.

Following are some bandwidth numbers estimated per traffic type:

  • Management Traffic (< 1 Gig)
  • vMotion (1 Gig)
  • FT (1 Gig)
  • iSCSI (1 Gig)
  • Virtual Machine (2 Gig)

Based on this bandwidth information, administrators can provision appropriate I/O resources to each traffic type using the NIOC feature of VDS. Let's take a look at the VDS parameter configuration for this design as well as the NIOC setup. The dvuplink portgroup configuration remains the same: eight dvuplinks are created for the eight 1 Gigabit Ethernet network adapters. The dvportgroup configuration is as follows.

dvportgroups configuration

In this design all dvuplinks are active; there are no standby or unused uplinks, as shown in Table 3. All dvuplinks are thus available for use by the teaming algorithm. Following are the key parameter configurations of dvportgroup PG-A:

  • Teaming Option: Load Based Teaming is selected as the teaming algorithm. With the LBT configuration, management traffic is initially scheduled based on the virtual port ID hash. Depending on the hash output, the management traffic is sent out over one of the dvuplinks. Other traffic types in the virtual infrastructure can also be scheduled on the same dvuplink initially. However, when the utilization of the dvuplink goes beyond the 75% threshold, the LBT algorithm is invoked and some of the traffic is moved to other, underutilized dvuplinks. It is therefore possible that management traffic will be moved to other dvuplinks when such an LBT event occurs.
  • Failback: Failback means moving traffic from a standby uplink back to the active uplink once the active uplink recovers from a failure, so it only applies when there are Active and Standby dvuplink configurations. In this design there are no standby dvuplinks; when an active uplink fails, the traffic flowing on it is moved to another working dvuplink, and if the failed dvuplink comes back, the LBT algorithm will schedule new traffic on it. This option is left at its default.
  • VMware recommends isolating all traffic types from each other by defining a separate VLAN for each dvportgroup.
  • There are several other parameters that are part of the dvportgroup configuration. Customers can choose to configure these parameters based on their environment needs. For example, customers can configure PVLANs to provide isolation when there are limited VLANs available in the environment.

As you follow the dvportgroup configuration in Table 3, you can see that each traffic type has all dvuplinks active, and these links are utilized through the Load Based Teaming (LBT) algorithm. Let's now look at the NIOC configuration described in the last two columns of Table 3.

The Network I/O Control (NIOC) configuration in this design helps provide the appropriate I/O resources to the different traffic types. Based on the previously estimated bandwidth numbers per traffic type, the shares parameter is configured in the NIOC Shares column of Table 3. The share values specify the relative importance of a specific traffic type, and NIOC ensures that during contention scenarios on the dvuplinks each traffic type gets its allocated bandwidth. For example, a share configuration of 10 each for vMotion, iSCSI, and FT allocates equal bandwidth to these traffic types, while the virtual machines get the highest bandwidth with 20 shares and Management gets a lower bandwidth with 5 shares.

To illustrate how share values translate to bandwidth numbers, let's take the example of a 1 Gigabit dvuplink carrying all five traffic types. This is a worst-case scenario where all traffic types are mapped to one dvuplink; it will not happen when customers enable the LBT feature, because LBT balances the traffic based on the utilization of the uplinks. This example shows how much bandwidth each traffic type will be allowed on one dvuplink during a contention or oversubscription scenario when LBT is not enabled.

  • Management: 5 shares;        (5/55) * 1 Gigabit = 90.91 Mbps
  • vMotion: 10 shares;          (10/55) * 1 Gigabit = 181.82 Mbps
  • FT: 10 shares;               (10/55) * 1 Gigabit = 181.82 Mbps
  • iSCSI: 10 shares;            (10/55) * 1 Gigabit = 181.82 Mbps
  • Virtual Machine: 20 shares;  (20/55) * 1 Gigabit = 363.64 Mbps
  • Total shares: 5 + 10 + 10 + 10 + 20 = 55

To calculate the bandwidth numbers during contention, first calculate the percentage of bandwidth for a traffic type by dividing its share value by the total number of shares (55). Then multiply the total bandwidth of the dvuplink (1 Gigabit) by the percentage calculated in the first step. For example, the 5 shares allocated to management traffic translate to 90.91 Mbps of bandwidth for the management process on a fully utilized 1 Gigabit network adapter. In this example a custom share configuration is discussed, but customers can also make use of the predefined High (100), Normal (50), and Low (25) share values when assigning shares to different traffic types.
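A few lines of shell make it easy to re-run that arithmetic for any set of share values; the numbers below are the ones assumed in Table 3:

  total=55
  for entry in Management:5 vMotion:10 FT:10 iSCSI:10 VirtualMachine:20; do
    name=${entry%%:*}; share=${entry##*:}
    # bandwidth on a saturated 1 Gigabit (1000 Mbps) uplink = share / total shares * 1000
    awk -v n="$name" -v s="$share" -v t="$total" 'BEGIN { printf "%-15s %7.2f Mbps\n", n, s * 1000 / t }'
  done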

The vSphere platform takes these configured share values and applies them per uplink. The schedulers running at each uplink are responsible for making sure that the bandwidth resources are allocated according to the shares. In an eight 1 Gigabit Ethernet network adapter deployment, there are eight schedulers running. Depending on the number of traffic types scheduled on a particular uplink, the scheduler divides the bandwidth among the traffic types based on the share numbers. For example, if only FT (10 shares) and Management (5 shares) traffic are flowing through dvuplink5, then based on the share values FT traffic will get double the bandwidth of the management traffic. Also, when there is no management traffic flowing, all the bandwidth can be utilized by the FT process. This flexibility in allocating I/O resources is the key benefit of the NIOC feature.

The NIOC Limits parameter in Table 3 is not configured in this design. The limit value specifies an absolute maximum on egress traffic for a traffic type, in Mbps. This configuration provides a hard cap on a traffic type even if I/O resources are available. It is not recommended to use the limit configuration unless you really want to restrict a traffic type even though additional resources are available.

There is no change in physical switch configuration in this design approach even with the choice of the new LBT algorithm. The LBT teaming algorithm doesn’t require any special configuration on physical switches. Please refer to the physical switch settings described in design option 1.

Table 3 Dynamic design configuration with NIOC and LBT

Traffic Type    | Port Group | Teaming Option | Active Uplink   | Standby Uplink | NIOC Shares | NIOC Limits
Management      | PG-A       | LBT            | 1,2,3,4,5,6,7,8 | None           | 5           | -
vMotion         | PG-B       | LBT            | 1,2,3,4,5,6,7,8 | None           | 10          | -
FT              | PG-C       | LBT            | 1,2,3,4,5,6,7,8 | None           | 10          | -
iSCSI           | PG-D       | LBT            | 1,2,3,4,5,6,7,8 | None           | 10          | -
Virtual Machine | PG-E       | LBT            | 1,2,3,4,5,6,7,8 | None           | 20          | -

 

One thing to note about this design is that it doesn't provide more than 1 Gigabit of bandwidth to the vMotion and iSCSI traffic types, as the static design using multi-NIC vMotion and iSCSI multipathing does. The Load Based Teaming algorithm cannot split one traffic type across multiple dvuplink ports to utilize all the links. So even if the vMotion dvportgroup PG-B has all eight 1 Gigabit Ethernet network adapters as active uplinks, vMotion traffic will be carried over only one of the eight uplinks. The main advantage of this design is in the scenarios where the vMotion process is not using the uplink bandwidth and other traffic types are in need of additional resources. In these situations NIOC makes sure that the unused bandwidth is allocated to the other traffic types that need it.

This dynamic design option is the recommended approach because it takes advantage of the advanced VDS features and utilizes I/O resources efficiently. This option also provides Active-Active resiliency, where no uplinks are in standby mode. In this design approach, customers allow the vSphere platform to make the optimal decisions on scheduling traffic across multiple uplinks.

Some customers who have restrictions in the physical infrastructure, in terms of bandwidth capacity across different paths and limited availability of the layer 2 domain, might not be able to take advantage of this dynamic design option. While deploying this dynamic option, it is important to consider all the different paths a traffic type can take and make sure that the physical switch infrastructure can support the specific characteristics required for each traffic type. VMware recommends that vSphere and network administrators work together to understand the impact of this dynamic traffic scheduling on the physical network infrastructure before deploying this approach.

Every customer environment is different, and the requirements for the traffic types are also different. Depending on the needs of the environment, customers can modify these design options to fit their specific requirements. For example, customers can choose to use a combination of the static and dynamic design options when they need higher bandwidth for iSCSI and vMotion activities. In this hybrid design, four uplinks are statically allocated to the iSCSI and vMotion traffic types while the remaining four uplinks are used dynamically for the remaining traffic types. Table 4 below shows the traffic types and the associated port group configuration for the hybrid design.

Table 4 Hybrid design configuration

Traffic Type    | Port Group | Teaming Option | Active Uplink | Standby Uplink | NIOC Shares | NIOC Limits
Management      | PG-A       | LBT            | 1,2,3,4       | None           | 5           | -
vMotion         | PG-B1      | None           | 5             | 6              | -           | -
vMotion         | PG-B2      | None           | 6             | 5              | -           | -
FT              | PG-C       | LBT            | 1,2,3,4       | None           | 10          | -
iSCSI           | PG-D1      | None           | 7             | None           | -           | -
iSCSI           | PG-D2      | None           | 8             | None           | -           | -
Virtual Machine | PG-E       | LBT            | 1,2,3,4       | None           | 20          | -

 

In the next blog entry I will discuss the Rack server deployment with two 10 Gigabit network adapters.

 

DELL’s Multipath Extension Module for EqualLogic now supports vSphere 5.0

DELL recently released their new Multipath Extension Module (MEM) for the EqualLogic PS Series of storage arrays. This updated MEM now supports vSphere 5.0.

I guess I should try to explain what a MEM is before going any further. VMware implements a Pluggable Storage Architecture (PSA) model in the VMkernel. This means that storage array vendors can write their own multipathing modules to plug into the VMkernel I/O path. These plugins can co-exist alongside VMware's own default set of modules. There are different modules for different tasks in the PSA. For instance, the specific details of handling path failover for a given storage array are delegated to the Storage Array Type Plugin (SATP), which is associated with paths. The specific details of determining which physical path is used to issue an I/O request (load balancing) to a storage device are handled by a Path Selection Plugin (PSP), which is associated with logical devices. The SATP and PSP are both MEMs (Multipath Extension Modules).

DELL’s MEM is actually a PSP. This means that it will take care of load balancing of I/O requests across all paths to the PS series arrays. DELL created a good Technical Report (TR) on their MEM which can be found here.

I spoke with Andrew McDaniel, one of DELL’s Lead Architects for VMware based in Ireland, and he was able to supply me with some additional information about this MEM. Firstly, since the MEM is essentially a PSP, devices from the EqualLogic array continue to use the Native Multipath Plugin (NMP) from VMware. This handles basic tasks like loading and unloading of MEMs, path discovery and removal, device bandwidth sharing between VMs, etc.

Any ESXi host with the DELL MEM installed will now have an additional Path Selection Policy. VMware ships ESXi with 3 default PSPs, and the DELL MEM makes up the fourth one. The list of installed PSPs can be shown via the command: esxcli storage nmp psp list

[Screenshot: output of esxcli storage nmp psp list]

As you can see, the three standard PSPs are shown (VMW_PSP_MRU, VMW_PSP_RR, and VMW_PSP_FIXED). The additional PSP is from DELL: DELL_PSP_EQL_ROUTED.

You might ask why you would need this additional PSP on top of the default ones from VMware. Well, the VMware PSPs are not optimized on a per-array basis. Yes, they will work just fine, but they do not understand the behaviour of each different back-end array. Therefore their behaviour is what could be described as generic.

DELL’s MEM module has been developed by DELL’s own engineering team who understand the intricacies of the EqualLogic array and can therefore design their MEM to perform optimally when it comes to load balancing/path selection.
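DELL's setup utility normally takes care of registering the PSP as the default for EqualLogic devices, but the claim side can also be inspected or adjusted with esxcli. A hedged sketch using ESXi 5.0 syntax (the NAA ID is a made-up example):

  # Make DELL_PSP_EQL_ROUTED the default PSP for the EqualLogic SATP
  esxcli storage nmp satp set --satp=VMW_SATP_EQL --default-psp=DELL_PSP_EQL_ROUTED
  # Or change the PSP for a single device (example NAA ID)
  esxcli storage nmp device set --device=naa.6090a01830544ef3a3d2c48a3b80f5e2 --psp=DELL_PSP_EQL_ROUTED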

If we take a look at one of the LUNs from the EqualLogic array using the esxcli storage nmp device list command, we can see which PSP and SATP are associated with that device and its paths:

[Screenshot: output of esxcli storage nmp device list]

Here we can see both the SATP and the PSP that the device is using, as well as the number of working paths.  The SATP, VMW_SATP_EQL, is a VMware default for EqualLogic arrays. And of course the PSP is DELL_PSP_EQL_ROUTED. The 'ROUTED' part refers to DELL's MEM being able to intelligently route I/O requests to the array path best suited to handle the request.

What are those ‘does not support device configuration’ messages? These are nothing to worry about. Some MEMs support configuration settings like preferred path, etc. These messages simply mean there are no configuration settings for this MEM.

The other nice part of DELL's MEM is that it includes a setup script which will prompt for all relevant information, including vSwitch, uplinks, and IP addresses, and correctly set up a vSwitch for iSCSI and heartbeating, saving you a lot of time and effort. Nice job, DELL!

If you are a DELL EqualLogic customer, you should definitely check this out. Simply log in to https://support.equallogic.com and go to the VMware Integration section under Downloads. You will need a customer login to do this.

You should also be aware that vSphere 5.0 had an issue with slow boot times when iSCSI is configured. To learn more, refer to this blog post and referenced KB article.

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

VDS Best Practices – Virtual and Physical Switch Parameters (Part 2 of 6)

Important virtual and physical switch parameters

Before diving into the different design options around the example deployment, let’s take a look at the VDS (virtual) and physical network switch parameters that should be considered in all these design options. These are some key parameters that vSphere and network administrators have to take into account while designing VMware virtual networking. As the configuration of virtual networking goes hand in hand with physical network configuration, this section will cover both the VDS and Physical switch parameters.

VDS parameters

VDS simplifies the challenges of the configuration process by providing one single pane of glass to perform virtual network management tasks. As opposed to configuring vSphere standard switches (VSS) on individual hosts, administrators can configure and manage one single vSphere distributed switch. All centrally configured network policies on VDS get pushed down to the host automatically when the host gets added to the distributed switch. In this section an overview of key VDS parameters is provided.

Host Uplink Connections (vmnics) and dvuplink parameter

VDS introduces a new abstraction for the physical Ethernet network adapters (vmnics) on each host, called dvuplinks, which are defined during the creation of the VDS. All the properties, including NIC teaming, load balancing, and failover policies, on the VDS and dvportgroups are applied to dvuplinks and not to vmnics on individual hosts. When a host is added to the VDS, each vmnic on the host is mapped to a dvuplink. This provides the advantage of consistently applying the teaming and failover configurations to all the hosts irrespective of how the dvuplink and vmnic assignments are made.

Figure 1 below shows two ESXi hosts with four Ethernet network adapters each. When these hosts are added to a VDS with four dvuplinks configured on the dvuplink portgroup, administrators have to assign the network adapters (vmnics) of the hosts to dvuplinks. To illustrate the mapping of dvuplinks to vmnics, Figure 1 shows one type of mapping, where each ESXi host's vmnic0 is mapped to dvuplink1, vmnic1 to dvuplink2, and so on. Customers can choose a different mapping if required, where vmnic0 is mapped to a different dvuplink instead of dvuplink1. VMware recommends having consistent mapping across different hosts because it reduces complexity in the environment.


Figure 1 dvuplink to vmnic mapping

As a best practice, customers should also try to deploy hosts with the same number of physical Ethernet network adapters and with similar port speeds. Also, as the number of dvuplinks configured on the VDS depends on the maximum number of physical Ethernet network adapters on a host, administrators should take that into account during dvuplink portgroup configuration. Customers always have the option to modify this dvuplink configuration based on new hardware capabilities.
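Once a host has been added to the VDS, the dvuplink-to-vmnic association can be double-checked from the host itself; for example, on ESXi 5.0:

  # Shows the distributed switches this host participates in, including the
  # configured uplink ports and which vmnics are attached to them
  esxcli network vswitch dvs vmware list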

Traffic Types and dvportgroup parameters

Similar to portgroups on standard switches, dvportgroups define how the connection is made through the VDS to the network. The VLAN ID, traffic shaping, port security, teaming and load balancing parameters are configured on these dvportgroups. The virtual ports (dvports) connected to a dvportgroup share the same properties configured on a dvportgroup. When customers want a group of virtual machines to share the security and teaming policies, they have to make sure the virtual machines are part of one dvportgroup. Customers can choose to define different dvportgroups based on the different traffic types they have in their environment or based on the different tenants or applications they support in the environment.

 In this example deployment, the dvportgroup classification is based on the traffic types running in the virtual infrastructure. Once administrators understand the different traffic types in the virtual infrastructure and identify specific security, reliability and performance requirements for individual traffic types, the next step is to create unique dvportgroups associated with each traffic type. As mentioned earlier, the dvportgroup configuration defined at VDS level is automatically pushed down to every host that is added to the VDS. For example, in Figure 1, you can see that the two dvportgroup PG-A (Yellow) and PG-B (Green) defined at the distributed switch level are available on each of the ESXi host that is part of that VDS.

 

dvportgroup specific configuration

Once customers decide on the number of unique dvportgroups they want to create in their environment, they can start configuring those dvportgroups. The configuration options/parameters are similar to those available with port groups on vSphere standard switches. There are some additional options available on VDS dvportgroup that are related to teaming setup. These new options are not available on vSphere standard switches. Customers can configure the following key parameters for each dvportgroup.

  • Number of virtual ports (dvports)
  • Port binding (static, dynamic, ephemeral)
  • VLAN Trunking/Private VLANs
  • Teaming and Load Balancing along with Active and Standby Links
  • Bi-directional traffic shaping parameters
  • Port Security

As part of the teaming algorithm support, VDS provides a unique approach to load balancing traffic across the teamed network adapters. This approach is called Load Based Teaming (LBT), and it distributes traffic across the network adapters based on the percentage utilization of traffic on those adapters. The LBT algorithm works on both the ingress and egress directions of the network adapter traffic, as opposed to the hashing algorithms, which work only in the egress direction (traffic flowing out of the network adapter). Also, LBT prevents the worst-case scenario that can happen with hashing algorithms, where all traffic hashes to one network adapter of the team and the other network adapters do not carry any traffic. To improve the utilization of all the links/network adapters, VMware recommends the use of this advanced VDS feature (LBT). The LBT approach is recommended over EtherChannel on the physical switches with route based on IP hash configuration on the virtual switch.

Port security policies at the port group level protect customers from certain behaviors that could compromise security. For example, a hacker could impersonate a virtual machine and gain unauthorized access by spoofing the virtual machine's MAC address. VMware recommends setting MAC Address Changes and Forged Transmits to "Reject" to help protect against attacks launched by a rogue guest operating system. Set Promiscuous Mode to "Reject" unless customers want to monitor the traffic for network troubleshooting or intrusion detection purposes.
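For a dvportgroup these policies are set centrally through vCenter. For comparison, on a vSphere standard switch the same three policies can be set from the ESXi 5.0 shell; a rough sketch (vSwitch0 is just an example name):

  esxcli network vswitch standard policy security set --vswitch-name=vSwitch0 \
      --allow-mac-change=false --allow-forged-transmits=false --allow-promiscuous=false
  # Confirm the result
  esxcli network vswitch standard policy security get --vswitch-name=vSwitch0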

NIOC

Network I/O Control (NIOC) is the traffic management capability available on VDS. The NIOC concept revolves around resource pools that are similar in many ways to the ones that exist for CPU and memory. vSphere and network administrators can now allocate I/O shares to different traffic types, similar to allocating CPU and memory resources to a VM. The share parameter specifies the relative importance of a traffic type over other traffic types, and provides a guaranteed minimum when the different traffic types compete for a particular network adapter. The shares are specified in abstract units from 1 to 100. Customers can provision shares to different traffic types based on the amount of resources each traffic type requires.

This capability of provisioning I/O resources is very useful in situations where there are multiple traffic types competing for resources. For example, in a deployment where the vMotion and VM traffic types flow through one network adapter, it is possible that vMotion activity can impact the virtual machine traffic performance. In this situation, the shares configured in NIOC provide the required isolation between the vMotion and VM traffic types and prevent one flow (traffic type) from dominating the other. NIOC configuration provides one more parameter that customers can utilize if they want to put limits on a particular traffic type. This parameter is called the Limit. The Limit configuration specifies the absolute maximum bandwidth for a traffic type on a host, and is specified in Mbps. The NIOC limits and shares parameters only work on outbound traffic, i.e., traffic that is flowing out of the ESXi host.

VMware recommends that customers utilize this traffic management feature whenever they have multiple traffic types flowing through one network adapter. This situation is more prominent with 10 Gigabit Ethernet network deployments but can happen in 1 Gigabit Ethernet network deployments as well. The common use case for NIOC in a 1 Gigabit network adapter deployment is when the traffic from different workloads or different customer VMs is carried over the same network adapter. As multiple workload traffic flows through a network adapter, it becomes important to provide I/O resources based on the needs of the workload. With the release of vSphere 5, customers can now make use of the new user-defined network resource pools capability and allocate I/O resources to the different workloads or different customer VMs depending on their needs. This user-defined network resource pool feature provides granular control in allocating I/O resources and meeting the SLA requirements for virtualized tier 1 workloads.

 Bi-directional traffic shaping

Apart from NIOC, there is one more traffic-shaping feature available in the vSphere platform. This traffic shaping can be configured at the dvportgroup or dvport level. Customers can shape both inbound and outbound traffic using three parameters: average bandwidth, peak bandwidth, and burst size. Customers who want more granular traffic-shaping controls to manage their traffic types can take advantage of this VDS capability along with the NIOC feature. It is recommended to involve the network administrators in your organization while configuring these granular traffic parameters. These controls only make sense when there are oversubscription scenarios that are causing network performance issues. These oversubscription scenarios can be caused by an oversubscribed physical switch infrastructure or virtual infrastructure, so it is very important to understand the physical and virtual network environment before making any bi-directional traffic-shaping configurations.

 

Physical Network switch parameters

The configuration of the VDS and the physical network switches should go hand in hand to provide resilient, secure, and scalable connectivity to the virtual infrastructure. The following are some key switch configuration parameters customers should pay attention to.

VLAN

If VLANs are used to provide logical isolation between different traffic types it is important to make sure that those VLANs are carried over to the Physical switch infrastructure. To do so, enable VST (Virtual switch tagging) on the virtual switch, and trunk all VLANs to the physical switch ports.

Spanning Tree Protocol (STP)

Spanning Tree protocol is not supported on virtual switches and thus no configuration is required on VDS. But it is important to enable this protocol on the physical switches. STP makes sure that there are no loops in the network. As a best practice, customer should configure the following.

  • Use "portfast" on the ESXi host-facing physical switch ports. With this setting, network convergence on these switch ports happens quickly after a failure because the port enters the Spanning Tree forwarding state immediately, bypassing the listening and learning states.
  • Use "BPDU guard" to enforce the STP boundary. This configuration protects against an invalid device being connected to the ESXi host-facing access switch ports. As mentioned earlier, VDS doesn't support Spanning Tree Protocol and thus doesn't send any Bridge Protocol Data Unit (BPDU) frames to the switch port. However, if any BPDU is seen on these ESXi host-facing access switch ports, the BPDU guard feature puts that particular switch port in the error-disabled state. The switch port is completely shut down, which prevents it from affecting the Spanning Tree topology.

The recommendation of enabling “portfast” and “BPDU guard” on the switch ports is valid only when customers connect non-switching/bridging devices to these ports. The switching/bridging devices can be hardware based physical boxes or servers running software based switching/bridging function. Customers should make sure that there is no switching/bridging function enabled on the ESXi hosts that are connected to the physical switch ports.

In the scenario where an ESXi host has a guest VM that is configured to perform a bridging function, the VM will generate BPDU frames and send them out to the VDS. The VDS then forwards the BPDU frames through the network adapter to the physical switch port. When the switch port configured with "BPDU guard" receives the BPDU frame, the switch disables the port and the VM loses connectivity. To avoid this network failure scenario while running a software bridging function on an ESXi host, customers should disable the "portfast" and "BPDU guard" configuration on the port and run the Spanning Tree Protocol.

In case customers are concerned about the security hacks that can generate BPDU frames, they should make use of the VMware vShield App security product that can block the frames and protect the virtual infrastructures from such layer 2 attacks. Please refer to vShield product documentation for more details on how to secure your vSphere virtual infrastructure. http://www.vmware.com/products/vshield/overview.html

Link Aggregation setup

Link Aggregation is used to increase throughput and improve resiliency by combining multiple network connections. There are various proprietary solutions in the market along with vendor-independent IEEE 802.3ad (LACP) standard based implementation. All solutions establish a logical channel between the two end points using multiple physical links. In the vSphere virtual infrastructure the two ends of the logical channel are virtual switch (VDS) and physical switch. These two switches have to be configured with link aggregation parameters before the logical channel is established. Currently, VDS supports static link aggregation configuration and does not provide support for dynamic LACP. When customers want to enable link aggregation on a physical switch, they should configure static link aggregation on the physical switch and select IP hash as NIC teaming on the VDS.

When establishing the logical channel with multiple physical links, customers should make sure that the Ethernet network adapter connections from the host are terminated on a single physical switch. However, if customers have deployed a clustered physical switch technology, then the Ethernet network adapter connections can be terminated on two different physical switches. Clustered physical switch technology is referred to by different names by networking vendors. For example, Cisco calls its switch clustering solution VSS (Virtual Switching System), while Brocade calls it VCS (Virtual Cluster Switching). Please refer to the networking vendor's guidelines and configuration details when deploying switch clustering technology.

Link State Tracking

Link state tracking is a feature available on Cisco switches to manage the link state of downstream ports (ports connected to servers) based on the status of upstream ports (ports connected to aggregation/core switches). When there is any failure on the upstream links connected to aggregation or core switches, the associated downstream link status goes down. The server connected on the downstream link is then able to detect the failure and re-route the traffic onto other working links. This feature thus provides protection from network failures caused by down upstream ports in non-mesh topologies. Unfortunately, this feature is not available on all vendors' switches, and even when it is available, it might not be referred to as link state tracking. Customers should talk to their switch vendors to find out whether a similar feature is supported on their switches.

Figure 2 below shows a resilient mesh topology on the left and a simple loop-free topology on the right. VMware highly recommends deploying the mesh topology shown on the left, which provides a highly reliable, redundant design and doesn't need the link state tracking feature. Customers who don't have high-end networking expertise and are also limited in the number of switch ports might prefer the deployment shown on the right. In this deployment, customers don't have to run the Spanning Tree Protocol because there are no loops in the network design. The downside of this simple design shows up when there is a failure on the link between the access and aggregation switches. In that failure scenario, the server will continue to send traffic on the same network adapter even though the access layer switch is dropping the traffic at the upstream interface. To avoid this black-holing of server traffic, customers can enable link state tracking on the virtual and physical switches and propagate any failure between the access and aggregation switch layer to the server through link state information.

Figure 2 Resilient loop and no-loop topologies

VDS has default network failover detection configuration set as “Link status only”. Customers should keep this configuration if they are enabling the link state-tracking feature on physical switches. If link state tracking capability is not available on physical switches, and there are no redundant paths available in the design, then customers can make use of Beacon Probing feature available on VDS. Beacon probing function is a software solution available on virtual switches for detecting link failures upstream from the access layer physical switch to the aggregation/core switches. Beacon probing is most useful with three or more uplinks in a team.

Maximum Transfer Unit (MTU)

Make sure that the Maximum Transfer Unit (MTU) configuration matches across the virtual and physical network switch infrastructure.
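A quick way to sanity-check MTU settings on a host is to compare what the vmkernel interfaces and the switch report, then test the path end to end. A sketch assuming jumbo frames (MTU 9000) and a placeholder peer address:

  # MTU per vmkernel interface (ESXi 5.0)
  esxcli network ip interface list
  # MTU configured on the distributed switch as seen by this host
  esxcli network vswitch dvs vmware list
  # Test a jumbo-frame path without fragmentation (10.0.0.2 is an example peer;
  # 8972 bytes of payload plus headers fills a 9000-byte frame)
  vmkping -d -s 8972 10.0.0.2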

 

After covering the important virtual and physical switch parameters and some recommended guidelines for each, we will take a look at the rack server deployments with multiple 1 Gigabit network adapters as well as two 10 Gigabit network adapters in the next blog entry.

 

Extended VMware Tools and Virtual Hardware Support in vSphere 5.0

Kyle Gleed, Sr. Technical Marketing Manager, VMware

I often get asked for advice on upgrading VMware Tools and Virtual Hardware following an ESXi host upgrade.  While the actual task of upgrading a VM is straightforward, the challenge comes because upgrading the tools/virtual hardware requires rebooting the VM, and with many VMs hosting production workloads there's a lot of sensitivity around VM downtime.

In the past, VMware recommended upgrading the VM tools/virtual hardware anytime you upgraded your ESXi host.  This meant following an ESXi host upgrade with a lot of VM upgrades and reboots.  However, starting with vSphere 5.0 this is no longer the case.  In 5.0, VMware has extended the tools/virtual hardware support matrix to allow VMs with older versions of tools/virtual hardware to run on newer ESXi hosts.  This allows customers to evaluate the added features and capabilities provided with the newer tools/virtual hardware versions, make an informed decision on whether or not to upgrade their VMs, and avoid unnecessary upgrades.

The table below shows the virtual hardware support matrix for vSphere 5.0.  As you can see, vSphere 5.0 supports VMs running virtual hardware versions 4, 7, and 8.  Additional information about virtual hardware support in 5.0 and how to upgrade a VM's virtual hardware is available in the vSphere Virtual Machine Administration Guide (page 85).

[Table: virtual hardware support matrix for vSphere 5.0]

In addition, the VMware Tools support matrix for each ESXi version is available from the VMware Product Interoperability Matrix.  Note that vSphere 5.0 supports VMs running VMware Tools version 4.0 and above, and that VMware Tools 5.0 is also supported on older ESX/ESXi hosts (4.0 and above).

[Table: VMware Tools support matrix]

How to configure ESXi to boot via Software iSCSI?

Introduction

VMware introduced support for iSCSI back in the ESX 3.x days. However, ESX could only boot from an iSCSI LUN if a hardware iSCSI adapter was used. Hosts could not boot via VMware's iSCSI driver using a NIC with special iSCSI capabilities.

It quickly became clear that there was a need for booting via Software iSCSI. VMware's partners are developing blade chassis containing blade servers, storage and network interconnects in a single rack. The blades are typically disk-less, and in many cases have iSCSI storage. The requirement is to have the blade servers boot off of an iSCSI LUN using NICs with iSCSI capabilities, rather than using dedicated hardware iSCSI initiators.

In ESXi 4.1, VMware introduced support for booting the host from an iSCSI LUN via the Software iSCSI adapter. Note that support was introduced for ESXi only, and not classic ESX.

Check that the NIC is supported for iSCSI Boot

Much of the configuration for booting via software iSCSI is done via the BIOS settings of the NICs and the host. Ensure that you are using a compatible NIC by checking the VMware HCL. This is important, but be aware: if you select a particular NIC and you see iSCSI listed as a feature, you might assume that you are good to go with using it to boot. This is not the case.

To see if a particular NIC is supported for iSCSI boot, you need to set the I/O Device Type to Network (not iSCSI) and then check the footnotes. If the footnotes state that iBFT is supported, then this card may be used for boot from iSCSI (I'll explain iBFT later). Yes, this is all rather cryptic and difficult to follow in my opinion. I'm going to see if I can get this changed internally to make it a little more intuitive.

Steps to configure BIOS for Software iSCSI Boot

Now that you have verified that your NIC is supported, let's move on to the configuration steps. The first step is to go into the BIOS of the NIC and ensure that it is enabled for iSCSI boot. Here is how one would do it on an HP DL series:

[Screenshot: HP DL series BIOS setting for iSCSI boot]
Similarly, here is how you would do this on a DELL PowerEdge R710:

[Screenshot: DELL PowerEdge R710 BIOS setting for iSCSI boot]

The next step is to get into the NIC configuration. In my testing I used a Broadcom NetXtreme NIC, which comes with a boot agent. Broadcom’s Multi-Boot Agent (MBA) software utility enables a host to execute a boot process using images from remote servers, including iSCSI targets. You access the MBA by typing <Control>S during the boot sequence:

[Screenshot: Broadcom MBA prompt during boot (<Control>S)]
 This takes us into the MBA Configuration Menu:

[Screenshot: MBA Configuration Menu]

Select iSCSI as the boot protocol. The key sequence CTRL-K will allow you to access the iSCSI  Configuration settings. If iSCSI isn’t available as a boot protocol, it may mean that the iSCSI firmware has not been installed, or that iSCSI has not been enabled on the NIC. There are a number of different parameters to configure. The main menu lists all available parameters.

[Screenshot: iSCSI configuration main menu]
First select General Parameters. In this example, I am going to use static IP information, so I need to set Disabled for the TCP/IP parameters via DHCP and iSCSI parameters via DHCP parameters.

When doing the initial install, Boot to iSCSI Target must also be left Disabled. You will need to change it to Enabled for subsequent boots. I'll tell you when later on in the post. You should therefore end up with settings similar to the following:

[Screenshot: General Parameters configuration]
Press <Esc> to exit the General Parameters Configuration Screen and then select Initiator Parameters. At the iSCSI Initiator Parameters Configuration screen, one would enter values for the IP Address, Subnet Mask, Default Gateway, Primary DNS, and Secondary DNS parameters as needed. If authentication is required then enter the CHAP ID (Challenge Handshake Authentication Protocol) and CHAP Secret parameters.

[Screenshot: iSCSI Initiator Parameters configuration]

Press <Esc> to return to the Main Menu and then select the 1st Target Parameters. Enter values for the Target IP Address, Target name, and Login information. The iSCSI Name corresponds to the iSCSI initiator name to be used by the client system. If authentication is required then enter the CHAP ID and CHAP Secret parameters. Note also that the Boot LUN ID (which LUN on the target we will use) is also selected here.

[Screenshot: 1st Target Parameters configuration]

Press <Esc> to return to the Main Menu and then press <Esc> again to display the Exit Configuration screen and then select Exit and Save the Configuration. That completes the BIOS configuration. We are now ready to install ESXi onto an iSCSI LUN via the software iSCSI initiator.

Steps to install ESXi onto an iSCSI LUN via Software iSCSI

After configuring the MBA parameters in the Broadcom NIC, you can now go ahead with the ESXi installation. The install media for ESXi is placed in the CDROM as per normal. The next step is to ensure that Boot Controller/device order is set in the BIOS. For Broadcom cards, the NIC should be before the CDROM in the boot order.

When the host is powered on, the system BIOS loads a NIC's OptionROM code and starts executing. The NIC's OptionROM contains bootcode and iSCSI initiator firmware. The iSCSI initiator firmware establishes an iSCSI session with the target.

On boot, a successful login to the target should be observed before installation starts. In this example, the iSCSI LUN is on a NetApp Filer. If you get a failure at this point, you need to revisit the configuration steps done previously. Note that this screen doesn't appear for very long:

[Screenshot: successful iSCSI login to the target]

The installation now begins.

As part of the install process, what could best be described as a memory-only VMkernel is loaded. This needs to discover suitable LUNs for installation, one of which is the iSCSI LUN. However, for the VMkernel's iSCSI driver to communicate with the target, it needs the TCP/IP protocol to be set up. This is all done as part of one of the start-up init scripts. The NIC's OptionROM is also responsible for handing off the initiator and target configuration data to the VMkernel. The hand-off protocol is called iBFT (iSCSI Boot Firmware Table). Once the required networking is set up, an iSCSI session is established to the target configured in the iBFT, and the LUNs beneath the target are discovered and registered with the VMkernel SCSI stack (PSA).
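Once the host is up, the iBFT contents that were handed off can be inspected from the CLI. The esxcfg-swiscsi command below is the one mentioned in the gotchas at the end of this post; the esxcli form is an assumed ESXi 5.0 equivalent:

  # Display the iBFT settings handed to the VMkernel
  esxcfg-swiscsi -b -q
  # Assumed ESXi 5.0 equivalent
  esxcli iscsi ibftboot get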

If everything is successful during the initial install, you will be offered the iSCSI LUN as a destination for the ESXi image, similar to the following:

[Screenshot: disk selection showing the iSCSI LUN]
You can now complete the ESXi installation as normal.

 Steps to Boot ESXi from an iSCSI LUN via Software iSCSI

Once the install has been completed, a single iSCSI Configuration change is required in the iSCSI Configuration General Parameter. The change is to set the 'Boot to iSCSI target' to Enabled.

[Screenshot: 'Boot to iSCSI target' set to Enabled]

 Now you can reboot the host and it should boot ESXi from the iSCSI LUN via the software iSCSI initiator.

Gotchas

  1. Make sure your NIC is on the HCL for iSCSI boot. Remember to check the footnotes for the NIC.
  2. Make sure that your device has a firmware version that supports iSCSI boot.
  3. Make sure that the iSCSI configuration settings for initiator and target are valid.
  4. Check the login screen to make sure your initiator can login to the target.
  5. Multipathing is not supported at boot, so ensure that the 1st target path is working.
  6. If you make changes to the physical network, these must be reflected in the iBFT.
  7. A new CLI command, esxcfg-swiscsi -b -q, displays the iBFT settings in the VMkernel.

 

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

 

Best Practice: How to correctly remove a LUN from an ESX host

Yes, at first glance, you may be forgiven for thinking that this subject hardly warrants a blog post. But for those of you who have suffered the consequences of an All Paths Down (APD) condition, you'll know why this is so important.

Let's recap what APD actually is.

APD occurs when there are no longer any active paths to a storage device from the ESX host, yet the host continues to try to access that device. When hostd tries to open a disk device, a number of commands such as read capacity and read requests to validate the partition table are sent. If the device is in APD, these commands will be retried until they time out. The problem is that hostd is responsible for a number of other tasks as well, not just opening devices. One such task is ESX to vCenter communication, and if hostd is blocked waiting for a device to open, it may not respond in a timely enough fashion to these other tasks. One consequence is that you might observe your ESX hosts disconnecting from vCenter.

We have made a number of improvements to how we handle APD conditions over the last few releases, but prevention is better than cure, so I wanted to use this post to highlight once again the best practices for removing a LUN from an ESX host and avoiding APD:

ESX/ESXi 4.1

Improvements in 4.1 mean that hostd now checks whether a VMFS datastore is accessible before issuing I/Os to it. This is an improvement, but it doesn't help with I/Os that are already in flight when an APD occurs. The best practices for removing a LUN from an ESX 4.1 host, as described in detail in KB 1029786, are as follows:

  1. Unregister all objects from the datastore including VMs and Templates
  2. Ensure that no 3rd party tools are accessing the datastore
  3. Ensure that no vSphere features, such as Storage I/O Control, are using the device
  4. Mask the LUN from the ESX host by creating new rules in the PSA (Pluggable Storage Architecture); a rough CLI sketch follows this list
  5. Physically unpresent the LUN from the ESX host using the appropriate array tools
  6. Rescan the SAN
  7. Clean up the rules created earlier to mask the LUN
  8. Unclaim any paths left over after the LUN has been removed

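To give a feel for steps 4, 7 and 8, here is a rough sketch of the masking and unclaiming commands on a 4.1 host. The rule number, the vmhba path values and the device ID are placeholders for illustration only, and the exact option names may differ slightly on your build, so treat KB 1029786 as the authoritative reference.

# esxcli corestorage claimrule add -r 192 -t location -A vmhba2 -C 0 -T 1 -L 20 -P MASK_PATH
# esxcli corestorage claimrule load
# esxcli corestorage claiming reclaim -d <naa.id of the LUN>

Then, once the LUN has been unpresented on the array and the SAN rescanned (steps 5 and 6), clean up:

# esxcli corestorage claimrule remove -r 192
# esxcli corestorage claimrule load
# esxcli corestorage claiming unclaim -t location -A vmhba2 -C 0 -T 1 -L 20
# esxcli corestorage claimrule run
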
Now this is a rather complex set of instructions to follow. Fortunately, we have made things a little easier with 5.0.

ESXi 5.0

The first thing to mention in 5.0 is that we have introduced a new Permanent Device Loss (PDL) condition, which can help alleviate some of the conditions that previously caused APD. But you could still run into APD if you don't correctly remove a LUN from the ESX host. There are details in the post about the enhancements made in the UI and the CLI to make the removal of a LUN easier, but there are KB articles that go into even greater detail.

To avoid the rather complex set of instructions that you needed to follow in 4.1, VMware introduced new detach and unmount operations to the vSphere UI & the CLI.

As per KB 2004605, to avoid an APD condition in 5.0, all you need to do now is detach the device from the ESX host. This will automatically unmount the VMFS volume first. If there are objects still using the datastore, you will be informed. You no longer have to mess about creating and deleting rules in the PSA to do this safely. The steps now are:

  1. Unregister all objects from the datastore including VMs and Templates
  2. Ensure that no 3rd party tools are accessing the datastore
  3. Ensure that no vSphere features, such as Storage I/O Control or Storage DRS, are using the device
  4. Detach the device from the ESX host; this will also initiate an unmount operation (see the CLI sketch after this list)
  5. Physically unpresent the LUN from the ESX host using the appropriate array tools
  6. Rescan the SAN

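The unmount and detach operations can also be driven from the 5.0 command line. A minimal sketch, using a placeholder datastore name and device ID (the two list commands are only there to find the right datastore label and naa identifier):

# esxcli storage filesystem list
# esxcli storage filesystem unmount -l MyDatastore
# esxcli storage core device list
# esxcli storage core device set --state=off -d naa.xxxxxxxxxxxxxxxx
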
This KB article is very good since it also tells you which features (Storage DRS, Storage I/O Control, etc.) may prevent a successful unmount and detach.

Please pay particular attention to these KB articles if/when you need to unpresent a LUN from an ESX host.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

Host running ESXi 4.0 Update 2 may crash after vCenter Server is Upgraded to 5.0

Kyle Gleed, Sr. Technical Marketing Manager, VMware

Heads up for folks running ESXi 4.0 Update 2.  You will need to apply ESXi 4.0 Update 3 prior to upgrading vCenter to 5.0.  For more info check out KB article 2007269.

Linked Clones Part 2 – Desktop Provisioning in VMware View 5.0

As the title suggests, this is the second blog post on how VMware is using linked clones in its products. The first blog, found here, covers how linked clones are used for fast provisioning in vCloud Director 1.5. This post will focus on how linked clones are used with provisioning desktops in VMware View 5.0, VMware's VDI product.

To recap, a linked clone is a duplicate of a virtual machine that uses the same base disk as the original, with a chain of delta disks to track the differences between the original and the clone.

Before you can begin to use linked clones in VMware View 5.0, you must first identify a Virtual Machine running an appropriate Guest OS (e.g. Windows 7 or XP) and install the VMware View Agent on that VM (the steps are beyond the scope of this post). You then need to power down the VM and create a VM snapshot. If you recollect from the previous post, only VMs which have snapshots can be used for linked clones. In this example, I have a Windows XP VM which I will be using for the desktops that View will deploy.
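
If you would rather script the shutdown and snapshot from the ESXi shell than click through the vSphere Client, something along these lines should do it. This is only a sketch: the VM ID of 42, the snapshot name and the description are placeholders, and power.shutdown relies on VMware Tools being installed in the guest.

# vim-cmd vmsvc/getallvms
# vim-cmd vmsvc/power.shutdown 42
# vim-cmd vmsvc/snapshot.create 42 base-snapshot "Clean base for View linked clones"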

So far, there is nothing unusual about the disks or snapshots associated with this VM. In the VM home folder there are two virtual disks: the original base disk (COR-XP-PRO.vmdk with its -flat data file) and the snapshot delta which I just created (COR-XP-PRO-000001.vmdk with its -delta file):

 # ls
COR-XP-PRO-000001-delta.vmdk
COR-XP-PRO-000001.vmdk
COR-XP-PRO-Snapshot1.vmsn
COR-XP-PRO-flat.vmdk
COR-XP-PRO.nvram
COR-XP-PRO.vmdk
COR-XP-PRO.vmsd
COR-XP-PRO.vmx
COR-XP-PRO.vmxf
vmware-1.log
vmware-2.log
vmware.log

Let's look at the metadata of the base disk. It appears that it is a thick-provisioned VMDK:

# cat COR-XP-PRO.vmdk
# Disk DescriptorFile
version=1
encoding="UTF-8"
CID=ceddb0f6
parentCID=ffffffff
isNativeSnapshot="no"
createType="vmfs"

# Extent description
RW 41943040 VMFS "COR-XP-PRO-flat.vmdk"

# The Disk Data Base
#DDB

ddb.deletable = "true"
ddb.toolsVersion = "8384"
ddb.virtualHWVersion = "8"
ddb.longContentID = "caae51e530f31cb41e9d0bc0ceddb0f6"
ddb.uuid = "60 00 C2 96 98 85 58 13-6b 06 53 2d a9 54 70 ae"
ddb.geometry.cylinders = "16383"
ddb.geometry.heads = "16"
ddb.geometry.sectors = "63"
ddb.adapterType = "ide"

While we're at it, we may as well check the snapshot. Nothing out of the ordinary either:

# cat COR-XP-PRO-000001.vmdk
# Disk DescriptorFile
version=1
encoding="UTF-8"
CID=ceddb0f6
parentCID=ceddb0f6
isNativeSnapshot="no"
createType="vmfsSparse"
parentFileNameHint="COR-XP-PRO.vmdk"
# Extent description
RW 41943040 VMFSSPARSE "COR-XP-PRO-000001-delta.vmdk"

# The Disk Data Base
#DDB

ddb.longContentID = "caae51e530f31cb41e9d0bc0ceddb0f6"
#

With the snapshot now taken, the next step is to create a Desktop Pool in VMware View Administrator. I'm sure you can appreciate that there are a considerable number of additional configuration steps required to get to this point, but these are beyond the scope of this post as I simply want to show you how linked clones are used.

In VMware View Administrator, select the Pools object under Inventory and then click the 'Add' button to launch the Add Pool wizard:

When you get to the vCenter Server part of the Pool Definition, there is an option to select Full Virtual Machines or View Composer linked clones. Obviously View Composer (another component of the VMware View product) needs to be installed before you can select this option.

Later on in the Add Pool wizard, you will need to select a base desktop image for the parent VM, and this is where you would select the pre-prepared VM with the snapshot that was previously set up:

After selecting the VM, you will then be prompted to select a snapshot associated with that VM (again, we set this up earlier on):

Once all the steps have been completed and the pool is created, a new task begins in vCenter which clones the desktop base disk to a replica VM:

Once the replica is created, the creation of the XP desktops can begin. In my setup I asked View to make sure that my VMs are always powered on – this is why you see them deploying automatically. This speeds up end-user access to the VM:

The desktops are in fact made up of a number of different disks, but the disk used for the Guest OS is a linked clone created from the replica VM. Now you might ask why we bother cloning our original base disk into this replica disk before deploying the linked clones. The reason is simple: the replica is a thin-provisioned version of the original base disk, so this step guarantees space savings on disk.

Let's take a look at this replica VM, and then one of my desktop VMs, first via the UI and then via the CLI. First thing I notice is that my replica VM has a snapshot:

If you remember, we need to make sure the base disk has a snapshot in order to create linked clones. If I look at my desktop VM, it also has a snapshot associated with it.

Why does my desktop VM need a snapshot? Well, View is quite powerful in what it can do with these desktops, and a snapshot is taken so that certain operations can be performed which involve reverting the desktop's OS disk rather than redeploying a new one.

Let's first look at the desktop VM's disks:

# ls *.vmdk
XP-Desktop-01-000001-delta.vmdk
XP-Desktop-01-000001.vmdk
XP-Desktop-01-delta.vmdk
XP-Desktop-01-vdm-disposable-e12b58f4-a196-4dbf-874b-607d6d8a51a6-flat.vmdk
XP-Desktop-01-vdm-disposable-e12b58f4-a196-4dbf-874b-607d6d8a51a6.vmdk
XP-Desktop-01.vmdk
XP-Desktop-011-internal-flat.vmdk
XP-Desktop-011-internal.vmdk

It appears that we have a delta disk (the linked clone), a snapshot of the delta disk (the vdm-initial-checkpoint snapshot observed previously), an internal disk, and a disposable disk. The disposable disk contains temporary files, such as Windows temp files, that are deleted when the virtual desktop is powered off. The internal disk stores the computer account password to ensure connectivity to the domain when a desktop is refreshed. Additionally, the configuration for QuickPrep and Sysprep is stored on this disk.

The delta disk (our linked clone) is where the Guest OS resides. Let's examine that in more detail:

# cat XP-Desktop-01.vmdk
# Disk DescriptorFile
version=1
encoding="UTF-8"
CID=4ac637ed
parentCID=ceddb0f6
isNativeSnapshot="no"
createType="vmfsSparse"
parentFileNameHint="/vmfs/volumes/7440912b-553c8e52/replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f/replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f.vmdk"
# Extent description
RW 41943040 VMFSSPARSE "XP-Desktop-01-delta.vmdk"

# The Disk Data Base
#DDB

ddb.longContentID = "5d89a2289098fd1e98a7f74c4ac637ed"

As can be clearly seen, this linked clone points back to the replica VMDK, which is a clone of our original base disk. Let's just check the metadata file of that replica disk as a final step:

# cat replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f.vmdk
# Disk DescriptorFile
version=1
encoding="UTF-8"
CID=ceddb0f6
parentCID=ffffffff
isNativeSnapshot="no"
createType="vmfs"

# Extent description
RW 41943040 VMFS "replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f-flat.vmdk"

# The Disk Data Base
#DDB

ddb.adapterType = "ide"
ddb.thinProvisioned = "1"
ddb.geometry.sectors = "63"
ddb.geometry.heads = "16"
ddb.geometry.cylinders = "16383"
ddb.uuid = "60 00 C2 93 22 40 a1 6d-1a 9d ca a9 4b a1 26 ff"
ddb.longContentID = "caae51e530f31cb41e9d0bc0ceddb0f6"
ddb.deletable = "false"
ddb.toolsVersion = "8384"
ddb.virtualHWVersion = "8"

If you compare this to the original XP VM that I used for my base disk, you'll see that this replica is identical in almost every characteristic; the key difference is that the replica is thin provisioned (ddb.thinProvisioned = "1") so that it is space efficient.
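
If you want to verify the space saving yourself, the ESXi shell makes for a quick sanity check: ls -l reports the full logical size of the replica's -flat file, while du -h reports only the blocks actually allocated on the VMFS volume, which should be considerably smaller for the thin-provisioned replica.

# ls -l /vmfs/volumes/7440912b-553c8e52/replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f/replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f-flat.vmdk
# du -h /vmfs/volumes/7440912b-553c8e52/replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f/replica-c6e388c7-1a23-41d3-a264-0292fe29eb1f-flat.vmdk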

That concludes the post. Hopefully you can now see how linked clones are used in VMware View to provide faster deployment and greater space efficiency than standard cloning mechanisms.

Obviously a number of VMware View fundamentals were glossed over in this post in order to focus on the linked clones. If you want to know more about the various operations that can be performed, such as refresh/recompose, this KB article does a pretty decent job. For more information about the various disks (internal, disposable, etc.), this blog post does an excellent job. Finally, if you are considering deploying VMware View in your own lab for a proof of concept, this blog post provides super step-by-step instructions.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage