vSphere 5.1 – VDS New Features – Network Health Check

This is the second post in the VDS – New Features series. In this one I will talk about the Network Health Check feature: what it is and how it works. Let’s first take a look at some operational challenges that vSphere and network administrators face when it comes to network configuration in the vSphere environment.

Configuring a virtual network involves setting parameters on the port groups of a virtual switch. As a vSphere administrator, you make sure that the configuration you apply to port groups matches the physical switch configuration. However, this process doesn’t always go smoothly, either because of typing errors or because multiple people are involved in the configuration, especially when different teams manage the virtual and physical switch configurations.

Our support team has received many customer calls related to misconfiguration of VLAN, teaming and MTU parameters across the virtual and physical switches. These types of misconfigurations are very difficult to troubleshoot unless you painstakingly go through each parameter manually.

To address this operational challenge, the vSphere 5.1 release introduces the Network Health Check feature. It detects misconfiguration of VLAN, MTU and teaming parameters between the virtual switch and the first-hop physical switch (the access layer switch). The feature is disabled by default and can be enabled only through the new vCenter Server web client, as shown below. Go to the Manage tab and select Health Check. Then click Edit and enable the VLAN and MTU check, the Teaming and Failover check, or both.

Network Health Check Configuration

After the feature is enabled, Layer-2 Ethernet packets are exchanged across the host uplinks at regular intervals to detect any misconfiguration. The default Network Health Check interval is one minute, but it can be changed through the vSphere API.

The following are the requirements for this feature to operate.

-       At least two uplinks must be configured on the VDS for the VLAN and MTU check to work.

-       In addition to the two-uplink minimum, there must be at least two hosts on the VDS for the teaming check to work.
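
As a quick illustration of these two requirements, here is a minimal Python sketch that decides which checks can run for a given VDS. The `VdsSummary` type, its field names and the check labels are hypothetical and exist only for this example; they are not part of the vSphere API.

```python
from dataclasses import dataclass


@dataclass
class VdsSummary:
    """Hypothetical summary of a VDS (not a vSphere API type)."""
    uplink_count: int  # uplinks configured on the VDS
    host_count: int    # hosts attached to the VDS


def runnable_checks(vds: VdsSummary) -> list:
    """Return the health checks whose stated requirements are met."""
    checks = []
    if vds.uplink_count >= 2:
        # VLAN and MTU check needs at least two uplinks.
        checks.append("vlan_mtu")
        if vds.host_count >= 2:
            # Teaming check additionally needs at least two hosts.
            checks.append("teaming")
    return checks
```

For example, a VDS with two uplinks but a single host would qualify only for the VLAN and MTU check.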

Let’s now look at how these parameters are checked against the physical switch configuration. First, the figure below shows the format of the Req and Ack Layer-2 Ethernet packets. These are the packets exchanged across the host uplinks.

Packet Formats

The Request packet is a broadcast packet, which includes information about the source and destination Host ID, VDS ID and Uplink Port #. On the other hand, the response packet is a unicast packet with similar host, VDS and Port ID information.
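
To make the field list above concrete, here is a rough Python model of the two packet types. It only mirrors the fields named in the text; the class name, field names, field order and the sample identifiers are assumptions for illustration and do not describe the actual on-wire layout.

```python
from dataclasses import dataclass

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


@dataclass(frozen=True)
class HealthCheckFrame:
    """Illustrative model only; names and layout are assumed."""
    kind: str         # "REQ" or "ACK"
    dst_mac: str      # broadcast for a request, unicast for an ack
    src_mac: str
    src_host_id: str
    dst_host_id: str
    vds_id: str
    uplink_port: int


def make_request(src_mac, src_host_id, dst_host_id, vds_id, uplink_port):
    """Build a broadcast request carrying host, VDS and uplink port IDs."""
    return HealthCheckFrame("REQ", BROADCAST_MAC, src_mac,
                            src_host_id, dst_host_id, vds_id, uplink_port)


def make_ack(req, responder_mac, uplink_port):
    """Build the unicast ack: addressed back to the requester's MAC."""
    return HealthCheckFrame("ACK", req.src_mac, responder_mac,
                            req.dst_host_id, req.src_host_id,
                            req.vds_id, uplink_port)


req = make_request("00:50:56:aa:00:01", "host-1", "host-2", "vds-1", 1)
ack = make_ack(req, "00:50:56:aa:00:02", 2)
```

The point of the sketch is the addressing difference: the request goes to the broadcast address, while the ack is sent straight back to the requester.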

To explain the exchange of packets, we will take an example deployment of two hosts and one VDS and walk through how a VLAN and MTU check is performed. As shown in the figure below, the two hosts have two uplinks each, connected to the same access layer physical switch. On the VDS, an orange port group is configured with VLAN 20 and the VDS-level MTU is 9000. The physical switch ports are configured with the same parameters, so this is a proper configuration scenario.

Example 1 – Proper Configuration

The following is the sequence of packet flow during the VLAN and MTU check process.

-       Host 1 sends a broadcast Request frame (9k size) (red circle).

-       That packet is broadcast to the other ports of the physical switch that are configured with VLAN 20. Since all switch ports are configured with the proper VLAN in this case, the packet reaches Host 2.

-       Host 2 then responds with a unicast Ack packet (green circle) towards Host 1. The VDS reports that the configuration is good.

Now let’s take an example where the configuration across the virtual and physical switches does not match. The diagram below shows that the port group is configured with VLAN 20 and an MTU of 9000, while on the physical switch the port configuration is VLAN 10 and MTU 1500.

Example 2 – VLAN and MTU Mis-Configuration

The following is the sequence of packet flow during the VLAN and MTU check process.

-       Host 1 sends a broadcast Request frame (9k size) (red circle).

-       That packet is not forwarded to the other ports of the physical switch because only VLAN 10 traffic is accepted on the switch ports. Thus the Req packet doesn’t reach Host 2.

-       Host 1 waits for the Ack. If no Ack is received, something is wrong in the configuration. It could be a jumbo frame (MTU) or a VLAN configuration error.

-       To find out whether the VLAN configuration is wrong, another Req packet is generated with a small frame size.

-       If there is still no response, the VDS reports that both the VLAN and MTU configurations are mismatched.
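
The two-probe logic in the steps above can be sketched as a toy Python model. This is not how the VDS is implemented: the `SwitchPort` class, the 64-byte size of the small probe, and the `"mtu mismatch"` verdict for the case where only the small probe is answered are assumptions made for this illustration.

```python
from dataclasses import dataclass

SMALL_FRAME = 64  # bytes; assumed size of the follow-up small probe


@dataclass
class SwitchPort:
    """Toy model of a physical switch port (hypothetical)."""
    vlans: frozenset  # VLAN IDs accepted on the port
    mtu: int          # largest frame size the port forwards

    def forwards(self, vlan: int, size: int) -> bool:
        return vlan in self.vlans and size <= self.mtu


def ack_received(path, vlan: int, size: int) -> bool:
    """An Ack comes back only if every port on the path forwards the Req."""
    return all(port.forwards(vlan, size) for port in path)


def vlan_mtu_check(path, vlan: int, vds_mtu: int) -> str:
    # Step 1: full-size request at the VDS MTU (9k in the examples).
    if ack_received(path, vlan, vds_mtu):
        return "ok"
    # Step 2: retry with a small frame to separate VLAN from MTU problems.
    if ack_received(path, vlan, SMALL_FRAME):
        return "mtu mismatch"  # assumed verdict; not stated in the text
    # Step 3: still no Ack, reported as both VLAN and MTU mismatch.
    return "vlan and mtu mismatch"


good = [SwitchPort(frozenset({20}), 9000), SwitchPort(frozenset({20}), 9000)]
bad = [SwitchPort(frozenset({10}), 1500), SwitchPort(frozenset({10}), 1500)]
print(vlan_mtu_check(good, vlan=20, vds_mtu=9000))
print(vlan_mtu_check(bad, vlan=20, vds_mtu=9000))
```

Here `good` corresponds to Example 1 and `bad` to Example 2. A path whose ports accept VLAN 20 but only forward 1500-byte frames would answer the small probe alone.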

Checking the teaming configuration is a little more involved, but this example should give you an idea of how Network Health Check detects these parameter misconfigurations.

Please let me know if you have any specific questions on this feature.

Get notified of these blog posts and more VMware networking information by following me on Twitter: @VMWNetworking

7 thoughts on “vSphere 5.1 – VDS New Features – Network Health Check”

  1. Pingback: Configuring VDS Network Health Check Interval Using vSphere API | VMware vSphere Blog - VMware Blogs

  2. Pingback: vSphere 5.1 – VDS New Features – Link Aggregation Control Protocol (LACP) | VMware vSphere Blog - VMware Blogs

  3. Pingback: vSphere 5.1 – VDS New Features – Rollback Recovery and Backup Restore | VMware vSphere Blog - VMware Blogs

  4. Chirag Patel

    Good article. I am assuming that the Network Health Check will also catch NIC teaming problems where the link is up on the NIC but the upstream switches have problems and are not able to switch/route traffic. Would that be correct?

    1. Vyenkatesh Deshpande Post author

      Network Health Check can be used only to detect configuration issues with directly connected switches (Access switches). If there is any failure between Access and Aggregation layer that won’t be detected by NHC. There is another feature called beacon probing that can be used to detect the issue you are describing.

  5. Dave

    I had two problems with this health check:
    1. On several of the parameters that the health check reports on there are ?, not sure why the health check cannot discern certain information
    2. Running ESXTOP I noticed dropped packets on all of my VMs. The column %DRPRX shows all zeros, then all VMs show some percentage drop, then back to all zeros. After troubleshooting this with VMware support I stumbled upon the solution. As soon as I disabled the VDS Health Check the dropped packets disappeared. It appears the broadcast packets sent out by the VDS health check were being dropped by my VMs.

    Any ideas on either of these problems?
