Rack Server with Two 10 Gigabit Ethernet network adapters
The two 10 Gigabit Ethernet network adapter deployment model is becoming very common because of the benefits it provides through I/O consolidation. The key benefits include better utilization of I/O resources, simplified management, and reduced CAPEX and OPEX. While this deployment provides these benefits, there are some challenges when it comes to traffic management. Especially in highly consolidated virtualized environments, where more traffic types are carried over fewer 10 Gigabit Ethernet network adapters, it becomes critical to prioritize the important traffic types and provide the required SLA guarantees. The NIOC feature available on the VDS helps with this traffic management activity. The following sections show how to utilize this feature in the different designs.
As shown in Figure 1, the rack servers with two 10 Gigabit Ethernet network adapters are connected to two access layer switches to avoid any single point of failure. As in the Rack server with eight 1 Gigabit Ethernet network adapters section, the different VDS and physical switch parameter configurations are taken into account in this design. On the physical switch side, newer 10 Gigabit switches might support FCoE, which allows convergence of SAN and LAN traffic. This document covers only standard 10 Gigabit deployments that carry IP storage traffic (iSCSI/NFS), not FCoE.
In this section two design options are described: one is a traditional approach and the other is the VMware-recommended approach.
Figure 1: Rack server with two 10 Gigabit Ethernet NICs
Design Option 1 – Static Configuration
The static configuration approach for a rack server deployment with two 10 Gigabit Ethernet network adapters is similar to the one described in design option 1 of the rack server deployment with eight 1 Gigabit Ethernet adapters. The few differences are that the number of dvuplinks changes from eight to two and that the dvportgroup parameters are different. Let's take a look at the configuration details on the VDS front.
dvuplink configuration
To support the maximum of two 10 Gigabit Ethernet network adapters per host, the dvuplink port group is configured with two dvuplinks (dvuplink1, dvuplink2). On each host, dvuplink1 is associated with vmnic0 and dvuplink2 is associated with vmnic1.
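For reference, the dvuplink naming and the host NIC association can be expressed programmatically. The following is a minimal pyVmomi sketch under the assumption that `dvs` (the VDS) and `host` (the ESXi host) have already been retrieved from a connected vCenter session; treat it as an illustration of the mapping rather than a production script.

```python
from pyVmomi import vim

# Assumption: 'dvs' is an existing vim.dvs.VmwareDistributedVirtualSwitch and
# 'host' is a vim.HostSystem, both looked up from a connected service instance.
dvs_spec = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec()
dvs_spec.configVersion = dvs.config.configVersion

# Two dvuplinks for the two 10 Gigabit Ethernet adapters.
uplink_policy = vim.DistributedVirtualSwitch.NameArrayUplinkPortPolicy()
uplink_policy.uplinkPortName = ["dvuplink1", "dvuplink2"]
dvs_spec.uplinkPortPolicy = uplink_policy

# Add the host and hand vmnic0/vmnic1 to the VDS; the physical NICs are bound
# to free uplink ports in listed order (vmnic0 -> dvuplink1, vmnic1 -> dvuplink2).
host_member = vim.dvs.HostMember.ConfigSpec()
host_member.operation = "add"
host_member.host = host
backing = vim.dvs.HostMember.PnicBacking()
backing.pnicSpec = [
    vim.dvs.HostMember.PnicSpec(pnicDevice="vmnic0"),
    vim.dvs.HostMember.PnicSpec(pnicDevice="vmnic1"),
]
host_member.backing = backing
dvs_spec.host = [host_member]

task = dvs.ReconfigureDvs_Task(spec=dvs_spec)
```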
dvportgroups configuration
As described in Table 1, five different dvportgroups are configured for the five different traffic types. For example, dvportgroup PG-A is created for the management traffic type. Following are the other key configurations of dvportgroup PG-A:
- Teaming Option: Explicit failover order provides a deterministic way of directing traffic to a particular uplink. By selecting dvuplink1 as the active uplink and dvuplink2 as the standby uplink, the management traffic is carried over dvuplink1 unless dvuplink1 fails. It is also recommended to set the failback option to "No" to avoid flapping of traffic between the two NICs. The failback option determines how a physical adapter is returned to active duty after recovering from a failure. If failback is set to "No", a failed adapter is left inactive even after recovery, until another currently active adapter fails and requires its replacement. A configuration sketch for this dvportgroup follows this list.
- VMware recommends isolating all traffic types from each other by defining a separate VLAN for each dvportgroup.
- There are various other parameters that are part of the dvportgroup configuration. Customers can choose to configure these parameters based on their environment needs.
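The teaming settings above can also be applied through the vSphere API. The following is a minimal pyVmomi sketch for dvportgroup PG-A, assuming `pg` is the existing dvportgroup object retrieved from a connected session; note that the API expresses failback "No" through the rollingOrder flag.

```python
from pyVmomi import vim

# Teaming policy for PG-A: explicit failover order, dvuplink1 active,
# dvuplink2 standby, failback disabled (rollingOrder=True in the API).
teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
teaming.policy = vim.StringPolicy(value="failover_explicit")
teaming.uplinkPortOrder = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy(
    activeUplinkPort=["dvuplink1"],
    standbyUplinkPort=["dvuplink2"],
)
teaming.rollingOrder = vim.BoolPolicy(value=True)   # failback = "No"

port_config = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
port_config.uplinkTeamingPolicy = teaming

pg_spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
pg_spec.configVersion = pg.config.configVersion     # 'pg' is dvportgroup PG-A
pg_spec.defaultPortConfig = port_config

task = pg.ReconfigureDVPortgroup_Task(spec=pg_spec)
```

The same pattern applies to PG-B through PG-E, with the active/standby uplinks swapped according to Table 1.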
Table 1 below provides the configuration details for all the dvportgroups. According to this configuration, dvuplink1 carries Management, iSCSI, and Virtual Machine traffic, while dvuplink2 handles vMotion, FT, and Virtual Machine traffic. As you can see, the Virtual Machine traffic type makes use of both uplinks, and these uplinks are utilized through the load-based teaming (LBT) algorithm.
With this deterministic teaming policy, customers can decide to map the different traffic types to the available uplink ports depending on the needs of their environment. For example, if iSCSI traffic needs higher bandwidth and the other traffic types have relatively low bandwidth requirements, customers can decide to keep only iSCSI traffic on dvuplink1 and move all other traffic to dvuplink2. When deciding on these traffic paths, customers should understand the physical network connectivity and the bandwidth capacity of the paths.
Physical switch configuration
The external physical switches, to which the rack servers' network adapters are connected, are configured as trunks with all the appropriate VLANs enabled. As described in the physical network switch parameters section, the following switch configurations are performed based on the VDS setup described in Table 1:
- Enable STP with PortFast mode and BPDU guard on the trunk ports facing the ESXi hosts.
- The teaming configuration on the VDS is static, so no link aggregation is configured on the physical switches.
- Because of the mesh topology deployment shown in Figure 1, the link state tracking feature is not required on the physical switches.
Table 1 Static design configuration
| Traffic Type    | Port Group | Teaming Option    | Active Uplink         | Standby Uplink | Unused Uplink |
|-----------------|------------|-------------------|-----------------------|----------------|---------------|
| Management      | PG-A       | Explicit Failover | dvuplink1             | dvuplink2      | None          |
| vMotion         | PG-B       | Explicit Failover | dvuplink2             | dvuplink1      | None          |
| FT              | PG-C       | Explicit Failover | dvuplink2             | dvuplink1      | None          |
| iSCSI           | PG-D       | Explicit Failover | dvuplink1             | dvuplink2      | None          |
| Virtual Machine | PG-E       | LBT               | dvuplink1 / dvuplink2 | None           | None          |
This static design option provides flexibility in the traffic path configuration, but it cannot protect against one traffic type dominating others. For example, a network-intensive vMotion operation could take most of the network bandwidth and impact virtual machine traffic. Bi-directional traffic shaping parameters at the dvportgroup and port level can provide some help in managing the different traffic rates. However, using this approach for traffic management requires customers to limit the traffic on the respective dvportgroups. Limiting traffic in this way puts a hard cap on the traffic types even when bandwidth is available to utilize. This underutilization of I/O resources due to hard limits is overcome through the NIOC feature, which provides flexible traffic management based on the shares parameter. Design option 2, described below, is based on the NIOC feature.
Design Option 2 – Dynamic Configuration with NIOC and LBT
This dynamic design option is the VMware recommended approach that takes advantage of the NIOC and LBT features of the VDS.
The connectivity to the physical network infrastructure remains the same as described in design option 1. However, instead of allocating specific dvuplinks to individual traffic types, the ESXi platform utilizes those dvuplinks dynamically. To illustrate this dynamic design, the bandwidth utilization of each virtual infrastructure traffic type is estimated. In a real deployment, customers should first monitor the virtual infrastructure traffic over a period of time to gauge the bandwidth utilization, and then come up with the bandwidth numbers.
Following are some bandwidth numbers estimated per traffic type:
- Management Traffic (< 1 Gig)
- vMotion (2 Gig)
- FT (1 Gig)
- iSCSI (2 Gig)
- Virtual Machine (2 Gig)
These bandwidth estimates are different from the ones considered for the rack server deployment with eight 1 Gigabit Ethernet network adapters. Let's take a look at the VDS parameter configurations for this design. The dvuplink port group configuration remains the same, with two dvuplinks created for the two 10 Gigabit Ethernet network adapters. The dvportgroup configuration is as follows.
dvportgroups configuration
In this design all dvuplinks are active, and there are no standby or unused uplinks, as shown in Table 2. All dvuplinks are thus available for use by the teaming algorithm. Following are the key configurations of dvportgroup PG-A:
- Teaming Option: Load-based teaming is selected as the teaming algorithm. With the LBT configuration, the management traffic is initially scheduled based on the virtual port ID hash, and the hash output determines which dvuplink the management traffic is sent out over. Other traffic types in the virtual infrastructure can also be scheduled on the same dvuplink. Subsequently, if the utilization of that uplink goes beyond the 75 percent threshold, the LBT algorithm is invoked and some of the traffic is moved to other, underutilized dvuplinks. It is possible that the management traffic will be moved to another dvuplink when such an event occurs. A configuration sketch follows this list.
- There are no standby dvuplinks in this configuration so the failback setting is not applicable for this design approach. The default setting for this failback option is “Yes”.
- VMware recommends isolating all traffic types from each other by defining a separate VLAN for each dvportgroup.
- There are several other parameters that are part of the dvportgroup configuration. Customers can choose to configure these parameters based on their environment needs.
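Compared with the explicit failover sketch shown for design option 1, only the teaming policy value and the uplink order change. The following pyVmomi fragment sketches that difference, again assuming the dvportgroup objects are already at hand:

```python
from pyVmomi import vim

# Load-based teaming ("Route based on physical NIC load"): both dvuplinks
# active, no standby uplinks, so the failback setting is not relevant here.
teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
teaming.policy = vim.StringPolicy(value="loadbalance_loadbased")
teaming.uplinkPortOrder = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy(
    activeUplinkPort=["dvuplink1", "dvuplink2"],
    standbyUplinkPort=[],
)
```

The rest of the dvportgroup reconfiguration (the port config policy and the ReconfigureDVPortgroup_Task call) is the same as in the earlier sketch.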
As you follow the dvportgroup configuration in Table 2, you can see that each traffic type has all the dvuplinks active, and these uplinks are utilized through the load-based teaming (LBT) algorithm. Let's take a look at the NIOC configuration.
The Network I/O Control (NIOC) configuration in this design not only helps provide the appropriate I/O resources to the different traffic types but also provides SLA guarantees by preventing one traffic type from dominating others.
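NIOC itself is enabled at the VDS level before the share values take effect. The following is a minimal pyVmomi sketch, assuming `dvs` refers to the VDS object; it enables NIOC and lists the built-in network resource pools whose share values are then adjusted (through UpdateNetworkResourcePool) to match Table 2.

```python
# Assumption: 'dvs' is the vim.dvs.VmwareDistributedVirtualSwitch object
# retrieved from a connected vCenter session.

# Turn on Network I/O Control on the VDS.
dvs.EnableNetworkResourceManagement(enable=True)

# Inspect the system network resource pools and their current share values.
# The shares are changed with dvs.UpdateNetworkResourcePool(), passing config
# specs keyed by the pool keys printed here.
for pool in dvs.networkResourcePool:
    alloc = pool.allocationInfo
    print(f"{pool.key}: shares={alloc.shares.shares} "
          f"level={alloc.shares.level} limit={alloc.limit}")
```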
Based on the bandwidth assumptions made for the different traffic types, the shares parameters are configured as shown in the NIOC Shares column of Table 2. To illustrate how share values translate to bandwidth numbers in this deployment, let's take the example of a 10 Gigabit dvuplink carrying all five traffic types. This is a worst-case scenario in which all traffic types are mapped to one dvuplink. It will never happen when customers enable the LBT feature, because LBT moves traffic based on uplink utilization. This example shows how much bandwidth each traffic type is allowed on one dvuplink during a contention or oversubscription scenario when LBT is not enabled:
- Management: 5 shares; (5/75) * 10 Gigabit = 667 Mbps
- vMotion: 20 shares; (20/75) * 10 Gigabit = 2.67 Gbps
- FT: 10 shares; (10/75) * 10 Gigabit = 1.33 Gbps
- iSCSI: 20 shares; (20/75) * 10 Gigabit = 2.67 Gbps
- Virtual Machine: 20 shares; (20/75) * 10 Gigabit = 2.67 Gbps
- Total shares: 5 + 20 + 10 + 20 + 20 = 75
As you can see, for each traffic type the percentage of bandwidth is first calculated by dividing the share value by the total number of available shares (75), and then the total bandwidth of the dvuplink (10 Gigabit) is used to calculate the bandwidth share for that traffic type. For example, the 20 shares allocated to vMotion traffic translate to 2.67 Gbps of bandwidth for the vMotion process on a fully utilized 10 Gigabit network adapter.
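The same arithmetic can be captured in a short worked example; the share values below are simply those from Table 2, and the calculation reproduces the numbers listed above.

```python
# Worked example of the NIOC shares arithmetic: bandwidth each traffic type
# would receive on a single, fully contended 10 Gigabit dvuplink.
NIC_CAPACITY_GBPS = 10

shares = {
    "Management": 5,
    "vMotion": 20,
    "FT": 10,
    "iSCSI": 20,
    "Virtual Machine": 20,
}

total_shares = sum(shares.values())  # 75

for traffic_type, share in shares.items():
    bandwidth_gbps = (share / total_shares) * NIC_CAPACITY_GBPS
    print(f"{traffic_type}: {share} shares -> {bandwidth_gbps:.2f} Gbps")

# Output matches the list above:
#   Management: 5 shares -> 0.67 Gbps (667 Mbps)
#   vMotion / iSCSI / Virtual Machine: 20 shares -> 2.67 Gbps each
#   FT: 10 shares -> 1.33 Gbps
```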
In this 10 Gigabit Ethernet deployment, customers can provide bigger pipes to individual traffic types without the use of trunking or multipathing technologies. This was not the case with the eight 1 Gigabit Ethernet deployment.
There is no change in the physical switch configuration in this design approach; refer to the physical switch settings described in design option 1 in the previous section.
Table 2 Dynamic design configuration
| Traffic Type    | Port Group | Teaming Option | Active Uplink        | Standby Uplink | NIOC Shares | NIOC Limits |
|-----------------|------------|----------------|----------------------|----------------|-------------|-------------|
| Management      | PG-A       | LBT            | dvuplink1, dvuplink2 | None           | 5           | –           |
| vMotion         | PG-B       | LBT            | dvuplink1, dvuplink2 | None           | 20          | –           |
| FT              | PG-C       | LBT            | dvuplink1, dvuplink2 | None           | 10          | –           |
| iSCSI           | PG-D       | LBT            | dvuplink1, dvuplink2 | None           | 20          | –           |
| Virtual Machine | PG-E       | LBT            | dvuplink1, dvuplink2 | None           | 20          | –           |
This design option utilizes the advanced VDS features and provides customers with a dynamic and flexible design approach. In this design, I/O resources are utilized effectively and service level agreements are met based on the shares allocation.
In the next blog entry I will talk about the Blade center deployments.