vSphere loves 10GigE

It doesn’t seem so long ago that 10GigE was just “one of those things we’ll look at soon”. A show of hands at our recent internal TechSummit conference suggests almost every customer is either implementing 10GigE in production or kicking the 10GigE tires in a pre-production lab.

Why 10GigE?

10GigE has a couple of obvious advantages over 1GigE:

  • Bandwidth—it’s 10x the bandwidth. To ensure packets within a flow are not reordered, every teaming policy hashes a given source/destination pair to a single vmnic (physical NIC) within the team (see the hashing sketch after this list), so any single flow is limited to the bandwidth of one physical NIC. 10GigE therefore provides more headroom for any traffic type or flow that would have been restricted to 1GigE, e.g. NFS, iSCSI, FT logging, vMotion, and individual VMs.
  • Management—two 10GigE links are easier to manage and deploy than six, eight, ten, or more 1GigE links
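
To see why a single flow never exceeds one uplink’s bandwidth, here is a purely illustrative Python sketch of the principle: a source/destination-based teaming hash pins each flow to one vmnic in the team. The hash function, team layout, and endpoint names are my own simplified assumptions, not the actual ESX/ESXi implementation.

```python
# Illustrative model only: how a teaming policy hashes a flow to one vmnic.
# The real ESX/ESXi hash functions differ; this just shows that a given
# source/destination pair always lands on the same physical NIC.
import zlib

team = ["vmnic0", "vmnic1"]  # a 2x 10GigE team

def uplink_for_flow(src: str, dst: str) -> str:
    """Deterministically map a flow (source, destination) to one vmnic."""
    key = f"{src}->{dst}".encode()
    return team[zlib.crc32(key) % len(team)]

# Every packet of this NFS flow uses the same uplink, so it is never
# reordered -- but it also can never use more than that one link's bandwidth.
print(uplink_for_flow("vmk1", "nfs-filer.lab.local"))   # always the same vmnic
print(uplink_for_flow("vm-web01", "client.lab.local"))  # may hash elsewhere
```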

2x 10GigE as the Typical Deployment Scenario

At this point, almost all 10GigE deployments use two 10GigE interfaces linking to a pair of physical switches (top-of-rack or end-of-row) with L2 continuity over all access VLANs between the two switches, so you are not exposed to a single switch or linecard failure and can fail over to the other switch.

Converged Traffic?

In the not-too-distant past, VMware guided customers to dedicate vmnics to each of the various traffic types. In the world of 10GigE, there is no need to continue with this methodology. VLANs provide logical separation, and 10GigE interfaces provide sufficient bandwidth and a better-performing way to handle the vmkernel and VM traffic loads from ESX and ESXi hosts.

Switches

If you want to use teaming and have some protection against single points of failure, then both NICs must be on a single vswitch (vSS, vDS, or Nexus 1000V). If you’re using VLANs for traffic separation (of course you are!), there really is no need for multiple vswitches anyway.
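
As a concrete illustration, here is a hedged pyVmomi sketch that puts both 10GigE vmnics on one vswitch and carves out VLAN-tagged portgroups for the different traffic types. Treat it as a sketch rather than a finished script: the host name, credentials, vswitch name, VLAN IDs, and portgroup names are lab placeholders, and it assumes a recent pyVmomi and an existing vSwitch1.

```python
# Sketch: one vswitch with both 10GigE uplinks, VLAN-tagged portgroups on top.
# Host name, credentials, VLAN IDs, and portgroup names are lab placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="esx01.lab.local", user="root", pwd="***",
                  sslContext=ssl._create_unverified_context())  # lab only
content = si.RetrieveContent()
host = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True).view[0]
net_sys = host.configManager.networkSystem

# Give vSwitch1 both 10GigE vmnics as uplinks (assumes vSwitch1 already
# exists; use AddVirtualSwitch with the same spec if it does not).
vss_spec = vim.host.VirtualSwitch.Specification(
    numPorts=128,
    bridge=vim.host.VirtualSwitch.BondBridge(nicDevice=["vmnic0", "vmnic1"]))
net_sys.UpdateVirtualSwitch(vswitchName="vSwitch1", spec=vss_spec)

# VLAN-tagged portgroups give logical separation on the shared 10GigE links.
for name, vlan in [("Management", 10), ("vMotion", 20),
                   ("NFS", 30), ("FT", 40), ("VM-Prod", 100)]:
    pg_spec = vim.host.PortGroup.Specification(
        name=name, vlanId=vlan, vswitchName="vSwitch1",
        policy=vim.host.NetworkPolicy())  # inherit vswitch-level policy
    net_sys.AddPortGroup(portgrp=pg_spec)

Disconnect(si)
```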

Traffic Types and Teaming Policies

There is no one right way to deploy 10GigE—it will depend on your environment. Some things to consider:

  • IP Storage—NFS and/or iSCSI—are you using these? How much bandwidth do you need, and how much would they consume if given 10GigE?
  • vMotion—a single vMotion can consume ~3.6Gbps, with a maximum of two running concurrently.
  • Service console or management interface—it doesn’t use much bandwidth at all, but it must always be available.
  • FT logging—requires a lot of bandwidth and low latency (10GigE helps FT a lot), as it replicates the read I/O traffic and ingress data traffic to the secondary FT VM. In the current implementation, FT can consume up to ~4Gbps, but it will consume much less if the FT workloads are light.
  • VM traffic—how much of it do you have? Are your VMs particularly bursty or heavy consumers of bandwidth? Note that in a 1GigE environment, each VM (assuming a single vnic) was capped at 1GigE ingress/egress. (The sketch after this list pulls these numbers into a rough per-uplink budget.)
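
To make that budgeting concrete, here is a small back-of-the-envelope calculation in Python. The vMotion, FT, and management figures come from the list above; the NFS figure is an assumption you would replace with your own measurements.

```python
# Rough 10GigE bandwidth budget per uplink (Gbps). The vMotion, FT, and
# management figures are from the list above; NFS is an assumed value --
# substitute your own observed peaks.
link_gbps = 10.0

peaks = {
    "vMotion (2 concurrent x ~3.6)": 2 * 3.6,
    "FT logging (worst case)":       4.0,
    "Management":                    0.1,   # assumption: negligible
    "NFS datastore traffic":         4.0,   # assumption
}

total = sum(peaks.values())
print(f"Sum of worst-case peaks: {total:.1f} Gbps vs {link_gbps:.0f} Gbps link")
print("Oversubscribed at worst case" if total > link_gbps
      else "Fits within one uplink")
# With the reversed active/standby design described below, these vmkernel
# flows normally share one 10GigE uplink while VM traffic uses the other, so
# contention only matters during a link failure or simultaneous worst-case peaks.
```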

You could just apply “Originating Virtual Port ID” as the teaming policy on all Port Groups and dvPort Groups and it would work just fine. But I prefer more deterministic control over the traffic flows, so I like the method shown in the diagram below. It applies to both vSS and vDS switches. The details are as follows:

  • VST (Virtual Switch Trunking) mode—trunk the required VLANs into the ESX/ESXi hosts over both 10GigE interfaces and ensure there is L2 continuity between the two switches on each of those VLANs.
  • VM portgroups (or dvPortgroups)—active on one vmnic and standby on the other (vmnic0/vmnic1 in my example)
  • vmkernel portgroups (or dvPortgroups)—active on one vmnic and standby on the other, reversed from the VM portgroups (i.e. vmnic1/vmnic0 in my example)

With both NICs up, all VM traffic uses vmnic0 and all the vmkernel ports use vmnic1. If there is a failure, all traffic converges onto the remaining vmnic. (Note that when using a vDS, dvPortgroup teaming policies apply to the dvUplinks, which then map to the vmnics on each host.)
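
Here is a hedged pyVmomi sketch of that reversed active/standby layout on a vSS, continuing from the net_sys handle and the placeholder portgroups created in the earlier sketch. It keeps the “Originating Virtual Port ID” policy (“loadbalance_srcid” in the API) and simply overrides the NIC order per portgroup; portgroup names, VLAN IDs, and the vswitch name remain my lab placeholders.

```python
# Continuing from the earlier sketch: net_sys is the host's
# vim.host.NetworkSystem, and the portgroups already exist on vSwitch1.
from pyVmomi import vim

def set_nic_order(pg_name, vlan, active, standby):
    """Override the teaming NIC order on one portgroup of vSwitch1."""
    teaming = vim.host.NetworkPolicy.NicTeamingPolicy(
        policy="loadbalance_srcid",           # Originating Virtual Port ID
        notifySwitches=True,
        rollingOrder=False,
        nicOrder=vim.host.NetworkPolicy.NicOrderPolicy(
            activeNic=active, standbyNic=standby))
    spec = vim.host.PortGroup.Specification(
        name=pg_name, vlanId=vlan, vswitchName="vSwitch1",
        policy=vim.host.NetworkPolicy(nicTeaming=teaming))
    net_sys.UpdatePortGroup(pgName=pg_name, portgrp=spec)

# VM portgroups: active on vmnic0, standby on vmnic1.
set_nic_order("VM-Prod", 100, ["vmnic0"], ["vmnic1"])

# vmkernel portgroups: the reverse -- active on vmnic1, standby on vmnic0.
for name, vlan in [("Management", 10), ("vMotion", 20), ("NFS", 30), ("FT", 40)]:
    set_nic_order(name, vlan, ["vmnic1"], ["vmnic0"])
```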

Traffic Shaping?

As an additional option, you can employ the traffic shaper to control or limit the traffic on any one port. Moving all the vmkernel ports to one vmnic means you can apply this control more effectively. You only need to use it if you are concerned about one traffic type dominating others. Since vMotion, management traffic, and FT logging are effectively capped, this really only concerns iSCSI and NFS, so you may wish to apply the shaper to one or the other of these to limit its effect on vMotion, FT, and management.

The traffic shaper is configured on the port group (or dvPortgroup). On the vSS, the shaper applies only to ingress traffic (relative to the vswitch, i.e. traffic coming into the vswitch from a VM or vmkernel port); in other words, it works in the southbound direction in my diagram. The vDS has bi-directional traffic shaping, but you only need to apply it on the ingress (southbound) side.
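
As a sketch of what that looks like through the API (again continuing from the net_sys handle and placeholder “NFS” portgroup in the earlier sketches), you can enable the shaper on the portgroup carrying the IP-storage vmkernel port. The class and field names follow the vSphere API’s HostNetworkTrafficShapingPolicy, whose bandwidth values are in bits per second; the 4Gbps average and the burst size are just starting-point assumptions.

```python
# Continuing from the earlier sketch: net_sys and the "NFS" portgroup exist.
# Shape the NFS vmkernel traffic to ~4 Gbps average (values in bits per second).
from pyVmomi import vim

shaping = vim.host.NetworkPolicy.TrafficShapingPolicy(
    enabled=True,
    averageBandwidth=4_000_000_000,   # 4 Gbps average
    peakBandwidth=4_000_000_000,      # allow bursts up to 4 Gbps
    burstSize=100 * 1024 * 1024)      # 100 MB burst size (assumption)

# Note: UpdatePortGroup replaces the whole portgroup spec, so in practice you
# would carry the teaming override from the previous sketch into this same
# NetworkPolicy object rather than letting it revert to the vswitch default.
spec = vim.host.PortGroup.Specification(
    name="NFS", vlanId=30, vswitchName="vSwitch1",
    policy=vim.host.NetworkPolicy(shapingPolicy=shaping))
net_sys.UpdatePortGroup(pgName="NFS", portgrp=spec)
```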

Try it in your lab. In most cases I doubt you would need it, but it’s there. Do the math on the competing traffic types to work out your average and maximum bandwidth levels. 4Gbps is probably a good start as an average for iSCSI or NFS (which is way more than the 1Gbps you had before).

Note that you should not configure an average or peak traffic shaping limit above 4Gbps (2^32 bits per second). It’s a 32-bit field, so whatever value you enter is wrapped modulo 4G; e.g. 5Gbps will end up as roughly 1Gbps.
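
A quick bit of arithmetic shows the wraparound, assuming (per the note above) that the limit is stored internally as a 32-bit count of bits per second; the “5G” here uses 2^30 per G purely to mirror the example.

```python
# 32-bit wraparound of the traffic shaper limit (values in bits per second).
LIMIT = 2**32                      # 4,294,967,296 -- roughly "4G"

requested = 5 * 2**30              # a "5 Gbps" limit entered by the admin
effective = requested % LIMIT      # what a 32-bit field actually stores
print(f"{requested} -> {effective}")   # 5368709120 -> 1073741824, i.e. ~1 Gbps
```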

[Diagram: 2x 10GigE host design showing VM and vmkernel portgroups with reversed active/standby vmnics, uplinked to a pair of physical switches]

What if I’m using the Nexus 1000V?

There is absolutely no issue in using 10GigE with the Cisco Nexus 1000V; it’s a good thing for all the same reasons mentioned above. When using the Nexus 1000V with Nexus 5000 top-of-rack switches, you should consider using vPC (virtual PortChannel) mode, which logically aggregates the two Nexus 5000 switches. If desired, you can also apply traffic shaping to individual ports via the port-profile configuration.

Additional Guidance

We are preparing some documents that expand on what is outlined in this blog entry. We will go into a bit more detail around the configuration of the virtual and physical switches. You can expect to see these in the next few weeks.

Summary

My summary is quite short. If you have 10GigE, go ahead and use it, and enjoy the extra performance and management benefits.