VMware vSphere Blog, Author Archives: Guy Brunsdon

IPv6 and vSphere 4.1

Those of you who read “What’s new in Networking in vSphere 4.1” will have noticed that vSphere 4.1 is currently undergoing IPv6 conformance testing for the NIST Host Profile. (A little bit of recent history: the IPv6-ready program under the US JITC and US DoD was cancelled and replaced by the US NIST/USGv6 conformance program.)

Anyway, vSphere 4.1 is now listed on the UNH site for USGv6. UNH still has to finish some of the final tests, and we still have to prepare our SDOC (Supplier’s Declaration of Conformity). You can see the current state of UNH USGv6 testing at http://www.iol.unh.edu/services/testing/ipv6/usgv6tested.php

Under the Covers with VMware FT (Fault Tolerance)

By any measure, FT (Fault Tolerance) is a ground-breaking technology. Introduced a year ago with vSphere 4.0, FT enables an application to continue uninterrupted even after a complete and catastrophic physical server failure.

But how does it work? What is the impact or requirement upon the network? Dan Scales, Mike Nelson, and Ganesh Venkitachalam, from the VMware engineering team that brought FT to life, recently published an in-depth, 27-page technical report on “The Design and Evaluation of a Practical System for Fault-Tolerant Virtual Machines.” The paper goes deep on the FT protocol, the implementation issues for network I/O and disk I/O, benchmarks, and an assortment of other topics. It really is a fascinating read.

Got Network I/O Control?

vSphere 4.1 launched today with a bunch of great new features and improved performance.

Network I/O Control (NetIOC) was the major feature enhancement for networking. NetIOC is a must-have for anyone using or considering 10GigE.

Why NetIOC?

NetIOC enables you to guarantee service levels (bandwidth) for particular vSphere traffic types. For example, you may be concerned about iSCSI (or NFS) bandwidth and latency when a vMotion or some other activity fires up; you may wish to protect your FT traffic from congestion; or, of course, you may wish to ensure your VM traffic meets minimum or required levels of service.

NetIOC can isolate and prioritize six traffic types:

  • VM traffic
  • FT Logging
  • iSCSI
  • NFS
  • Management
  • vMotion

Using the limits and shares parameters, you can tailor NetIOC precisely to the requirements of your environment.
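As a rough illustration of how shares and limits interact on a saturated link, here is a minimal Python sketch of proportional sharing. It is a simplified model and not VMware's actual scheduler: the `allocate` function, the share values, and the 1 Gbps vMotion limit are all hypothetical, and the model assumes every traffic type wants as much bandwidth as it can get.

```python
def allocate(link_mbps, flows):
    """Divide link_mbps among flows in proportion to their shares.

    flows maps a traffic type to (shares, limit_mbps); limit_mbps may be None.
    A flow never receives more than its limit; bandwidth it cannot use is
    redistributed among the remaining flows according to their shares.
    """
    alloc = {}
    pending = dict(flows)
    remaining = float(link_mbps)
    while pending:
        total_shares = sum(shares for shares, _ in pending.values())
        fair = {name: remaining * shares / total_shares
                for name, (shares, _) in pending.items()}
        # Flows whose limit is below their proportional slice get pinned to it.
        capped = [name for name, (_, limit) in pending.items()
                  if limit is not None and limit < fair[name]]
        if not capped:
            alloc.update(fair)
            break
        for name in capped:
            alloc[name] = float(pending[name][1])
            remaining -= alloc[name]
            del pending[name]
    return alloc


# Hypothetical settings for a single 10GigE uplink (10000 Mbps).
example = allocate(10000, {
    "vm":      (100, None),
    "ft":      (50, None),
    "nfs":     (50, None),
    "vmotion": (50, 1000),   # assumed 1 Gbps limit on vMotion
})
print(example)
```

In this model, shares only matter under contention: they set the ratio in which competing traffic types split the link, while a limit is an absolute ceiling regardless of how idle the link is.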

NetIOC in Action

Sree from our performance engineering group performed a myriad of benchmarks and tests to see the effects with and without NetIOC. (The paper is posted here.) One test involved running FT, NFS, and VM traffic, and then seeing what happens when a vMotion starts. You can see the effect in the diagram below. The aggregate bandwidth usage before vMotion is ~4-5 Gbps. The vMotion (which with vSphere 4.1 can consume up to 8 Gbps) oversubscribes the 10GigE NIC, causing all traffic types to suffer. With NetIOC enabled and appropriate shares values set, the critical traffic types are protected, with vMotion consuming what remains of the link bandwidth.

image

Here is another benchmark using NetIOC, this time with SPECweb2005. In this instance, without NetIOC, a vMotion causes 26% of the user sessions to fall below the service level requirements.

image

New Network Features

Of course, we released a few other network features in vSphere 4.1. An overview is presented in the “What’s new in VMware vSphere 4.1: Virtual Networking” paper on vmware.com.

  • Network I/O Control (NetIOC)—see above
  • Load Based Teaming (LBT)—dynamic balancing of VM vnics across a team according to load
  • IPv6 Enhancements–toward NIST “Host” Profile compliance
  • Improved VM-VM and vmkernel Network Performance—major performance improvements across the board (vMotion now tops 8Gbps!)

You can read all about NetIOC in this 26-page paper posted on vmware.com.

Using 10GigE with VMware vDS and Cisco Nexus 1000V

In a blog post last month titled, “vSphere loves 10GigE”, I mentioned a deployment paper was in the works.

The paper, titled “Deploying 10 Gigabit Ethernet on VMware vSphere 4.0 with Cisco Nexus 1000V and VMware vNetwork Distributed Switches”, is now posted on the Cisco website. We will follow suit in the coming week and post it at the usual place under resources at vmware.com/go/networking.

More on HP Virtual Connect

I wrote a blog entry earlier this year about using HP Flex-10 with vSphere … or perhaps I should say I wrote a pointer to an entry on Flex-10 written by Kenneth van Ditmarsch at virtualkenneth.com.

I thought I’d add a few more papers that cover Virtual Connect with vSphere…

FCoE Webcast coming up

My previous post was on why vSphere loves 10GigE. You can converge all those 1GigE links into a pair of 10GigE to not only improve your network performance but reduce the complexity of your infrastructure.

FCoE takes the reduction of complexity one step further by eliminating the HBA, Fibre Channel links, and adjacent Fibre Channel switches/directors, and carrying that traffic over lossless Ethernet.

In conjunction with Cisco and Emulex, we’ve been running a SAN Virtuosity series of webcasts with accompanying co-authored papers.

The next SAN Virtuosity webcast is on the topic of FCoE on Wednesday, June 23, 2010 at 9:00am PDT. This session will cover how you can converge your SAN and LAN with vSphere using FCoE. You can register here.

vSphere loves 10GigE

It didn’t seem so long ago that 10GigE was just “one of those things we’ll look at soon”. A show of hands at our recent internal TechSummit conference suggests almost every customer is either implementing 10GigE in production or kicking the 10GigE tires in their pre-production labs.

Why 10GigE?

10GigE has a couple of obvious advantages over 1GigE:

  • Bandwidth: it’s 10x the bandwidth. To ensure packets from a particular flow are not reordered, all the teaming policies hash traffic between a single source and destination to one vmnic (physical NIC) within the team. 10GigE provides more bandwidth for any traffic type or flow that would have been restricted to 1GigE, e.g. NFS, iSCSI, FT logging, vMotion, and individual VMs.
  • Management: two 10GigE links have to be easier to manage and deploy than 6, 8, 10, or more 1GigE links.
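To see why a single flow cannot spread across a team, consider this minimal Python sketch of a hash-based uplink choice. The XOR-and-modulo hash and the `pick_uplink` function are illustrative assumptions only, not the exact algorithm of any vSphere teaming policy; the point is simply that the same source/destination pair always maps to the same vmnic.

```python
import ipaddress


def pick_uplink(src_ip, dst_ip, uplinks):
    """Map a source/destination pair to one uplink in the team.

    Illustrative hash only: XOR the two addresses and take the result
    modulo the number of uplinks.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return uplinks[(src ^ dst) % len(uplinks)]


team = ["vmnic0", "vmnic1"]

# The same pair always lands on the same vmnic, so one flow is capped
# at the speed of a single physical NIC.
print(pick_uplink("10.0.0.5", "10.0.0.20", team))
print(pick_uplink("10.0.0.5", "10.0.0.20", team))

# A different destination may land on the other vmnic.
print(pick_uplink("10.0.0.5", "10.0.0.21", team))
```

Because the mapping is deterministic, packets within a flow stay in order, and the only way to give that flow more bandwidth is a faster NIC.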

2x 10GigE as the typical Deployment Scenario

At this point in time, almost all 10GigE deployments will use two 10GigE interfaces linking to a pair of physical switches (top-of-rack or end-of-row) with L2 continuity over all access VLANs between the two switches (so you’re not exposed to a single switch/linecard failure and can fail over to the other switch).

Converged Traffic?

In the not too distant past, VMware had guided customers to dedicate vmnics to each of the various traffic types. In the world of 10GigE, there is no need to continue with this methodology. VLANs provide logical separation and 10GigE interfaces provide sufficient bandwidth and a better performing way to handle the vmkernel and VM traffic loads from ESX and ESXi hosts.  

Switches

If you want to use teaming and have some protection against single points of failure then both NICs must be on a single vswitch (vSS, vDS, or Nexus 1000V). If you’re using VLANs for traffic separation (of course you are!), there really is no need for multiple vswitches anyway.

Traffic types and Teaming Policies

There is no one right way to deploy 10GigE; it will depend upon your environment. Some things to consider:

  • IP Storage (NFS and/or iSCSI): are you using these? How much bandwidth do you need, and how much will it consume if given 10GigE?
  • vMotion: a single vMotion can consume ~3.6Gbps, with a maximum of two running concurrently.
  • Service console or management interface: it doesn’t use much bandwidth at all, but must be available.
  • FT logging: requires a lot of bandwidth and low latency (10GigE helps FT a lot) as it replicates the read I/O traffic and ingress data traffic to the secondary FT VM. In the current implementation, FT can consume up to ~4Gbps but can consume much less if the FT workloads are low.
  • VM traffic: how much of it do you have? Are the VMs particularly bursty or heavy consumers of bandwidth? Note that in a 1GigE environment, each VM (assuming a single vnic) was capped at 1GigE ingress/egress.

You could just apply “Originating Virtual Port ID” to the teaming policy on all Port Groups and dvPort Groups and it would work just fine. But I prefer more deterministic control over the traffic flows. I like the method shown in the diagram below. This is applicable to vSS and vDS switches. The details are as follows:

  • VST (Virtual Switch Trunking) mode: trunk the required VLANs into the ESX/ESXi hosts over both 10GigE interfaces and ensure there is L2 continuity between the two switches on each of those VLANs.
  • VM portgroups (or dvPortgroups): active on one vmnic and standby on the other (vmnic0/vmnic1 in my example).
  • vmkernel portgroups (or dvPortgroups): active on one vmnic and standby on the other, in reverse to that for the VMs (i.e. vmnic1/vmnic0 in my example).

With both NICs active, this means that all VM traffic will use vmnic0, and all the vmkernel ports will use vmnic1. If there is a failure, then all traffic will converge onto the remaining vmnic. (Note that when using a vDS, dvPortgroup teaming policies apply to the dvUplinks, which then map to the vmnics on each host.)
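The active/standby arrangement can be modeled in a few lines of Python. This is only a sketch of the failover logic described above: the port group names and the `uplink_in_use` helper are hypothetical, and the real teaming configuration lives in the port group (or dvPortgroup) settings.

```python
# Failover order per port group: first entry is active, second is standby.
# Port group names are hypothetical.
TEAMING = {
    "VM Network": ["vmnic0", "vmnic1"],
    "vMotion":    ["vmnic1", "vmnic0"],
    "FT Logging": ["vmnic1", "vmnic0"],
    "IP Storage": ["vmnic1", "vmnic0"],
}


def uplink_in_use(failover_order, failed=()):
    """Return the first healthy vmnic in the failover order, or None."""
    for nic in failover_order:
        if nic not in failed:
            return nic
    return None


# Normal operation: VM traffic on vmnic0, vmkernel traffic on vmnic1.
for portgroup, order in TEAMING.items():
    print(portgroup, "->", uplink_in_use(order))

# vmnic1 fails: everything converges onto vmnic0.
for portgroup, order in TEAMING.items():
    print(portgroup, "->", uplink_in_use(order, failed={"vmnic1"}))
```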

Traffic Shaping?

As an additional option, you can employ the traffic shaper to control or limit the traffic on any one port. Moving all vmkernel ports to one vmnic means you can control and apply this more effectively. You only need to use this if you are concerned about one traffic type dominating others. Since vMotion, management traffic, and FT logging are effectively capped, this really only concerns iSCSI and NFS. So you may wish to apply the shaper to one or other of these to limit its effect on vMotion, FT and management. 

The traffic shaper is configured on the port group (or dvPortgroup). On the vSS, the shaper only applies to ingress traffic (relative to vswitch from VM or vmkernel port). In other words, it works in the southbound direction in my diagram. The vDS has bi-directional traffic shaping. You only need to apply it on the ingress (southbound) side.

Try it in your lab. In most cases I doubt you would need it, but it’s there. Do the math on the competing traffic types to work out your average and maximum bandwidth levels. 4Gbps is probably a good start as an average for iSCSI or NFS (which is way more than the 1Gbps you had before).
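Doing the math can be as simple as adding up the worst-case demand on the vmkernel vmnic. The Python sketch below uses the figures quoted in this post (two concurrent vMotions at ~3.6Gbps each, FT at up to ~4Gbps, 4Gbps for IP storage); the 0.1Gbps management figure is an assumption added for illustration.

```python
LINK_GBPS = 10.0

# Worst-case demand per traffic type on the vmkernel vmnic, in Gbps.
peak_gbps = {
    "vmotion":    2 * 3.6,   # two concurrent vMotions
    "ft_logging": 4.0,
    "ip_storage": 4.0,       # suggested starting average for iSCSI or NFS
    "management": 0.1,       # assumed; management traffic is small
}


def oversubscription(demands, link_gbps):
    """Ratio of total peak demand to link capacity (above 1.0 is oversubscribed)."""
    return sum(demands.values()) / link_gbps


total = sum(peak_gbps.values())
ratio = oversubscription(peak_gbps, LINK_GBPS)
print(f"peak demand {total:.1f} Gbps on a {LINK_GBPS:.0f} Gbps link, ratio {ratio:.2f}")
```

A ratio above 1.0 only matters if all the peaks coincide, which is why shaping is an option rather than a requirement.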

Note that you should not configure an average or maximum traffic shaping limit above 4Gbps (2^32). It’s a 32-bit field, so a 4G modulus is applied to whatever value you enter, e.g. 5Gbps will end up as roughly 1Gbps.
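The wraparound is ordinary unsigned 32-bit arithmetic. The Python sketch below assumes the limit is held in bits per second in an unsigned 32-bit field, which is an interpretation of the description above rather than a documented detail; `effective_limit_bps` is a hypothetical helper.

```python
FIELD_MODULUS = 2 ** 32  # an unsigned 32-bit field holds values 0 .. 2^32 - 1


def effective_limit_bps(requested_bps):
    """Value that survives after the request is truncated to 32 bits."""
    return requested_bps % FIELD_MODULUS


GIBI = 2 ** 30

# Five binary "G" wraps to exactly one binary "G".
print(effective_limit_bps(5 * GIBI) // GIBI)

# With decimal units the result is a little lower.
print(effective_limit_bps(5_000_000_000))

# Anything below 2^32 passes through unchanged.
print(effective_limit_bps(4_000_000_000))
```

Either way, the configured value silently becomes much smaller than intended, so staying at or below 4Gbps is the safe choice.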

 

 image  

What if I’m using the Nexus 1000V?

There is absolutely no issue in using 10GigE with the Cisco Nexus 1000V. It’s a good thing for all the same reasons I mention above. When using the Nexus 1000V with Nexus 5000 top-of-rack switches, you should consider using vPC mode. This logically aggregates the Nexus 5000 switches. If desired, you can also apply traffic shaping to individual ports via the port profile configuration.

Additional Guidance

We are preparing some documents that expand on what is outlined in this blog entry. We will go into a bit more detail around the configuration of the virtual and physical switches. You can expect to see these in the next few weeks.

Summary

My summary is quite short. If you have 10GigE, go ahead and use it, and enjoy the extra performance and management benefits.

Security Hardening Guide for vSphere 4.0 Published

The final release of the vSphere 4.0 Security Hardening Guide has hit the streets. This is a comprehensive 100+ page guide with more than 100 guidelines on hardening vSphere 4.0 components in a production environment.

You can read Charu’s post over on the VMware Security blog, or go directly here to download the paper from the VMware Communities site.   

Jumbo Frames in vSphere 4.0

In vSphere 4.0, we introduced support for jumbo frames on vmkernel interfaces with ESX 4.0 and ESXi 4.0. This meant you could use jumbo frames for iSCSI, NFS, FT, and vMotion in both releases. Unfortunately, we had a minor documentation bug that stated jumbo frames were not supported in ESXi. This has since been corrected. You can find the updated docs for ESXi 4.0 here and ESXi 4.0 Update 1 here.

What is a jumbo frame?

A jumbo frame is an Ethernet frame with a “payload” greater than 1500 bytes and up to ~9000 bytes. The maximum payload size is known as the MTU (Maximum Transmission Unit). The payload is what is carried in the frame, so a standard (non-jumbo) Ethernet frame with an MTU of 1500 could be up to 1522 bytes in length. The additional 22 bytes comprise the destination MAC address (6 bytes), source MAC address (6 bytes), optional 802.1Q VLAN header (4 bytes), ethertype (2 bytes), and the 4-byte CRC32 trailer.

9000 bytes is generally accepted as the maximum size for a jumbo frame; however, I’ve seen some Cisco switches with MTUs of 9216 bytes.
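The frame-size arithmetic above can be checked with a short Python sketch. The header sizes are the ones listed in the paragraph; `max_frame_bytes` is a hypothetical helper name.

```python
# Ethernet framing overhead in bytes, as itemized above.
DST_MAC = 6
SRC_MAC = 6
VLAN_TAG = 4    # optional 802.1Q header
ETHERTYPE = 2
CRC32 = 4       # trailer


def max_frame_bytes(mtu, vlan_tagged=True):
    """Largest frame on the wire for a given MTU (payload size)."""
    overhead = DST_MAC + SRC_MAC + ETHERTYPE + CRC32
    if vlan_tagged:
        overhead += VLAN_TAG
    return mtu + overhead


print(max_frame_bytes(1500))                     # standard frame with VLAN tag
print(max_frame_bytes(1500, vlan_tagged=False))  # standard frame, untagged
print(max_frame_bytes(9000))                     # jumbo frame with VLAN tag
```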

Why use jumbo frames?

It’s a case of getting maximum bang for the buck. Processing overhead is proportional to the number of frames; so if you can pack as much as possible into each frame, you will have less overhead and better top end performance. Most modern data center grade physical switches will switch line rate right down to the minimum frame size of 64 bytes, so the main impact is seen at the source and destination systems.   
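To put a number on the frame count, here is a small Python sketch comparing how many frames are needed to move the same amount of data at each MTU. It assumes TCP over IPv4 with no options (40 bytes of headers inside each payload), which is a simplification; `frames_needed` is a hypothetical helper.

```python
IP_TCP_HEADERS = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options


def frames_needed(data_bytes, mtu):
    """Number of frames required to carry data_bytes of application data."""
    per_frame = mtu - IP_TCP_HEADERS
    return (data_bytes + per_frame - 1) // per_frame  # integer ceiling


one_gib = 2 ** 30
standard = frames_needed(one_gib, 1500)
jumbo = frames_needed(one_gib, 9000)
print(standard, jumbo, round(standard / jumbo, 2))
```

With these assumptions, jumbo frames cut the number of frames the hosts have to process by roughly a factor of six.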

Using Jumbo Frames

Everything in the end-to-end network path has to be capable of handling the frame size thrown at it. So if you enable jumbo frames (MTU = 9000) on ESX, you have to be sure that every physical switch and the other end(s) can handle that sized frame. Layer 2 switches will just drop jumbo frames if they are not configured for them. L3 switches/routers can “fragment” larger packets into smaller ones for reassembly at the destination, but it can cause a huge performance hit. Don’t rely on IP fragmentation. If you’re going to use jumbo frames, make sure everything in the path is configured for it.
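The end-to-end rule boils down to this: the usable MTU of a path is the smallest MTU of any hop on it. A minimal Python sketch, using a made-up inventory of hops (all names and values are hypothetical):

```python
def path_mtu(hops):
    """Usable MTU of a path: the smallest MTU configured on any hop."""
    return min(hops.values())


def undersized_hops(hops, required_mtu):
    """Names of hops that would drop or fragment frames of required_mtu."""
    return sorted(name for name, mtu in hops.items() if mtu < required_mtu)


# Hypothetical path from a vmkernel port to an NFS filer.
hops = {
    "vmknic":       9000,
    "vswitch":      9000,
    "tor-switch-a": 1500,   # jumbo frames not enabled here
    "nfs-filer":    9000,
}

print(path_mtu(hops))
print(undersized_hops(hops, 9000))
```

One misconfigured device is enough to break jumbo frames for the whole path, so it is worth auditing every hop before changing the MTU on the hosts.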

Enabling jumbo frames with ESX 4.0 and ESXi 4.0     

As stated above, you have to enable jumbo frames end-to-end. In ESX/ESXi do the following:

1. Enable jumbo frames on the virtual switch (set the MTU on the uplinks/physical NICs)

  • For vSS (standard vSwitch) you need to use the vSphere CLI. For example, this CLI command will set the MTU to 9000 bytes for the vSS named “vswitch0”:
    vicfg-vswitch -m 9000 vswitch0
    Use “vicfg-vswitch -l” to list the vswitches and their properties
  • For vDS (vNetwork Distributed Switch), you can set the MTU via the vSphere Client UI. From the Networking inventory menu, select the vDS and then “Edit Settings”. Set the “Maximum MTU” to the desired MTU (e.g. 9000B is most likely for jumbo).

2.  Enable jumbo frames on the vmkernel ports

  • Use the esxcfg-vmknic command to delete and then add a vmkernel interface with an MTU of 9000. On ESXi, there seems to be a glitch in creating a vmkernel port on a vDS through the vcli, so the workaround is to create a vmkernel interface with MTU 9000 on a standard switch and then migrate it over to the vDS through the vSphere Client.

    You can get the status (name/address/mask/MAC addr/MTU) of the vmkernel interfaces via
    esxcfg-vmknic -l

3. Enable jumbo frames on the physical switches

  • This will depend upon the make/type of switch, but remember to enable end-to-end for the traffic type in use.

To enable jumbo frames for guest VMs, use the Enhanced VMXNET or VMXNET3 virtual nics and enable jumbos through the guest OS.

F5 Accelerates Long Distance vMotion

At VMworld last year, F5 previewed and demonstrated their Long Distance vMotion solution. Since then, F5 has worked on integrating the solution into their BIG-IP product and leveraging their new WAN Optimization Module (WOM).

What does this mean for VMware customers?

vMotion over distance on a traditional WAN is constrained by bandwidth and latency. Any packet loss will further throttle throughput as TCP congestion avoidance kicks in. In short, distance and bandwidth constrain how far you can go and how much you can pump through the link (and how many VMs you can migrate in a unit of time).

The F5 solution does a few things: it encrypts and compresses the data (i.e. fewer bits to send, and it’s secure!), and it optimizes the transmission so it’s less susceptible to packet loss. The BIG-IP solution also automatically redirects the session to the new site.

The end result is a huge acceleration and increased success rate for Long Distance vMotions. Nojan published a table of results on the F5 DevCentral blog that showed a 3.1x to 4.7x improvement in time. For example, vMotion time over a 1Gbps link with 20ms RTT dropped from 2:38 to 0:38. Another test over a tiny T3 link with 100ms RTT (that’s a loooonnng way) dropped from 13:43 to 3:35, with an increase in reliability of vMotion completions from <50% to 100%.
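The latency constraint follows from how TCP works: a sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. The Python sketch below illustrates that bound; the 64KB window is a hypothetical value (classic TCP without window scaling) and says nothing about how vMotion's own TCP stack is tuned.

```python
def max_tcp_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on single-connection TCP throughput: window / RTT."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def bandwidth_delay_product_bytes(link_mbps, rtt_ms):
    """Bytes that must be in flight to keep the link full."""
    return link_mbps * 1e6 * (rtt_ms / 1000) / 8


WINDOW = 65535  # assumed window size in bytes

for rtt in (1, 20, 100):
    print(f"RTT {rtt:>3} ms: at most {max_tcp_throughput_mbps(WINDOW, rtt):.1f} Mbps")

# Window needed to fill a 1Gbps link at 20ms RTT.
print(bandwidth_delay_product_bytes(1000, 20))
```

Under these assumptions, the same connection that could fill a LAN link is held to a few Mbps at 100ms RTT, regardless of how much bandwidth the WAN link offers.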
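The two quoted examples can be checked against the 3.1x to 4.7x range with a few lines of Python. The timings are the ones given above; `to_seconds` and `speedup` are hypothetical helper names.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '2:38' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(before, after):
    """How many times faster the 'after' time is than the 'before' time."""
    return to_seconds(before) / to_seconds(after)


# 1Gbps link, 20ms RTT
print(round(speedup("2:38", "0:38"), 2))

# T3 link, 100ms RTT
print(round(speedup("13:43", "3:35"), 2))
```

Both results fall inside the published 3.1x to 4.7x range.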

More info

F5 has produced a raft of information to help you understand and implement the solution.