posted

12 Comments

Introduction

In vSphere 5.0, VMware introduced a new software FCoE (Fibre Channel over Ethernet) adapter. This means that if you have a NIC which supports partial FCoE offload, this adapter will allow you to access LUNs over FCoE without needing a dedicated HBA or third party FCoE drivers installed on the ESXi host.

The FCoE protocol encapsulates Fibre Channel frames into Ethernet frames. As a result, your host can use 10 Gbit lossless Ethernet to deliver Fibre Channel traffic. The lossless part is important and I'll return to that later.

Configuration Steps

A Software FCoE Adapter is software code that performs some of the FCoE processing. The adapter can be used with a number of NICs that support partial FCoE offload. Unlike the hardware FCoE adapter, the software adapter needs to be activated on an ESXi 5.0, similar to how the Software iSCSI adapter is enabled. Go to Storage Adapters in the vSphere client and click 'Add':

Picture1

To use the Software FCoE Adapter, the NIC used for FCoE must be bound as an uplink to a vSwitch  which contains a VMkernel portgroup (vmk). Since FCoE packets are exchanged in a VLAN, the VLAN ID must be set on the physical switch to which the adapter is connected, not on the adapter itself. The VLAN ID is automatically discovered during the FCoE Initialization Protocol (FIP) VLAN discovery process so there is no need to set the VLAN ID manually.

Picture2

 

Enhancing Standard Ethernet to handle FCoE

For Fiber Channel to work over Ethernet, there are a number of criteria which must be addressed, namely handling losslessness/congestion and bandwidth management.

1. Losslessness & Congestion: Fiber Channel is a lossless protocol, so no frames can be lost. Since classic Ethernet has no flow control, unlike Fibre Channel, FCoE requires enhancements to the Ethernet standard to support a flow control mechanism to prevents frame loss. One of the problems with Ethernet networks is that when a congestion condition arises, packets will be dropped (lost) if there is no adequate flow control mechanism. A flow control method that is similar to the buffer to buffer credit method on Fiber Channel is needed for FCoE.

FCoE uses a flow control PAUSE mechanism, similar to buffer-to-buffer credits in FC, to ask a transmitting device to hold off sending any more frames until the receiving device says its ok to resume. However, the PAUSE mechanism is not intelligent and could possibly pause all traffic, not just FCoE traffic. To overcome this, the quality of service (QoS) priority bit in the VLAN tag of the Ethernet frame is used to differentiate the traffic types on the network. Ethernet can now be thought of as being divided into 8 virtual lanes based on the QoS priority bit in the VLAN tag.

Picture3

Different policies such as losslessness, bandwidth allocation and congestion control can be applied these virtual lanes individually. If congestion arises and there is a need to ‘pause’ the Fiber Channel traffic (i.e. the target is busy processing, and wants the source to hold off sending any more frames), then there must be a way of pausing the FC traffic without impacting other network traffic using the wire. Let’s say that FCoE, VM traffic, VMotion traffic, management traffic & FT traffic are all sharing the same 10Gb pipe. If we have congestion with FCoE, we may want to pause it. However, we don’t want to pause all traffic, just FCoE. With standard ethernet, this is not possible – you have to pause everything. So we need an enhancement to pause one class of traffic while allowing the rest of the traffic to flow. PFC (Priority based Flow Control) is an extension of the current Ethernet pause mechanism, sometimes called Per-Priority PAUSE. using per priority pause frames. This way we can pause traffic with a specific priority and allow all other traffic to flow (e.g. pause FCoE traffic while allowing other network traffic to flow).

2. Bandwidth: there needs to be a mechanism to reduce or increase bandwidth. Again, with a 10Gb pipe, we want to be able to use as much of the pipe as possible when other traffic classes are idle. For instance, if we’ve allocated 1Gb of the 10Gb pipe for vMotion traffic, we want this to be available to other traffic types when there are no vMotion operations going on, and similarly we want to dedicate it to vMotion traffic when there are vMotions. Again, this is not achievable with standard ethernet so we need some way of implementing this. ETS (Enhanced Transmission Selection) provides a means to allocate bandwidth to traffic that has a particular priority. The protocol supports changing the bandwidth dynamically.

DCBX – Data Center Bridging Exchange

Data Center Bridging Exchange (DCBX) is a protocol that allows devices to discover & exchange their capabilities with other attached devices. This protocol ensures a consistent configuration across the network. The three purposes of DCBX are:
 

  1. Discover Capabilities: The ability for devices to discover and identify capabilities of other devices.  

  2. Identify misconfigurations: The ability to discover misconfigurations of features between devices.  Some features can be configured differently on each end of a link whilst other features must be configured identically on both sides. This functionality allows detection of configuration errors. 

  3. Configuration of Peers: A capability allowing DCBX to pass configuration information to a peer.

DCBX relies on Link Layer Discovery Protocol (LLDP) in order to pass this configuration information. LLDP is an industry standard version of Cisco Discovery Protocol (CDP) which allows devices to discover one another and exchange information about basic capabilities. This is why we need to bind a VMkernel port to a vSwitch.  Frames will be forwarded to the userworld dcdb process via the CDP VMkernel module to do DCBX. The CDP VMkernel module does both CDP & LLDP. This is an important point to note – the FCoE traffic does not go through the vSwitch. However we need the vSwitch binding so that the frames will be forwarded to the dcbd in userworld through the CDP VMkernel module for DCBx negotiation.  The vSwitch is NOT for FCoE data traffic.

Once the Software FCoE is enabled, a new adapter is created, and discovery of devices can now take place.

Picture4

 

References:

  • Priority Flow Control (PFC) – IEEE 802.1Qbb – Enable multiple traffic types to share a common Ethernet link without interfering with each other

  • Enhanced Transmission Selection (ETS) – IEEE 802.1Qaz – Enable consistent management of QoS at the network level by providing consistent scheduling

  • Data Center Bridging Exchange Protocol (DCBX) – IEEE 802.1Qaz – Management protocol for enhanced Ethernet capabilities

  • VLAN tag – IEEE 802.1q

  • Priority tag – IEEE 802.1p

 

Troubleshooting

The software FCoE adapter can be troubleshooted using a number of techniques available in ESXi 5.0. First there is the esxcli fcoe command namespace that can be used to get adapter & nic information:

Esxcli-fcoe
The dcbd daemon on the ESXi host will record any errors with discovery or communication. This daemon logs to /var/log/syslog.log. For increased verbosity, dcdb can be run with a '-v' option. There is also a proc node created when the software FCoE adapter is created, and this can be found in /proc/scsi/fcoe/<instance>. This contains interesting information about the FCoE devices discovered, and this proc node should probably be the starting point for an FCoE troubleshooting. Another useful utility is ethtool. When this command is run with the '-S' option against a physical NIC, it will display statistics including fcoe errors & dropped frames. Very useful.

Get notification of these blogs postings and more VMware Storage information by following me on Twitter: Twitter @VMwareStorage