
Tag Archives: iscsi

New iSCSI Best Practices White Paper Available

A new iSCSI best practices white paper is now available. The paper looks at all aspects of using iSCSI in a vSphere environment. It was created with the assistance of our partners from DELL & HP, and aims to reach common agreement on what the best practices are. It discusses networking configuration options, interoperability with other vSphere components and advanced settings. You can download the white paper from the VMware Technical Resources site here.

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

Storage Protocol Comparison – A vSphere Perspective

On many occasions I’ve been asked for an opinion on the best storage protocol to use with vSphere. And my response is normally something along the lines of ‘VMware supports many storage protocols, with no preferences really given to any one protocol over another’. To which the reply is usually ‘well, that doesn’t really help me make a decision on which protocol to choose, does it?’

And that is true – my response doesn’t really help customers to make a decision on which protocol to choose. To that end, I’ve decided to put together a storage protocol comparison document on this topic. It looks at the protocols purely from a vSphere perspective; I’ve deliberately avoided performance, for two reasons:

  1.  We have another team in VMware who already does this sort of thing.
  2.  Storage protocol performance can be very different depending on who the storage array vendor is, so it doesn’t make sense to compare iSCSI & NFS from one vendor when another vendor might have a much better implementation of one of the protocols.

If you are interested in performance, there are links to a few performance comparison docs included at the end of the post.

Hope you find it useful.

vSphere Storage Protocol Comparison Guide

 

Each category below compares the four protocols: iSCSI, NFS, Fibre Channel (FC) and Fibre Channel over Ethernet (FCoE).

Description

iSCSI: iSCSI presents block devices to an ESXi host. Rather than accessing blocks from a local disk, the I/O operations are carried out over a network using a block access protocol. In the case of iSCSI, remote blocks are accessed by encapsulating SCSI commands & data into TCP/IP packets. Support for iSCSI was introduced in ESX 3.0 back in 2006.

NFS: NFS (Network File System) presents file devices over a network to an ESXi host for mounting. The NFS server/array makes its local filesystems available to ESXi hosts. The ESXi hosts access the metadata and files on the NFS array/server using an RPC-based protocol. VMware currently implements NFS version 3 over TCP/IP. VMware introduced support for NFS in ESX 3.0 in 2006.

Fibre Channel: Fibre Channel presents block devices, like iSCSI. Again, the I/O operations are carried out over a network using a block access protocol. In FC, remote blocks are accessed by encapsulating SCSI commands & data into Fibre Channel frames. One tends to see FC deployed in the majority of mission-critical environments, and it is the only one of these four protocols that has been supported on ESX since the beginning.

FCoE: Fibre Channel over Ethernet also presents block devices, with I/O operations carried out over a network using a block access protocol. In this protocol, the SCSI commands and data are encapsulated into Ethernet frames. FCoE has many of the same characteristics as FC, except that the transport is Ethernet. VMware introduced support for hardware FCoE in vSphere 4.x and software FCoE in vSphere 5.0, back in 2011.

Implementation Options

iSCSI: Either 1. a NIC with iSCSI capabilities using the software iSCSI initiator, accessed via a VMkernel port (vmknic); or 2. a dependent hardware iSCSI initiator; or 3. an independent hardware iSCSI initiator. (A command-line sketch of the software iSCSI option follows this section.)

NFS: Standard NIC accessed using a VMkernel port (vmknic).

Fibre Channel: Requires a dedicated Host Bus Adapter (HBA), typically two for redundancy & multipathing.

FCoE: Either 1. a hardware Converged Network Adapter (CNA); or 2. a NIC with FCoE capabilities using the software FCoE initiator.
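A minimal sketch of the software iSCSI option from the command line, assuming ESXi 5.x, a software iSCSI adapter that appears as vmhba33 and a target portal at 192.168.1.10 (adapter name and address are placeholders):

~ # esxcli iscsi software set --enabled=true
~ # esxcli iscsi software get
~ # esxcli iscsi adapter list
~ # esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.1.10:3260
~ # esxcli storage core adapter rescan --adapter=vmhba33

The first two commands enable the software iSCSI initiator and confirm it is on, the third shows the adapter name it was assigned, the sendtarget command adds a dynamic discovery address, and the rescan makes any discovered LUNs available for datastores or RDMs.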

Speed/Performance considerations

iSCSI: iSCSI can run over a 1Gb or a 10Gb TCP/IP network. Multiple connections can be multiplexed into a single session, established between the initiator and target. VMware supports jumbo frames for iSCSI traffic, which can improve performance; jumbo frames send payloads larger than 1500 bytes. Support for jumbo frames with IP storage was introduced in ESX 4, but not on all initiators (KB 1007654 & KB 1009473). iSCSI can introduce overhead on a host’s CPU (encapsulating SCSI data into TCP/IP packets). (A jumbo frame configuration sketch follows this section.)

NFS: NFS can run over 1Gb or 10Gb TCP/IP. NFS also supports UDP, but VMware’s implementation does not and requires TCP. VMware supports jumbo frames for NFS traffic, which can improve performance in certain situations. Support for jumbo frames with IP storage was introduced in ESX 4. NFS can introduce overhead on a host’s CPU (encapsulating file I/O into TCP/IP packets).

Fibre Channel: Fibre Channel can run at 1Gb/2Gb/4Gb/8Gb & 16Gb, but 16Gb HBAs must be throttled to run at 8Gb in vSphere 5.0. Buffer-to-Buffer credits & End-to-End credits throttle throughput to ensure a lossless network. This protocol typically affects a host’s CPU the least, as the HBAs (required for FC) handle most of the processing (encapsulation of SCSI data into FC frames).

FCoE: This protocol requires 10Gb Ethernet. The point to note with FCoE is that there is no IP encapsulation of the data as there is with NFS & iSCSI, which reduces some of the overhead/latency; FCoE is SCSI over Ethernet, not over IP. This protocol also requires jumbo frames, since FC payloads are 2.2K in size and cannot be fragmented.
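A minimal sketch of enabling jumbo frames for IP storage, assuming ESXi 5.x, a standard vSwitch named vSwitch1 and a VMkernel interface vmk1 dedicated to iSCSI or NFS traffic (names and addresses are placeholders); the physical switch ports and the array interfaces must also be configured for the larger MTU:

~ # esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
~ # esxcli network ip interface set --interface-name=vmk1 --mtu=9000
~ # vmkping -d -s 8972 192.168.1.10

The vmkping with the don’t-fragment flag and an 8972-byte payload (9000 minus IP/ICMP headers) verifies that jumbo frames pass end to end to the storage target.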

 


 


 


Load Balancing

iSCSI: VMware’s Pluggable Storage Architecture (PSA) provides a Round Robin Path Selection Policy which will distribute load across multiple paths to an iSCSI target. Better distribution of load with PSP_RR is achieved when multiple LUNs are accessed concurrently. (A sketch follows this section.)

NFS: There is no load balancing per se in the current implementation of NFS, as there is only a single session. Aggregate bandwidth can be configured by creating multiple paths to the NAS array and accessing some datastores via one path and other datastores via another.

Fibre Channel: VMware’s PSA provides a Round Robin Path Selection Policy which will distribute load across multiple paths to an FC target. Better distribution of load with PSP_RR is achieved when multiple LUNs are accessed concurrently.

FCoE: VMware’s PSA provides a Round Robin Path Selection Policy which will distribute load across multiple paths to an FCoE target. Better distribution of load with PSP_RR is achieved when multiple LUNs are accessed concurrently.
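A minimal sketch of switching a block device to the Round Robin PSP, using the device’s NAA identifier (the one shown is a placeholder):

~ # esxcli storage nmp device list
~ # esxcli storage nmp device set --device=naa.6006048c7bc7febbf4db26ae0c3263cb --psp=VMW_PSP_RR

The first command shows which PSP currently claims each device; the second changes a single device to Round Robin. The default PSP can also be changed per SATP with esxcli storage nmp satp set, but always check your array vendor’s recommendation first.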

Resilience

iSCSI: VMware’s PSA implements failover via its Storage Array Type Plugin (SATP) for all supported iSCSI arrays. The preferred method for software iSCSI is to implement iSCSI port binding, but failover can also be achieved by adding multiple targets on different subnets mapped to the iSCSI initiator.

NFS: NIC teaming can be configured so that if one interface fails, another can take its place. However, this relies on detecting a network failure and may not be able to handle error conditions occurring on the NFS array/server side.

Fibre Channel: VMware’s PSA implements failover via its Storage Array Type Plugin (SATP) for all supported FC arrays.

FCoE: VMware’s PSA implements failover via its Storage Array Type Plugin (SATP) for all supported FCoE arrays.

Error checking

iSCSI: iSCSI uses TCP, which resends dropped packets.

NFS: NFS uses TCP, which resends dropped packets.

Fibre Channel: Fibre Channel is implemented as a lossless network. This is achieved by throttling throughput at times of congestion using B2B and E2E credits.

FCoE: Fibre Channel over Ethernet requires a lossless network. This is achieved by the implementation of a pause frame mechanism at times of congestion.

Security

iSCSI: iSCSI implements the Challenge Handshake Authentication Protocol (CHAP) to ensure initiators and targets trust each other. VLANs or private networks are highly recommended to isolate the iSCSI traffic from other traffic types. (A VLAN tagging sketch follows this section.)

NFS: VLANs or private networks are highly recommended to isolate the NFS traffic from other traffic types.

Fibre Channel: Some FC switches support the concept of a VSAN to isolate parts of the storage infrastructure. VSANs are conceptually similar to VLANs. Zoning between hosts and FC targets also offers a degree of isolation.

FCoE: Some FCoE switches support the concept of a VSAN to isolate parts of the storage infrastructure. Zoning between hosts and FCoE targets also offers a degree of isolation.
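As one example of the VLAN recommendation above, the VMkernel port group carrying iSCSI or NFS traffic can be tagged from the command line; a minimal sketch assuming a standard vSwitch port group named iSCSI-1 and VLAN 100 (both placeholders), with the physical switch ports trunked accordingly:

~ # esxcli network vswitch standard portgroup set --portgroup-name=iSCSI-1 --vlan-id=100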


 


 


 

VAAI Primitives

iSCSI: Although VAAI primitives may differ from array to array, iSCSI devices can benefit from the full complement of block primitives:
·          Atomic Test/Set
·          Full Copy
·          Block Zero
·          Thin Provisioning
·          UNMAP
These primitives are built into ESXi and require no additional software to be installed on the host.

NFS: Again, these vary from array to array. The VAAI primitives available on NFS devices are:
·          Full Copy (but not with Storage vMotion, only with cold migration)
·          Pre-allocate space (WRITE_ZEROs)
·          Clone offload using native snapshots
Note that VAAI NAS requires a plug-in from the storage array vendor.

Fibre Channel: As with iSCSI, FC devices can benefit from the full complement of block primitives listed above (Atomic Test/Set, Full Copy, Block Zero, Thin Provisioning, UNMAP), subject to array support. These primitives are built into ESXi and require no additional software to be installed on the host.

FCoE: As with iSCSI and FC, FCoE devices can benefit from the full complement of block primitives listed above, subject to array support. These primitives are built into ESXi and require no additional software to be installed on the host.

(A sketch for checking a device’s VAAI status follows this section.)
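Whether a given block device actually supports these primitives can be checked per device; a minimal sketch, using the NAA identifier from the vmkfstools example later on this page as a placeholder:

~ # esxcli storage core device vaai status get --device=naa.6006048c7bc7febbf4db26ae0c3263cb

The output lists the ATS, Clone, Zero and Delete status reported for that device.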

ESXi Boot from SAN

iSCSI: Yes. NFS: No. Fibre Channel: Yes. FCoE: SW FCoE – No; HW FCoE (CNA) – Yes.

RDM Support

iSCSI: Yes. NFS: No. Fibre Channel: Yes. FCoE: Yes.

Maximum Device Size

iSCSI: 64TB. NFS: Refer to the NAS array or NAS server vendor for the maximum supported datastore size – the theoretical size is much larger than 64TB, but it requires the NAS vendor to support it. Fibre Channel: 64TB. FCoE: 64TB.

Maximum number of devices

iSCSI: 256. NFS: Default 8, maximum 256. Fibre Channel: 256. FCoE: 256. (A sketch for raising the NFS default follows this section.)
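The NFS default of 8 is governed by the NFS.MaxVolumes advanced setting; a minimal sketch of raising it, assuming ESXi 5.x (VMware guidance generally recommends increasing the TCP/IP heap settings, Net.TcpipHeapSize and Net.TcpipHeapMax, at the same time):

~ # esxcli system settings advanced list --option=/NFS/MaxVolumes
~ # esxcli system settings advanced set --option=/NFS/MaxVolumes --int-value=64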

Protocol direct to VM

iSCSI: Yes, via an in-guest iSCSI initiator.

NFS: Yes, via an in-guest NFS client.

Fibre Channel: No, but FC devices can be mapped directly to a VM with NPIV. This still requires an RDM mapping to the VM first, and the hardware (switches, HBAs) must support NPIV.

FCoE: No.

Storage vMotion Support

iSCSI: Yes. NFS: Yes. Fibre Channel: Yes. FCoE: Yes.

Storage DRS Support

iSCSI: Yes. NFS: Yes. Fibre Channel: Yes. FCoE: Yes.

Storage I/O Control Support

iSCSI: Yes, since vSphere 4.1. NFS: Yes, since vSphere 5.0. Fibre Channel: Yes, since vSphere 4.1. FCoE: Yes, since vSphere 4.1.

Virtualized MSCS Support

iSCSI: No. VMware does not support MSCS nodes built on VMs residing on iSCSI storage. However, the use of software iSCSI initiators within guest operating systems configured with MSCS, in any configuration supported by Microsoft, is transparent to ESXi hosts, and there is no need for explicit support statements from VMware.

NFS: No. VMware does not support MSCS nodes built on VMs residing on NFS storage.

Fibre Channel: Yes. VMware supports MSCS nodes built on VMs residing on FC storage.

FCoE: No. VMware does not support MSCS nodes built on VMs residing on FCoE storage.


 


 


Ease of configuration

iSCSI: Medium – setting up the iSCSI initiator requires a little know-how: you simply need the FQDN or IP address of the target, and some configuration for initiator mapping and LUN presentation is needed on the array side. Once the target is discovered through a scan of the SAN, LUNs are available for datastores or RDMs.

NFS: Easy – you just need the IP or FQDN of the target and the mount point. Datastores appear immediately once the host has been granted access from the NFS array/server side. (A mount sketch follows this section.)

Fibre Channel: Difficult – involves zoning at the FC switch level and LUN masking at the array level once the zoning is complete. More complex to configure than IP storage. Once the target is discovered through a scan of the SAN, LUNs are available for datastores or RDMs.

FCoE: Difficult – involves zoning at the FCoE switch level and LUN masking at the array level once the zoning is complete. More complex to configure than IP storage. Once the target is discovered through a scan of the SAN, LUNs are available for datastores or RDMs.
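A minimal sketch of the NFS case, assuming an NFS server at 192.168.1.20 exporting /vol/datastore1 (address, export and datastore name are placeholders):

~ # esxcli storage nfs add --host=192.168.1.20 --share=/vol/datastore1 --volume-name=nfs_datastore1
~ # esxcli storage nfs list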

Advantages

iSCSI: No additional hardware is necessary – it can use existing networking components and the iSCSI driver from VMware, so it is cheap to implement. A well-known and well-understood protocol that is quite mature at this stage. Admins with network skills should be able to implement it, and it can be troubleshot with generic network tools such as Wireshark.

NFS: No additional hardware is necessary – it can use existing networking components, so it is cheap to implement. A well-known and well-understood protocol that is also very mature. Admins with network skills should be able to implement it, and it can be troubleshot with generic network tools such as Wireshark.

Fibre Channel: A well-known and well-understood protocol. Very mature and trusted, and found in the majority of mission-critical environments.

FCoE: Enables converged networking, allowing the consolidation of network and storage traffic onto the same network via a CNA (Converged Network Adapter). Using DCBx (Data Center Bridging Exchange protocol), FCoE has been made lossless even though it runs over Ethernet. DCBx does other things as well, such as enabling different traffic classes to run on the same network, but that is beyond the scope of this discussion.

Disadvantages

iSCSI: Inability to route when iSCSI port binding is implemented. Possible security issues, as there is no built-in encryption, so care must be taken to isolate the traffic (e.g. VLANs). Software iSCSI can cause additional CPU overhead on the ESX host. TCP can introduce latency for iSCSI.

NFS: Since there is only a single session per connection, configuring for maximum bandwidth across multiple paths needs some care and attention. No PSA multipathing. The same security concerns as iSCSI, since everything is transferred in clear text, so care must be taken to isolate the traffic (e.g. VLANs). NFS is still version 3, which does not have the multipathing or security features of NFS v4 or NFS v4.1. NFS can cause additional CPU overhead on the ESX host. TCP can introduce latency for NFS.

Fibre Channel: Still only runs at 8Gb, which is slower than other networks (16Gb HBAs are throttled to run at 8Gb in vSphere 5.0). Needs dedicated HBAs, FC switches and an FC-capable storage array, which makes an FC implementation rather more expensive. Additional management overhead (e.g. switch zoning) is needed. Could prove harder to troubleshoot compared to other protocols.

FCoE: Rather new, and not quite as mature as the other protocols at this time. Requires a 10Gb lossless network infrastructure, which can be expensive. Cannot route between initiators and targets using native IP routing – instead it has to use protocols such as FIP (FCoE Initialization Protocol). Could prove complex to troubleshoot/isolate issues, with network and storage traffic using the same pipe.


Note 1 – I've deliberately skipped AoE (ATA-over-Ethernet) as we have not yet seen significant take-up of this protocol at this time. Should this protocol gain more exposure, I’ll revisit this article.

Note 2 – As I mentioned earlier, I’ve deliberately avoided getting into a performance comparison. This has been covered in other papers. Here are some VMware whitepapers which cover storage performance comparison:

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

iSCSI Advanced Settings

I've had a few questions recently about some of the iSCSI configuration parameters found in the Advanced Settings.

[Screenshot: iSCSI initiator Advanced Settings]

When iSCSI establishes a session between initiator and target, it has to log in to the target. It will try to log in for the period defined by LoginTimeout; if that is exceeded, the login fails.

When iSCSI finishes a session between initiator and target, it has to log out of the target. It will try to log out for the period defined by LogoutTimeout; if that is exceeded, the logout fails.

The other options relate to how we determine a DEAD PATH:

  1. RecoveryTimeout is used to determine how long we should wait before placing a path into a DEAD state when the path was active but no PDUs are now being sent or received. Realistically it’s a bit longer than that, as other considerations are taken into account as well.
  2. The noop settings are used to determine whether a path is dead when it is not the active path. iSCSI passively discovers whether such a path is dead using the noop mechanism: the test is carried out on non-active paths every NoopInterval, and if a response isn’t received within NoopTimeout, the path is marked as DEAD.

Unless you want faster failover times, you will probably never need to edit these. But be careful: if you have paths failing too quickly and then recovering, LUNs/devices could move unnecessarily between array controller targets, which could lead to path thrashing. (A command-line sketch for viewing and changing these values follows.)
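For reference, the same values can be inspected and changed from the command line; a minimal sketch, assuming ESXi 5.x and a software iSCSI adapter named vmhba33 (a placeholder):

~ # esxcli iscsi adapter param get --adapter=vmhba33
~ # esxcli iscsi adapter param set --adapter=vmhba33 --key=RecoveryTimeout --value=10

The first command lists the current and default values for the adapter’s parameters; the second shows the general form of changing one of them. As noted above, leave these alone unless you have a specific reason to tune failover behaviour.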

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

 

Nice vmkfstools feature for Extents

Troubleshooting issues with extents has never been easy. If one extent member went offline, it was difficult to find which physical LUN corresponded to the extent that had gone offline. vSphere 5.0 introduces the ability, via vmkfstools, to check which extent of a volume is offline. For example, here is a VMFS-5 volume I created which spans two iSCSI LUNs:

~ # vmkfstools -Ph /vmfs/volumes/iscsi_datastore/
VMFS-5.54 file system spanning 2 partitions.
File system label (if any): iscsi_datastore
Mode: public
Capacity 17.5 GB, 16.9 GB available, file block size 8 MB
UUID: 4d810817-2d191ddd-0b4e-0050561902c9
Partitions spanned (on “lvm”):
        naa.6006048c7bc7febbf4db26ae0c3263cb:1
        naa.6006048c13e056de156e0f6d8d98cee2:1
Is Native Snapshot Capable: NO
~ #

Now if something happened on the array side to cause one of the LUNs to go offline, previous versions of vmkfstools would not be able to identify which LUN/extent was the problem, and if investigating from the array side, you would have to look at all the LUNs making up the volume and try to figure out which one was problematic.  Now, in 5.0, we get notification about which LUN is offline:

~ # vmkfstools -Ph /vmfs/volumes/iscsi_datastore/
VMFS-5.54 file system spanning 2 partitions.
File system label (if any): iscsi_datastore
Mode: public
Capacity 17.5 GB, 7.2 GB available, file block size 8 MB
UUID: 4d810817-2d191ddd-0b4e-0050561902c9
Partitions spanned (on “lvm”):
        naa.6006048c7bc7febbf4db26ae0c3263cb:1
        (device naa.6006048c13e056de156e0f6d8d98cee2:1 might be offline)
        (One or more partitions spanned by this volume may be offline)
Is Native Snapshot Capable: NO
~ #

In this case, we can see the NAA id (SCSI identifier) of the LUN which has the problem and investigate why the LUN is offline from the array side. A nice feature I’m sure you will agree.
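Once the NAA id is known, the state of the device and its paths can also be checked from the host before involving the array side; a minimal sketch using the NAA id from the output above:

~ # esxcli storage core device list --device=naa.6006048c13e056de156e0f6d8d98cee2
~ # esxcli storage core path list --device=naa.6006048c13e056de156e0f6d8d98cee2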

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage


Why can you not use NIC Teaming with iSCSI Binding?

I had an interesting discussion recently with a number of my colleagues around the requirement to place additional uplinks (vmnics) on a virtual switch that are not used by iSCSI binding into the 'Unused' state. One would think it might be useful to team these uplinks, placing some in standby mode, so that in the case of a failure on the active link the iSCSI traffic could move to the standby uplink. But the requirement is that the other vmnics must be put into the 'Unused' state and not teamed. Why?

This requirement prevents the VMkernel port from floating across uplinks in the case of a failure. The reason is that if the physical NIC loses connectivity, it should be treated as a storage path failure, not a network failure. We want the Pluggable Storage Architecture (PSA) in the VMkernel to handle this event and fail over to an alternate path to stay connected to the storage.

This approach enables customers to consider storage resiliency in terms of multiple paths to the storage, rather than basing it on the number of networks available to a single storage path. (A configuration sketch follows.)
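A minimal sketch of the required setup, assuming ESXi 5.x, a software iSCSI adapter vmhba33, two VMkernel ports vmk1 and vmk2 on port groups iSCSI-1 and iSCSI-2, and uplinks vmnic1 and vmnic2 (all names are placeholders). Each port group is overridden to a single active uplink, and each vmknic is then bound to the iSCSI adapter:

~ # esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-1 --active-uplinks=vmnic1
~ # esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-2 --active-uplinks=vmnic2
~ # esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
~ # esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2
~ # esxcli iscsi networkportal list --adapter=vmhba33

Before binding the vmknics, verify in the vSphere Client that the non-active uplink on each port group shows as Unused rather than Standby; the binding may be rejected or flagged as non-compliant if the port group's teaming policy still lists more than one active or standby uplink.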

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

DELL’s Multipath Extension Module for EqualLogic now supports vSphere 5.0

DELL recently released their new Multipath Extension Module (MEM) for the EqualLogic PS Series of storage arrays. This updated MEM now supports vSphere 5.0.

I guess I should try to explain what a MEM is before going any further. VMware implements a Pluggable Storage Architecture (PSA) model in the VMkernel. This means that storage array vendors can write their own multipathing modules to plugin to the VMkernel I/O path. These plugins can co-exist alongside VMware’s own default set of modules. There are different modules for different tasks in the PSA. For instance, the specific details of handling path failover for a given storage array are delegated to the Storage Array Type Plugin (SATP). SATP is associated with paths. The specific details for determining which physical path is used to issue an I/O request (load balancing) to a storage device are handled by a Path Selection Plugin (PSP). PSP is associated with logical devices. The SATP & PSP are both MEMs (Multipath Extension Modules).

DELL’s MEM is actually a PSP. This means that it will take care of load balancing of I/O requests across all paths to the PS series arrays. DELL created a good Technical Report (TR) on their MEM which can be found here.

I spoke with Andrew McDaniel, one of DELL’s Lead Architects for VMware based in Ireland, and he was able to supply me with some additional information about this MEM. Firstly, since the MEM is essentially a PSP, devices from the EqualLogic array continue to use the Native Multipath Plugin (NMP) from VMware. This handles basic tasks like loading and unloading of MEMs, path discovery and removal, device bandwidth sharing between VMs, etc.

Any ESXi host with the DELL MEM installed will now have an additional Path Selection Policy. VMware ships ESXi with 3 default PSPs, and the DELL MEM makes up the fourth one. The list of installed PSPs can be shown via the command: esxcli storage nmp psp list

[Screenshot: output of esxcli storage nmp psp list]

As you can see, the three standard PSPs are shown (VMW_PSP_MRU, VMW_PSP_RR & VMW_PSP_FIXED). The additional PSP is from DELL – DELL_PSP_EQL_ROUTED.

You might ask why you would need this additional PSP on top of the default ones from VMware. Well, the VMware ones are not optimized on a per-array basis. Yes, they will work just fine, but they do not understand the behaviour of each of the different back-end arrays. Their behaviour is therefore what could be described as generic.

DELL’s MEM module has been developed by DELL’s own engineering team who understand the intricacies of the EqualLogic array and can therefore design their MEM to perform optimally when it comes to load balancing/path selection.

If we take a look at one of the LUNs from the EqualLogic array using the esxcli storage nmp device list command, we can see which PSP and SATP are associated with that device and its paths:

[Screenshot: output of esxcli storage nmp device list]

Here we can see both the SATP and the PSP that the device is using, as well as the number of working paths. The SATP VMW_SATP_EQL is a VMware default one for EqualLogic arrays. And of course the PSP is DELL_PSP_EQL_ROUTED. The ‘ROUTED’ part refers to DELL’s MEM being able to intelligently route I/O requests to the array path best suited to handle the request. (A sketch for setting this PSP as the default for EqualLogic devices follows.)
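If you want the DELL PSP to be picked up automatically for all EqualLogic devices, the default PSP for that SATP can be changed; a minimal sketch using the plugin names shown above (the MEM's setup script may already take care of this for you):

~ # esxcli storage nmp satp set --satp=VMW_SATP_EQL --default-psp=DELL_PSP_EQL_ROUTED
~ # esxcli storage nmp device list

The second command confirms which PSP is now claiming the EqualLogic devices.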

What are those ‘does not support device configuration’ messages? These are nothing to worry about. Some MEMs support configuration settings like preferred path, etc. These messages simply mean there are no configuration settings for this MEM.

The other nice part of DELL’s MEM is that it includes a setup script which will prompt for all relevant information, including vSwitch, uplinks and IP addresses, and correctly set up a vSwitch for iSCSI & heartbeating, saving you a lot of time & effort. Nice job DELL!

If you are a DELL EqualLogic customer, you should definitely check this out. Simply log in to https://support.equallogic.com and go to the Downloads > VMware Integration section. You will need a customer login to do this.

You should also be aware that vSphere 5.0 had an issue with slow boot times when iSCSI is configured. To learn more, refer to this blog post and referenced KB article.

Get notified of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage