VCF Storage (vSAN) Technical

Network Traffic Separation in vSAN Storage Clusters for VMware Cloud Foundation 9.0

Since the introduction of vSAN storage clusters (previously known as “vSAN Max”), we’ve seen a lot of enthusiasm from customers looking to deploy them in their environments. Whether you are starting small or deploying a large environment, the appeal is clear. vSAN storage clusters are a great way to provide centralized shared storage for vSphere clusters. Powered by the vSAN Express Storage Architecture (ESA), they let you manage and scale your storage independently of your compute resources, and they are an easy way to consume the complimentary vSAN capacity licensing included with VMware Cloud Foundation (VCF).

vSAN storage clusters in VCF 9.0 add a new configuration option that improves storage performance and provides better network traffic isolation and security. The capability also brings all-new levels of flexibility to the design of your VCF environment.

Network Traffic Separation for vSAN Storage Clusters

vSAN has historically transmitted all vSAN traffic over a single VMkernel interface on each host. This made sense for an aggregated vSAN HCI deployment, because the datastore serving the VMs was on the same cluster as the VMs. The vSAN traffic flowing through this VMkernel port included:

  • Guest VM I/O. This is I/O sent to and from the guest VMs as they access the vSAN datastore. This is sometimes referred to as “front-end” or “north/south” I/O.
  • vSAN I/O. This is I/O sent across the cluster to ensure data is stored in a resilient way, and also includes reads and writes as data is rebalanced, or repaired after a failure. This is sometimes referred to as “back-end,” “intra-cluster” or “east/west” I/O, and tends to consume more networking resources than guest VM I/O.

In a disaggregated environment using vSAN storage clusters, the VM instances run on vSphere hosts that mount the datastore of the vSAN storage cluster. Because those hosts may reside in other racks or rooms, separating these traffic types onto their own VMkernel ports better aligns with these topology considerations.

vSAN for VCF 9.0 introduces the ability to separate these traffic types when deploying vSAN storage clusters. Each host of the vSAN storage cluster can have one VMkernel port tagged for “vSAN storage cluster client” traffic, and one tagged for “vSAN” traffic.

Figure. Network traffic separation for VMkernel traffic when using vSAN storage clusters.

When deploying a vSAN storage cluster, using this new feature is as simple as creating one VMkernel port tagged for “vSAN storage cluster client” traffic and another VMkernel port tagged for “vSAN” traffic. The former is used to transmit guest I/O to and from the vSphere clusters mounting the storage cluster’s datastore, while the latter is used by the storage cluster for back-end storage activities such as resilience, rebalancing, and repairs. The VMkernel port tagged for “vSAN storage cluster client” traffic can easily run on a different VLAN and subnet than the VMkernel port tagged for “vSAN” traffic.

Figure. VMkernel port tags available for a vSAN storage cluster.

The vSphere hosts in clusters that mount the datastore of the vSAN storage cluster only use one traffic type. These hosts should use a VMkernel port tagged as “vSAN,” not “vSAN storage cluster client.”
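
For illustration, here is a minimal pyVmomi sketch of how this tagging could be scripted, assuming the VMkernel ports already exist on each host of the storage cluster. The nicType string used for “vSAN storage cluster client” traffic is an assumption, so verify the exact value in the vSphere 9.0 API reference; most deployments will apply these tags through the vSphere Client or the VCF deployment workflows instead.

```python
# Minimal sketch: tag VMkernel ports on each host of a vSAN storage cluster.
# "vsan" is the documented nicType for vSAN traffic; the client nicType string
# below is hypothetical -- confirm the exact value before use.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

VSAN_BACKEND_NICTYPE = "vsan"                      # back-end vSAN traffic
VSAN_CLIENT_NICTYPE = "vsanStorageClusterClient"   # hypothetical; confirm before use

ctx = ssl._create_unverified_context()             # lab only; use valid certificates in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "vsan-storage-cluster-01")  # placeholder name
    view.Destroy()

    for host in cluster.host:
        nic_mgr = host.configManager.virtualNicManager
        # One VMkernel port per traffic type on each storage cluster host.
        nic_mgr.SelectVnicForNicType(VSAN_BACKEND_NICTYPE, "vmk1")
        nic_mgr.SelectVnicForNicType(VSAN_CLIENT_NICTYPE, "vmk2")

    # Hosts in the client vSphere clusters that mount the datastore would tag
    # only their vSAN VMkernel port (nicType "vsan"), as described above.
finally:
    Disconnect(si)
```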

Most of the vSAN ReadyNode profiles certified for vSAN storage clusters require 25Gb or faster networking for back-end vSAN traffic. The minimum bandwidth required for vSAN storage cluster client traffic is 10Gb.

Note that the network traffic separation feature is only available when you deploy a vSAN cluster using the “vSAN storage cluster” deployment option. It is not available when the vSAN HCI deployment option is selected.

Recommendation: For all vSAN traffic, use a NIC teaming policy of Active/Standby with “Route based on originating virtual port ID.” Other teaming options, such as active/active using Load Based Teaming (LBT), are not appropriate for vSAN storage traffic, even though active/active using LBT is currently the default for VCF. You can change the VMkernel ports tagged for vSAN to the Active/Standby arrangement described above without issue.
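
If you prefer to script this change, the following is a minimal pyVmomi sketch that applies an Active/Standby teaming policy with “Route based on originating virtual port ID” (the loadbalance_srcid policy value) to a distributed port group. The port group and uplink names are placeholders for your environment.

```python
# Minimal sketch: set Active/Standby teaming with "Route based on originating
# virtual port ID" on the distributed port groups backing the vSAN VMkernel ports.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def set_active_standby(si, pg_name, active=("Uplink 1",), standby=("Uplink 2",)):
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
    pg = next(p for p in view.view if p.name == pg_name)
    view.Destroy()

    teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
    teaming.policy = vim.StringPolicy(value="loadbalance_srcid")   # originating virtual port ID
    teaming.uplinkPortOrder = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy(
        activeUplinkPort=list(active), standbyUplinkPort=list(standby))

    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
        uplinkTeamingPolicy=teaming)
    spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        configVersion=pg.config.configVersion, defaultPortConfig=port_cfg)
    return pg.ReconfigureDVPortgroup_Task(spec)

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    # Placeholder port group names; alternating the active uplink keeps both
    # physical links in use while each traffic type stays Active/Standby.
    set_active_standby(si, "dpg-vsan-backend", active=("Uplink 1",), standby=("Uplink 2",))
    set_active_standby(si, "dpg-vsan-client", active=("Uplink 2",), standby=("Uplink 1",))
finally:
    Disconnect(si)
```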

Benefits

Separating vSAN traffic with a vSAN storage cluster provides several benefits to your infrastructure.

  • Better Performance. Network traffic separation offsets any performance impact from the physical characteristics of a disaggregated deployment. With network traffic separation, a vSAN storage cluster is on par with vSAN HCI clusters in terms of performance.
  • Improved network efficiency. Using dedicated VMkernel ports for back-end traffic ensures that this type of traffic stays within the top-of-rack (ToR) switches. It also ensures that only front-end guest I/O traffic traverses the spine, potentially reducing network traffic across the spine.
  • More prescriptive, cost-efficient designs. Designs can be easily tailored to accommodate requirements in a cost-efficient manner. For example, a pair of high-performance 25/100Gb switches can be used just for the back-end vSAN traffic while the existing ToR switching remains in place. This is a great way to drive down the cost of the initial investment.
  • Improved security. If your security requirements dictate that storage traffic be isolated from all other types of traffic, network traffic separation for vSAN storage clusters can achieve this. The two traffic types can be logically separated using VLANs, or physically separated using different physical switches.

Configuration Examples

Let’s explore a few ways this could be used in your deployment of a vSAN storage cluster. The option that is best for you will depend on your topology, physical constraints, and requirements.

Example 1: Isolation within Top-of-Rack Switches

In this example, the vSAN storage cluster is configured to use network traffic separation. While both traffic types use the same ToR switches, they will use different VLANs. Back-end vSAN traffic will remain within the ToR switches, and vSAN storage cluster client traffic will traverse the spine to the vSphere clusters mounting the vSAN datastore.

Figure. vSAN storage cluster traffic isolated within ToR switches.

Advantages: This simple configuration offers logical separation of vSAN traffic using VLANs, and can be achieved with a pair of ToR switches that meets the performance needs of the vSAN storage cluster. It generally offers more flexibility in your vSphere Distributed Switch (VDS) configuration, as all of the uplinks for the VDS go to the same ToR switches.

Trade-offs: You may run out of available network ports on the ToR switches, depending on the host count of the vSAN storage cluster, the number of uplinks used for each host, and the number of ports in each ToR switch. This configuration also does not keep back-end vSAN traffic physically separated from front-end vSAN traffic.
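
As a sketch of Example 1, the following pyVmomi snippet creates two port groups on the same VDS, separated only by VLAN ID. The VDS name, port group names, and VLAN IDs are placeholders for your environment.

```python
# Minimal sketch: two port groups on one VDS, logically separated by VLAN --
# one for back-end vSAN traffic, one for vSAN storage cluster client traffic.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def make_pg_spec(name, vlan_id):
    vlan = vim.dvs.VmwareDistributedVirtualSwitch.VlanIdSpec(vlanId=vlan_id, inherited=False)
    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(vlan=vlan)
    return vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        name=name, type="earlyBinding", numPorts=32, defaultPortConfig=port_cfg)

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.DistributedVirtualSwitch], True)
    vds = next(d for d in view.view if d.name == "vds-storage-rack1")  # placeholder VDS name
    view.Destroy()
    vds.AddDVPortgroup_Task([
        make_pg_spec("dpg-vsan-backend", 401),  # stays within the ToR switches
        make_pg_spec("dpg-vsan-client", 402),   # routed across the spine to client clusters
    ])
finally:
    Disconnect(si)
```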

Example 2: Isolation within Dedicated Switches

In this example, the vSAN storage cluster is also configured to use network traffic separation. Instead of all host uplinks connecting to the ToR switches, the host uplinks carrying the back-end vSAN traffic use a dedicated pair of switches, while the uplinks used for the vSAN storage cluster client network use the ToR switches to reach the vSphere clusters in other racks mounting the vSAN datastore.

Figure. vSAN storage cluster traffic isolated within dedicated switches.

Advantages: This configuration offloads the burden on existing ToR switches that may be limited in bandwidth or port count. It may also be a friendlier configuration for your network team, who may prefer dedicated switches used only for storage. This scenario can also be easier for the virtualization administrator, who may be allowed to manage and control this separate set of switches.

Trade-offs: Dedicated switches add expense and consume an additional 2U of rack space. This configuration also requires a separate VDS for the uplinks going to the dedicated switches, distinct from the VDS used for the uplinks to the ToR switches.
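
As a sketch of Example 2, the following pyVmomi snippet creates the additional VDS that carries only the back-end vSAN uplinks connected to the dedicated switch pair. The names and uplink count are placeholders, and hosts and physical NICs would be added to the new VDS afterwards.

```python
# Minimal sketch: a second VDS dedicated to the back-end vSAN uplinks,
# separate from the VDS whose uplinks go to the ToR switches.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    datacenter = content.rootFolder.childEntity[0]   # assumes a single datacenter

    cfg = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec()
    cfg.name = "vds-vsan-backend"                    # placeholder name
    cfg.uplinkPortPolicy = vim.DistributedVirtualSwitch.NameArrayUplinkPortPolicy(
        uplinkPortName=["Uplink 1", "Uplink 2"])     # one uplink per dedicated switch

    spec = vim.DistributedVirtualSwitch.CreateSpec(configSpec=cfg)
    datacenter.networkFolder.CreateDVS_Task(spec)    # hosts and pNICs are added afterwards
finally:
    Disconnect(si)
```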

Recommendation: When possible, keep all of the hosts that comprise a vSAN storage cluster in the same rack. In most cases, this will be no more than about sixteen 2U servers in a 42U rack. When paired with network traffic separation, this ensures that the back-end vSAN traffic will not need to traverse the network spine.

For more recommendations on the design and deployment of vSAN storage clusters, see the document “Design and Operational Guidance for vSAN Storage Clusters.”

Summary

vSAN storage clusters in VCF 9.0 are a great way to provide centralized shared storage for your VCF environment using storage you already own. Network traffic separation improves their flexibility, efficiency, and performance.

@vmpete

***

Ready to get hands-on with VMware Cloud Foundation 9.0?  Dive into the newest features in a live environment with Hands-on Labs that cover platform fundamentals, automation workflows, operational best practices, and the latest vSphere functionality for VCF 9.0.