Cloning virtual machines is an area where VAAI (vSphere Storage APIs for Array Integration) can provide significant advantages. Flash storage arrays already provide excellent I/O performance, so we wanted to see what difference VAAI makes in virtual machine cloning operations on all-flash arrays.

The following components were used for testing VAAI performance on an all Flash storage array:

  1. Dell R910 server with 40 cores and 256 GB RAM
  2. Pure FA-400 Flash Array with two shelves containing 44 x 238 GB flash drives and 8.2 TB of usable capacity
  3. CentOS Linux virtual machine with 4 vCPUs, 8 GB RAM, a 16 GB OS/boot disk and a 500 GB data disk, all on the Pure Storage array
  4. Software iSCSI on dedicated 10 Gbps ports

Test Virtual Machine:

The virtual machine used for testing was a generic CentOS Linux system with a second virtual data disk of 500 GB capacity. To exercise the cloning process fully, we filled this data disk with random data. Random data ensures that what is being copied is not repetitive in any way and cannot easily be compressed or de-duplicated.

 Preparing the Data Disk:

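The following “dd” command was used on Linux to create a large file of roughly 460 GB filled with random data. With a 1 MiB block size, a count of 460000 gives a file of about that size; the count of 4600000 in the original listing would request ten times as much and appears to be a typo.

dd if=/dev/urandom of=/thinprov/500gb_file bs=1M count=460000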
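
As a quick sanity check on the block count passed to dd, the arithmetic can be sketched in shell. The helper names below are hypothetical (they are not part of the original study), and the sketch assumes binary units (1 GiB = 1024 MiB) together with bs=1M.

```shell
#!/bin/sh
# Hypothetical helpers, not from the original study.

# Number of 1 MiB blocks dd must write to produce a file of the given size in GiB.
dd_count_for_gib() {
    echo $(( $1 * 1024 ))
}

# Size in GiB (truncated to a whole number) of a file written with bs=1M
# and the given block count.
gib_for_dd_count() {
    echo $(( $1 / 1024 ))
}
```

For example, `dd_count_for_gib 460` gives 471040, and `gib_for_dd_count 460000` gives 449, so a count of 460000 corresponds to roughly 449 GiB on disk.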

The space used on the data disk is shown below; the disk contains only the random data file generated with the dd command.

[root@linux01 thinprov]# df
Filesystem                      1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00  10220744   2710700   6982480  28% /
/dev/sda1                          101086     20195     75672  22% /boot
tmpfs                             4087224         0   4087224   0% /dev/shm
/dev/sdb1                       516054864 469853428  19987376  96% /thinprov

 Tuning for VAAI and best performance:

VAAI can be enabled or disabled using the following settings (1 enables, 0 disables):

esxcli system settings advanced set --int-value 1 -o /DataMover/HardwareAcceleratedMove

esxcli system settings advanced set --int-value 1 -o /DataMover/HardwareAcceleratedInit

esxcli system settings advanced set --int-value 1 -o /VMFS3/HardwareAcceleratedLocking

esxcli system settings advanced set --int-value 1 -o /VMFS3/EnableBlockDelete
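
Since each test phase depends on these settings being in the intended state, it can help to read them back before starting a clone. The following is a minimal sketch, assuming it runs in an ESXi shell where esxcli and sed are available and that the `list` output contains a line of the form `Int Value: <n>`. The function name `show_vaai_settings` is illustrative, not part of esxcli.

```shell
#!/bin/sh
# Sketch: print the current integer value of each VAAI-related advanced option.
# Assumption: "esxcli system settings advanced list -o <option>" prints a line
# such as "   Int Value: 1" for integer options.
VAAI_OPTIONS="/DataMover/HardwareAcceleratedMove
/DataMover/HardwareAcceleratedInit
/VMFS3/HardwareAcceleratedLocking
/VMFS3/EnableBlockDelete"

show_vaai_settings() {
    for opt in $VAAI_OPTIONS; do
        # Keep only the current value; the anchored pattern skips the
        # "Default Int Value" line.
        val=$(esxcli system settings advanced list -o "$opt" |
              sed -n 's/^ *Int Value: *//p' | head -n 1)
        echo "$opt = $val"
    done
}
```

Calling `show_vaai_settings` prints one `option = value` line per setting, so a value of 0 or 1 can be checked at a glance before each phase.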

Adjust the maximum hardware transfer size for better copy performance:

esxcli system settings advanced set --int-value 16384 --option /DataMover/MaxHWTransferSize

For larger I/O sizes, experiments have found that setting the round-robin IOPS limit to 1 has a positive effect on latency:

esxcli storage nmp psp roundrobin deviceconfig set -d <device> -I 1 -t iops

On ESXi 5.5, DSNRO (Disk.SchedNumReqOutstanding) can be set on a per-LUN basis:

esxcli storage core device set -d <device> -O 256
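
When several LUNs are involved, the two per-device commands above can be wrapped in a small loop. This is only a sketch: the function name `tune_devices` and the `DRY_RUN` variable are illustrative, device identifiers are supplied by the caller, and it assumes the devices already use the round-robin path selection policy.

```shell
#!/bin/sh
# Sketch: apply the per-device round-robin and DSNRO settings to each device
# identifier given as an argument. Set DRY_RUN=1 to print the commands
# instead of running them. Assumes identifiers contain no whitespace.
tune_devices() {
    for dev in "$@"; do
        for cmd in \
            "esxcli storage nmp psp roundrobin deviceconfig set -d $dev -I 1 -t iops" \
            "esxcli storage core device set -d $dev -O 256"
        do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "$cmd"
            else
                # Stop at the first failing command so errors are not masked.
                $cmd || return 1
            fi
        done
    done
}
```

Running it first as `DRY_RUN=1 tune_devices <device> ...` shows exactly which commands would be issued before anything is changed on the host.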

Set Disk.SchedQuantum to its maximum (64):

esxcli system settings advanced set --int-value 64 -o /Disk/SchedQuantum

Phase 1: Cloning with VAAI disabled:

For the first phase of the study VAAI was turned off and the settings validated. The cloning process was initiated for the Linux virtual machine and some of the key metrics were observed and captured at the storage array and in vCenter performance charts.

The cloning process was carefully monitored and the time for the cloning operation was observed to be 63 minutes.

VAAI1

 

The blue area in the chart between 2:06 and 3:09 PM represents the cloning operation. During this period latency spiked above 2 ms, IOPS rose to about 5000 and bandwidth utilization reached around 420 MBPS.

VAAI2

Phase 2: Cloning with VAAI Enabled:

For the second phase of the study VAAI was turned on and the settings validated. The cloning process was initiated for the Linux virtual machine and some of the key metrics were observed and captured at the storage array and in vCenter performance charts.

The cloning process was carefully monitored and the time for the cloning operation was observed to be 19 minutes.

VAAI3

 

The blue area in the chart between 3:54 and 4:13 PM represents the cloning operation. During this period there was only a minimal rise in latency (0.5 ms), IOPS (3000) and bandwidth utilization (around 10 MBPS).

VAAI4

The host-level performance chart for network usage does not even reflect the 10 MBPS average utilization seen during the cloning operation: network utilization at the vSphere host shows no increase, unlike the non-VAAI operation. This shows that the copy activity takes place within the storage array, with no impact on the vSphere host.

Effect of VAAI on the cloning operation:

The observations highlight the large impact that VAAI has on a big copy operation such as a VM clone. A clone of a VM with a 500 GB data disk filled with random data benefits significantly from VAAI-compliant storage, as shown in the following table.

VAAI5
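
The two clone durations reported above (63 minutes with VAAI disabled, 19 minutes with it enabled) can be reduced to a speedup factor and a percentage of time saved. The sketch below uses only integer shell arithmetic, so results are truncated rather than rounded; the function names are illustrative.

```shell
#!/bin/sh
# Sketch: derive speedup and time saved from two clone durations in minutes.

# Speedup factor to one decimal place (truncated), e.g. 63 and 19 -> 3.3
clone_speedup() {
    without=$1   # minutes with VAAI disabled
    with=$2      # minutes with VAAI enabled
    tenths=$(( without * 10 / with ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

# Whole-number percentage of clone time saved (truncated).
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}
```

With the measured values, `clone_speedup 63 19` prints 3.3 and `percent_saved 63 19` prints 69, that is, the VAAI clone finished a little over three times faster.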

 

VAAI-capable arrays such as the Pure Storage array used in this study dramatically improve write-intensive operations such as cloning by reducing the duration of the impact, latency, IOPS and bandwidth consumed. This study shows that even all-flash arrays, which have fast disks with very high IOPS, can benefit significantly from VAAI for cloning.