Virtual SAN Automatic "Add Disk to Storage Mode" Fails (Part II)

In part 1 of this article, we looked at an interesting scenario in which, despite the Virtual SAN disk management setting being set to automatic, Virtual SAN would not form disk groups around the disks present in the hosts. Upon closer examination, we discovered that the server vendor had pre-imaged the drives with NTFS prior to shipping. When Virtual SAN detects existing partitions, it does not automatically erase them and replace them with its own. This protects against accidental drive erasure. Since NTFS partitions already existed on the drives, Virtual SAN was awaiting manual intervention. In the previous article, we walked through the manual steps to remove the existing partitions and allow Virtual SAN to build the disk groups. In this article, we will look at how to expedite the process through scripting.

Warning: Removing disk partitions will render data irretrievable. This script is intended for education purposes only. Please do not use directly in a production environment.

As promised in part 1 of this article, we will demonstrate today how to create your own utility to remove unlocked/unmounted partitions from disks located within your ESXi host. The aim of the script is to provide an example workflow for removing the partitions that insists upon user validation prior to each partition removal. This example workflow can be adapted and built upon to create your own production ready utility.

The script is broken up into 3 major sections:
Section 1: Boot Device Identification
Section 2: Disk Partition Listing
Section 3: Disk Partition Removal

################################################################################################
# Section 1: Boot Device Identification
################################################################################################
bootVolume=`esxcli system boot device get | egrep -i "Boot Filesystem UUID" | awk '{print $4}'`
bootDevice=`esxcfg-scsidevs -f | egrep -i $bootVolume | awk '{print $1}'`

printf "[Section 1: Boot Device]\n\n"
printf "Your boot volume is: $bootVolume\n"
printf "Your boot device is: $bootDevice\n"

As you look at the code above, you will see that we are using two ESXi commands to identify our boot volume and device information.

esxcli system boot device get
esxcfg-scsidevs -f

We take the output from these commands and parse each using egrep and awk. We then store each parsed value in its own variable for reference later in the script. Finally, we display these values on screen so the user can take note of them, since we most likely do not want to remove partitions from our boot device.
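
The parsing step can be tried away from an ESXi host by feeding captured text through the same filters. The following is a minimal sketch: the two sample strings are assumed stand-ins for the command output (the exact layout on your host may differ), grep -i is used in place of egrep -i, and parse_boot_volume and parse_boot_device are hypothetical helper names that are not part of the script.

```shell
#!/bin/sh
# Assumed sample of `esxcli system boot device get` output (layout may differ per build).
sample_boot_info='   Boot Filesystem UUID: f5f57be2-7a746f72-36bb-22b11566308f
   Boot NIC:
   Stateless Boot NIC:'

# Assumed sample of `esxcfg-scsidevs -f` output: device:partition, path, UUID.
sample_scsidevs='mpx.vmhba32:C0:T0:L0:6 /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:6 f5f57be2-7a746f72-36bb-22b11566308f
naa.600508b1001037383941424344450800:1 /vmfs/devices/disks/naa.600508b1001037383941424344450800:1 0123abcd-0123abcd-0123-0123456789ab'

# Hypothetical helper: read boot info on stdin, print the 4th field of the UUID line.
parse_boot_volume() {
    grep -i "Boot Filesystem UUID" | awk '{print $4}'
}

# Hypothetical helper: read the device list on stdin, print the device that owns UUID $1.
parse_boot_device() {
    grep -i -- "$1" | awk '{print $1}'
}

bootVolume=$(printf '%s\n' "$sample_boot_info" | parse_boot_volume)
bootDevice=$(printf '%s\n' "$sample_scsidevs" | parse_boot_device "$bootVolume")

printf 'Your boot volume is: %s\n' "$bootVolume"
printf 'Your boot device is: %s\n' "$bootDevice"
```

Passing the values to printf as arguments, rather than embedding them in the format string, keeps any stray percent sign or backslash in the data from being interpreted as formatting.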

Note: If we remove partitions from our boot device, we will no longer be able to boot from it until we reformat it and reinstall ESXi onto it. It is likely that we would receive an error stating that the boot device partition is in use and would not be able to remove it anyway. However, I believe it is better to be safe than sorry, so we take the cautious route. You may note that this boot device is a USB boot device, represented by an mpx designation. Later in our script we will be looking for devices with the naa designation. See "Understanding Storage Device Naming" in the references for more information on vSphere storage device naming.

The resulting output of Section 1 should look similar to the following:

[Section 1: Boot Device]

Your boot volume is: f5f57be2-7a746f72-36bb-22b11566308f
Your boot device is: mpx.vmhba32:C0:T0:L0:6
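
Because the boot device is only displayed, nothing in the script prevents a boot disk with an naa designation from appearing in the later sections. One possible safeguard is sketched below; is_boot_disk is a hypothetical helper that is not part of the original script, and it assumes the boot device is reported in the "device:partition" form shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed when disk ID $1 is the disk behind boot device $2.
# $2 is expected in the "device:partition" form printed by Section 1.
is_boot_disk() {
    candidate=$1
    bootDisk=${2%:*}    # strip the trailing ":<partition number>"
    [ -n "$bootDisk" ] && [ "$candidate" = "$bootDisk" ]
}

bootDevice="mpx.vmhba32:C0:T0:L0:6"
for diskID in mpx.vmhba32:C0:T0:L0 naa.600508b1001037383941424344450800; do
    if is_boot_disk "$diskID" "$bootDevice"; then
        printf 'Skipping boot disk: %s\n' "$diskID"
    else
        printf 'Candidate disk: %s\n' "$diskID"
    fi
done
```

The expansion ${2%:*} removes only the shortest suffix starting at a colon, so the colons inside an mpx name are left intact.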

In our next step, we will identify which disks have existing partitions. In our previous article we displayed the manual steps to identify this using the “ls” command. The ls command will list all of the devices in the directory. If there are existing partitions you will see similar output to the following with disk IDs followed by “:1”, “:2”, etc. In our illustration below we see that disk naa.600508b1001037383941424344450800 has 1 partition.

~ # ls /dev/disks/naa*
/dev/disks/naa.600508b1001037383941424344450700
/dev/disks/naa.600508b1001037383941424344450800
/dev/disks/naa.600508b1001037383941424344450800:1

To automate this step, we can leverage the existence of the colon ":" in the directory listing. Only partition entries are listed with a colon in their device name. This gives egrep and awk an easy pattern to filter on. If an entry in the directory listing contains a colon, then egrep passes the entire entry to awk, which parses out the NAA device ID and the partition number. This gives us a listing of all disk devices within the host that have existing partitions.

################################################################################################
# Section 2: Disk Partition Listing
################################################################################################
printf "[Section 2: Disk Partitions]\n\n"
printf "DiskID" && printf "\t\t\t\t\tPartition Number\n"

ls /dev/disks/naa* | egrep ":" | awk -F "/" '{print $4}' | awk -F ":" '{print $1"\t"$2}'

The resulting output should look similar to the following:

[Section 2: Disk Partitions]
DiskID                                  Partition Number
naa.600508b1001037383941424344450500    1
naa.600508b1001037383941424344450800    1
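
The colon filter can be checked without touching a host by running it over a saved listing. This is a small sketch under that assumption: the sample text is the ls output shown earlier, list_partitions is a hypothetical helper name, and grep stands in for egrep.

```shell
#!/bin/sh
# Sample listing copied from the ls output shown earlier in the article.
sample_listing='/dev/disks/naa.600508b1001037383941424344450700
/dev/disks/naa.600508b1001037383941424344450800
/dev/disks/naa.600508b1001037383941424344450800:1'

# Hypothetical helper: keep only entries that contain a colon, drop the
# directory part, then split the rest into "<disk ID><TAB><partition number>".
list_partitions() {
    grep ":" | awk -F "/" '{print $4}' | awk -F ":" '{print $1 "\t" $2}'
}

partitions=$(printf '%s\n' "$sample_listing" | list_partitions)
printf '%s\n' "$partitions"
```

With this sample, only the last entry survives the filter, because the first two lines are whole-disk entries without a colon.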

Our third section is where the disk partitions are actually removed. For this section we leverage my personal favorite control structure, the "for loop". The for loop is a fundamental programming construct that takes a list of items and runs a command or series of commands against each item. In this case we take a list of disk device IDs and, for each disk ID, run the partedUtil command to delete the partition.

The syntax for the for loop is:

for i in `command to generate list of items` ; do 'command to run on list of items' ; done

The syntax for the partedUtil command to remove a partition is as follows. Note that the colon separating the device ID from the partition number must be removed for the partedUtil command to run:

partedUtil delete /dev/disks/<disk device ID> <partition number>
- or -
partedUtil delete /dev/disks/naa.600508b1001037383941424344450800 1
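
Converting a listing entry such as naa.600508b1001037383941424344450800:1 into those two arguments can be done with shell parameter expansion alone. The sketch below only prints the resulting command; build_delete_command is a hypothetical helper, not part of the original script, which uses awk for the same split.

```shell
#!/bin/sh
# Hypothetical helper: turn "<disk ID>:<partition>" into a partedUtil delete command line.
# The command is printed, never executed.
build_delete_command() {
    entry=$1
    diskID=${entry%:*}             # everything before the last colon
    partitionNumber=${entry##*:}   # everything after the last colon
    printf 'partedUtil delete /dev/disks/%s %s\n' "$diskID" "$partitionNumber"
}

build_delete_command "naa.600508b1001037383941424344450800:1"
```

Splitting on the last colon, rather than the first, also copes with device names that contain colons themselves, such as the mpx name seen in Section 1.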

Here is our for loop successfully combined with partedUtil:

################################################################################################
# Section 3: Disk Partition Removal
################################################################################################
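for i in `ls /dev/disks/naa* | egrep ":" | awk -F "/" '{print $4}'` ; do
    # Split "<disk ID>:<partition number>" into its two parts
    diskID=`echo $i | awk -F ":" '{print $1}'`
    partitionNumber=`echo $i | awk -F ":" '{print $2}'`
    printf "\npartedUtil delete /dev/disks/$diskID $partitionNumber \n\n"
    # partedUtil delete /dev/disks/$diskID $partitionNumber
done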

Note: Currently the script section above is set to run in simulation mode, printing the partedUtil command to the screen rather than actually running it on the system. No changes will be made to any environment this script runs on until the printf command is commented out and the partedUtil command is uncommented.
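
An alternative to commenting lines in and out is to route destructive commands through a small wrapper controlled by a variable. This is only a sketch of that idea: DRY_RUN and run_or_print are hypothetical names that do not appear in the original script, and anything other than DRY_RUN=0 is treated as simulation.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch; it defaults to 1 (simulate) when unset.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: run the command only when DRY_RUN is exactly 0,
# otherwise print what would have been run.
run_or_print() {
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    else
        printf 'SIMULATION: %s\n' "$*"
    fi
}

run_or_print partedUtil delete /dev/disks/naa.600508b1001037383941424344450800 1
```

Defaulting to simulation means a forgotten or mistyped setting errs on the side of making no changes.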

In our example script, we are including only the partedUtil command in our for loop. This is for the sake of simplicity and readability of this article. Additional commands can easily be included in order to automate even more steps. For instance we could:

Set a disk claim rule to tag the disk as local: esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --device <device id> --option "enable_local"
Set an SSD claim rule to tag the disk as SSD and as local: esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --device <device id> --option "enable_local enable_ssd"
Leverage an additional for loop to run against multiple systems
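
As an illustration of the first two ideas, the claim rule command could be generated for each disk in the same simulated fashion. The helper name print_claim_rules is hypothetical, the device IDs are the ones from the sample output above, and the commands are printed for review rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) one claim rule command per device ID.
# $1 is the option string; the remaining arguments are device IDs.
print_claim_rules() {
    options=$1
    shift
    for deviceID in "$@"; do
        printf 'esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --device %s --option "%s"\n' \
            "$deviceID" "$options"
    done
}

print_claim_rules "enable_local enable_ssd" \
    naa.600508b1001037383941424344450500 \
    naa.600508b1001037383941424344450800
```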

Here is the example script in its entirety. Please feel free to leverage any or all of it as you have need. Happy scripting everyone!

#!/bin/sh
#==============================================================================
# title : Virtual SAN Disk Reset Utility
# description : This script will remove any unlocked/unmounted partitions from disks in your
# ESXi host
# author : Joe Cook, cjoe@vmware.com, Sr Technical Marketing Manager, VMware
# date : 2014.06.01
# version : 0.1
# usage : Remove .txt extension from script file and upload script to ESXi 5.5u1 host
# : Make script executable by typing: chmod 744 "VSAN Disk Partition Removal Utility.sh"
# : Execute script by typing: ./"VSAN Disk Partition Removal Utility.sh" on the command line
# ESXi version : 5.5u1
#
# Warning: Removing disk partitions will render data irretrievable
#
# Disclaimer: This script is intended for education purposes only
# Please do not use directly in a production environment.
#==============================================================================
################################################################################################
# Clear screen
################################################################################################
clear
################################################################################################
# Display Title
################################################################################################
printf "----------------------------------------------------------------------\n"
printf "[Virtual SAN Tools: Disk Partition Removal Utility]\n"
printf "----------------------------------------------------------------------\n"
################################################################################################
# Section 1: Boot Device Identification
################################################################################################
bootVolume=`esxcli system boot device get | egrep -i "Boot Filesystem UUID" | awk '{print $4}'`
bootDevice=`esxcfg-scsidevs -f | egrep -i $bootVolume | awk '{print $1}'`
printf "[Section 1: Boot Device]\n\n"
printf "Your boot volume is: $bootVolume\n"
printf "Your boot device is: $bootDevice\n"
printf "----------------------------------------------------------------------\n"
################################################################################################
# Section 2: Disk Partition Listing
################################################################################################
printf "[Section 2: Disk Partitions]\n\n"
printf "DiskID" && printf "\t\t\t\t\tPartition Number\n"
ls /dev/disks/naa* | egrep ":" | awk -F "/" '{print $4}' | awk -F ":" '{print $1"\t"$2}'
printf "----------------------------------------------------------------------\n"
################################################################################################
# Section 3: Disk Partition Removal
################################################################################################
read -r -p "Do you wish to remove any partitions? [y/N] " response
case $response in
    [yY][eE][sS]|[yY])
        printf "\n\n\n----------------------------------------------------------------------\n"
        printf "[Section 3: Partition Removal]\n"
        printf "----------------------------------------------------------------------\n\n"
        for i in `ls /dev/disks/naa* | egrep ":" | awk -F "/" '{print $4}'` ; do
            diskID=`echo $i | awk -F ":" '{print $1}'`
            partitionNumber=`echo $i | awk -F ":" '{print $2}'`
            #printf "Device Information for Disk ID: $diskID\n"
            printf "Removing partition $partitionNumber from $diskID\n";
            printf "----------------------------------------------------------------------\n"
            # Print Disk Device Information
            #esxcli storage core device list -d $diskID
            ###############################################################
            #printf "Removing partition $partitionNumber from $diskID....\n\n";
            #printf "Device Information for Disk ID: $diskID\n"
            esxcli storage core device list -d $diskID
            read -r -p "Remove partition? [y/N] " response
            case $response in
                [yY][eE][sS]|[yY])
                    ######################################################################
                    # printf is being used to simulate the deletion process
                    # Comment out the printf line and uncomment the partedUtil
                    # line in order to disengage the simulation
                    printf "\npartedUtil delete /dev/disks/$diskID $partitionNumber \n\n"
                    # partedUtil delete /dev/disks/$diskID $partitionNumber
                    ######################################################################
                    ##### Test for successful partition removal ##########################
                    if [ $? -gt 0 ]; then
                        printf "ERROR: Removing partition $partitionNumber from $diskID\n";
                    else
                        printf "SUCCESS: Removing partition $partitionNumber from $diskID\n\n\n";
                    fi
                    ;;
                *)
                    printf "\n*** No modifications made to $diskID\n"
                    printf "............................................................................\n\n\n"
                    ;;
            esac
            ###############################################################
            # End For Loop
        done
        ;;
    *)
        printf "\nExiting..........\n\n"
        ;;
esac

_______________________________________________________________________________________________
References

Using the partedUtil command line utility on ESXi and ESX (VMware KB: 1036609)

Identifying disks when working with VMware ESX/ESXi (VMware KB: 1014953)

Enabling the SSD option on SSD based disks/LUNs that are not detected as SSD by default (VMware KB: 2013188)

vSphere 5.5 Documentation: Understanding Storage Device Naming
