VMware Virtual SAN Operations: Replacing Disk Devices

In my previous Virtual SAN operations article, “VMware Virtual SAN Operations: Disk Group Management”, I covered the configuration and management of Virtual SAN disk groups and described the recommended operating procedures for managing them.

In this article, I will take a similar approach and cover the recommended operating procedures for replacing flash and magnetic disk devices. In Virtual SAN, drives are replaced for two reasons: failures and upgrades. Regardless of the reason, whenever a disk device needs to be replaced, it is important to follow the correct decommissioning procedures.

Replacing a Failed Flash Device

The failure of a flash device renders an entire disk group inaccessible (i.e. in the “Degraded” state) to the cluster, along with its data and storage capacity. One important observation to highlight here is that a single flash device failure doesn’t necessarily mean that the running virtual machines will incur outages. As long as the virtual machines are configured with a VM Storage Policy whose “Number of Failures to Tolerate” is greater than zero, the virtual machine objects and components will remain accessible. If there is available storage capacity within the cluster, the data resynchronization operation is triggered in a matter of seconds. The time this operation takes depends on the amount of data that needs to be resynchronized.
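
As a hedged aside, resynchronization progress can be monitored from the Ruby vSphere Console (RVC) on deployments that include it; the vsan.resync_dashboard command reports the bytes left to sync for each affected object. The cluster path is a placeholder here:

   /localhost/datacenter/computers> vsan.resync_dashboard <cluster>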

When a flash device failure occurs, before physically removing the device from a host, you must decommission the device from Virtual SAN. The decommission process performs a number of operations to discard disk group memberships, delete partitions, and remove stale data from all disks. Follow either of the disk device decommission procedures defined below.

Flash Device Decommission Procedure from the vSphere Web Client

  1. Log on to the vSphere Web Client
  2. Navigate to the Hosts and Clusters view and select the cluster object
  3. Go to the Manage tab and select Disk Management under the Virtual SAN section
  4. Select the disk group with the failed flash device
  5. Select the failed flash device and click the delete button

Note: In the event the disk claim rule setting in Virtual SAN is set to automatic, the disk delete option won’t be available in the UI. Change the disk claim rule to “Manual” in order to gain access to the disk delete option.

Flash Device Decommission Procedure from the CLI (ESXCLI) (Pass-through Mode)

  1. Log on to the host with the failed flash device via SSH
  2. Identify the device ID of the failed flash device
    • esxcli vsan storage list

  3. Delete the failed flash device from the disk group
    • esxcli vsan storage remove -s <device id>

Note: Deleting a failed flash device will result in the removal of the disk group and all of its members.

  4. Remove the failed flash device from the host
  5. Add a new flash device to the host and wait for the vSphere hypervisor to detect it, or perform a device rescan.

Note: These steps are applicable when the storage controllers are configured in pass-through mode and support the hardware hot-plug feature.
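
For reference, the identify-and-remove sequence looks roughly like the following. This is an illustrative sketch only: the device name and UUIDs are hypothetical placeholders, and the exact fields printed by esxcli vsan storage list can vary between ESXi builds.

   ~ # esxcli vsan storage list
   naa.5000000000000001
      Device: naa.5000000000000001
      Display Name: naa.5000000000000001
      Is SSD: true
      VSAN UUID: 52000000-0000-0000-0000-000000000001
      VSAN Disk Group UUID: 52000000-0000-0000-0000-000000000001
      Used by this host: true
      In CMMDS: true
   ~ # esxcli vsan storage remove -s naa.5000000000000001

The -s flag identifies the flash device backing the disk group; as noted above, removing it decommissions the entire disk group.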

Upgrading a Flash Device

Before upgrading the flash device, ensure there is enough storage capacity available within the cluster to accommodate all of the data currently stored in the disk group, because you will need to migrate that data off the disk group.

To migrate the data before decommissioning the device, place the host in maintenance mode and choose the suitable data migration option for the environment. Once all the data is migrated from the disk group, follow the flash device decommission procedures before removing the drive from the host.
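
For those who prefer the command line, the evacuation can also be initiated from the host itself. A minimal sketch, assuming an ESXi build where esxcli system maintenanceMode set exposes a Virtual SAN data migration option (check esxcli system maintenanceMode set --help on your host before relying on it):

   ~ # esxcli system maintenanceMode set -e true -m evacuateAllData
   ~ # esxcli system maintenanceMode set -e false

The first command enters maintenance mode with a full data migration; the second exits maintenance mode once the device swap is complete. The equivalent vSphere Web Client options are “Full data migration” and “Ensure accessibility”.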

Replacing a Failed Magnetic Disk Device

Each magnetic disk is responsible for the storage capacity it contributes to a disk group and to the overall Virtual SAN datastore. As with flash, magnetic disk devices can be replaced because of failures or for upgrade reasons. The impact of a magnetic disk failure is smaller than the impact of a flash device failure. The virtual machines remain online and operational for the same reasons described above in the flash device failure section. The resynchronization operation is also significantly less intensive than for a flash device failure; however, the time it takes again depends on the amount of data to be resynchronized.

As with flash devices, before removing a failed magnetic device from a host, decommission the device from Virtual SAN first. This allows Virtual SAN to perform the required disk group and device maintenance operations, and allows the subsystem components to update the cluster capacity and configuration settings.

Magnetic Device Decommission Procedure from the vSphere Web Client (Pass-through Mode)

  1. Log in to the vSphere Web Client
  2. Navigate to the Hosts and Clusters view and select the Virtual SAN enabled cluster
  3. Go to the Manage tab and select Disk Management under the Virtual SAN section
  4. Select the disk group with the failed magnetic device
  5. Select the failed magnetic device and click the delete button

Note: It is possible to perform decommissioning operations from ESXCLI in batch mode if required. The use of ESXCLI does introduce a level of complexity that should be avoided unless thoroughly understood. It is recommended to perform these types of operations using the vSphere Web Client until enough familiarity is gained with them.
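
To illustrate the batch approach, here is a minimal sketch of a loop that could be run in the ESXi shell to decommission several magnetic devices in one pass. The device names are hypothetical placeholders; verify every device ID against the output of esxcli vsan storage list before removing anything.

   # Remove each listed magnetic device from its disk group, one at a time.
   for disk in naa.5000000000000002 naa.5000000000000003; do
      esxcli vsan storage remove -d "$disk"
   done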

Magnetic Device Decommission Procedure from the CLI (ESXCLI) (Pass-through Mode)

  1. Log on to the host with the failed magnetic device via SSH
  2. Identify the device ID of the failed magnetic device
    • esxcli vsan storage list
  3. Delete the failed magnetic device from the disk group
    • esxcli vsan storage remove -d <device id>
  4. Remove the failed magnetic device from the host
  5. Add a new magnetic device to the host and wait for the vSphere hypervisor to detect it, or perform a device rescan.
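
Once the replacement disk is in place, its eligibility can be double-checked before Virtual SAN claims it. A hedged sketch, assuming the host ships the vdq utility (present on Virtual SAN capable ESXi builds); the device name below is a placeholder and the output is abbreviated:

   ~ # vdq -q
   [
      {
         "Name"     : "naa.5000000000000004",
         "VSANUUID" : "",
         "State"    : "Eligible for use by VSAN",
         "Reason"   : "None",
         "IsSSD"    : "0",
      },
   ]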

Upgrading a Magnetic Disk Device

Before upgrading any of the magnetic devices, ensure there is enough usable storage capacity available within the cluster to accommodate the data from the device that is being upgraded. The data migration can be initiated by placing the host in maintenance mode and choosing a suitable data migration option for the environment. Once all the data is offloaded from the disks, proceed with the magnetic disk device decommission procedures.

In this particular scenario, it is imperative to decommission the magnetic disk device before physically removing it from the host. If the disk is removed from the host without performing the decommissioning procedure, data cached from that disk will end up being permanently stored in the cache layer. This could reduce the available amount of cache and eventually impact the performance of the system.

Note: The disk device replacement procedures discussed in this article are entirely based on storage controllers configured in pass-through mode. In the event the storage controllers are configured in RAID0 mode, follow the manufacturer’s instructions for adding and removing disk devices.

– Enjoy

For future updates on Virtual SAN (VSAN), Virtual Volumes (VVols), and other Software-defined Storage technologies as well as vSphere + OpenStack be sure to follow me on Twitter: @PunchingClouds

Virtual SAN Backup with VDP – New White Paper

Hot off of the press: a new white paper that discusses backing up virtual machines running on VMware Virtual SAN (VSAN) using VMware vSphere Data Protection (VDP). These are the main topics that are covered:

  • VDP Architectural Overview
  • Virtual SAN Backup using VDP
  • Factors Affecting Backup Performance

The paper details test scenarios, how backup transport modes affect CPU and memory utilization of the VDP virtual appliance, and how the vSphere hosts’ management network is impacted when the Network Block Device over Secure Sockets Layer (NBDSSL) transport mode is utilized. The paper concludes with a summary of observations, recommendations for deploying the VDP virtual appliance to a Virtual SAN datastore, and some discussion around transport modes and running concurrent backups. A special thank you goes to Weiguo He for compiling this data and writing this paper!

Click here to view/download VMware Virtual SAN Backup Using VMware vSphere Data Protection

@jhuntervmware

VMware Configuration Guide for Virtual SAN HCL Component Updates

The Virtual SAN Configuration Guide has been updated with new components. We recently certified 12 SSDs, updated 4 existing SSD certifications, and updated firmware information for 2 HDDs. Make sure to visit the VMware Configuration Guide for Virtual SAN for more details!

Here is a list of changes:

New SSDs
•  HGST HUSML4040ASS600
•  HGST HUSML4020ASS600
•  HGST HUSML4040ASS601
•  HGST HUSML4020ASS601
•  HGST HUSSL4040BSS600
•  HGST HUSSL4020BSS600
•  HGST HUSSL4010BSS600
•  HGST HUSSL4040BSS601
•  HGST HUSSL4020BSS601
•  HGST HUSSL4010BSS601
•  NEC S3700 400GB SATA 2.5 MLC RPQ
•  NEC N8150-712

Updated SSD Certifications
• Samsung SM1625 800GB SAS SSD1
• Cisco UCS-SD800G0KS2-EP
• EMC XtremSF1400 PCIEHHM-1400M
• EMC XtremSF700 PCIEHHM-700M

Updated Diskful Writes per Day (DWPD) for Samsung and Cisco drives
A new firmware, B210.06.04, was certified for EMC PCI-E SSDs

HDD Firmware Information Updates
•  Fujitsu HD SAS 6G 1.2TB 10K HOT PL 2.5” EP
•  Hitachi 6Gbps,900GB,10000r/min,2.5in.

Operationalizing VMware Virtual SAN: Configuring vCenter Alarms

VMware Virtual SAN has received an amazing response from the virtualization community. Now, as more and more customers complete the acquisition and implementation processes, we are receiving more requests for operational guidance. Day 2 operations is perhaps my favorite topic to explore. Essentially, the questions asked can be summed up as “Ok, I have done the research, proved the concept, and now have this great new product. Help me know the recommended practices to monitor, manage, and troubleshoot the inevitable issues that pop up with any software”. This question is the driver behind our new blog series, “Operationalizing VMware Virtual SAN”.

In this series, our aim is to take your most frequently asked questions around Virtual SAN operations and provide detailed recommendations and guidance. In our first article in this series, we look to answer the question “How do I configure vCenter Alarms for Virtual SAN?”

(Many thanks to William Lam (@vGhetto), Christian Dickmann (@cdickmann), Rawlinson Rivera (@PunchingClouds), and Ken Werneburg (@vmKen) for their much appreciated interest and contributions to this series.) [Joe Cook: @CloudAnimal]

Continue reading

Hear the Complete Software-Defined Hyper-Convergence Storage Story with VMware and Nexenta on 11/19

Get your notepads and pens ready, because we’re co-hosting a webinar with Nexenta on November 19 at 8 a.m. PST detailing our complete, software-defined, hyper-convergence infrastructure offering. Join this webinar to learn how Virtual SAN and file services will fit in your environment, what Software-Defined Storage has to offer your organization, and how your business can benefit.

VMware’s own Rawlinson Rivera, Senior Technical Marketing Architect, will co-host the webinar with Nexenta’s Michael Letschin, Director, Product Management, Solutions. During this webinar, we’ll discuss:

  • Storage provisioning and management of VMware Virtual SAN’s hypervisor-converged storage
  • Merging VMware Virtual SAN with VMware EVO: RAIL into a hyper-converged infrastructure that combines compute, networking and storage resources
  • How NexentaConnect for VMware Virtual SAN enables better file services, snapshot and self-service file recovery
  • How Nexenta can support a variety of workloads and business-critical situations through its Software-Defined Storage solutions

Register for this webinar and learn how to build on your VMware Virtual SAN instance with Nexenta!

For more updates on VMware Virtual SAN and Software-Defined Storage, be sure to follow us on Twitter at @VMwareVSAN and ‘like’ us on Facebook at https://www.facebook.com/vmwarevsan!

Oregon State University uses VMware Virtual SAN for their growing VDI environment

Oregon State University, a public institution with more than 26,000 students and growing VDI workloads, wanted a high-performance storage tier for their VDI environment. However, they wanted the solution to be up and running before the summer session began, as well as easy to operate and scale on an ongoing basis, without requiring large upfront investments.

Continue reading

vSphere PowerCLI 5.8 SPBM Walkthrough (Part4): Provisioning a new VM

Welcome to the next installment of our vSphere PowerCLI 5.8 walkthrough series on the new cmdlets for vSphere Storage Policy Based Management. So far we have seen:

Introduction to vSphere Storage Policies
Creating vSphere Storage Policies
Associating vSphere Storage Policies

In this article we will take the next step and illustrate how to leverage vSphere Storage Policies to enhance the provisioning of new VMs. We will walk through a few provisioning examples involving a virtual machine with a single traditional storage-array-backed datastore, a vsanDatastore, and a multi-vendor mixed datastore environment.

PowerCLI cmdlets referenced in this blog article:

New-VM
Get-SpbmCompatibleStorage
Get-SpbmEntityConfiguration
Set-SpbmEntityConfiguration
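
To make the flow concrete before diving into the examples, here is a minimal sketch that strings these cmdlets together. The policy name “Gold”, cluster name “VSAN-Cluster”, and VM name “VM-01” are hypothetical placeholders, and Get-SpbmStoragePolicy (not listed above) is used to look up the policy:

   # Find storage compatible with the policy, provision the VM there,
   # then associate the policy with the new VM and verify the association.
   $policy = Get-SpbmStoragePolicy -Name "Gold"
   $ds = Get-SpbmCompatibleStorage -StoragePolicy $policy | Select-Object -First 1
   $vm = New-VM -Name "VM-01" -Datastore $ds -VMHost (Get-VMHost -Location "VSAN-Cluster" | Select-Object -First 1)
   $vm | Get-SpbmEntityConfiguration | Set-SpbmEntityConfiguration -StoragePolicy $policy
   $vm | Get-SpbmEntityConfiguration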

Follow these links for more information on creating vSphere Storage Policies for Virtual SAN:

Using the vSphere Web Client
Using PowerCLI

Continue reading

VMware Virtual SAN Performance Testing – Part IV

In Part I, Part II and Part III of this blog post series, we reviewed three different methods of running benchmark tests on a Virtual SAN cluster: synthetic I/O tools such as Iometer, pre-created application I/O trace replay files available for download, and custom-created application I/O trace replays. Once you are running benchmark tests, you will need to assess and analyze the performance results of your Virtual SAN cluster and how they meet the needs of the target applications within your environment. In this post, we will review some key concepts for performing a performance analysis of your Virtual SAN solution.

Continue reading

Do You Need Hardware Guidance to Accelerate Your Virtual SAN Deployment?

It has been 10 months since we released the first set of Virtual SAN Ready Nodes, which are validated server configurations jointly recommended by VMware and Server OEMs to accelerate Virtual SAN deployment. We have been working closely with multiple Server OEM partners to continuously update the list of Virtual SAN Ready Nodes.

The Virtual SAN Ready Node is another great option besides the DIY/Build-your-own option to deploy Virtual SAN, as we discussed in the past, such as in the June 23rd blog.

We have expanded the list from 24 (in June) to 40 Virtual SAN Ready Nodes from eight Server OEMs.

Why should you care about the Virtual SAN Ready Nodes and how do you use them?

Continue reading

Virtual SAN Ready Nodes – Ready, Set, Go!

What is the VMware Virtual SAN team announcing today?

The VMware Virtual SAN product team is very excited to announce 24 new Virtual SAN Ready Nodes from leading OEM vendors – Dell (3 Ready Nodes), Fujitsu (5 Ready Nodes), HP (10 Ready Nodes) and SuperMicro (6 Ready Nodes)!

What is a Virtual SAN Ready Node? How “Ready” is it?

A Virtual SAN Ready Node is a hyper-converged, ready-to-go hardware solution sold by server OEMs which has been pre-configured to run Virtual SAN in a certified hardware form factor.

The Virtual SAN Ready Nodes include a unique and optimized combination of hardware components from the OEM, and may also include software from the OEM for vSphere and Virtual SAN. Virtual SAN Ready Nodes are ideal as hyper-converged building blocks for large datacenter environments with strong automation and a need to customize hardware and software configurations.

OEM vendors offer Virtual SAN Ready Nodes that are unique to their server offerings and include an optimized combination of hardware components (I/O controller, HDD, SSD) to run Virtual SAN. In some cases, they also include pre-loaded software for vSphere and Virtual SAN.

So what does a Virtual SAN Ready Node look like?

A Virtual SAN Ready Node is a preconfigured, ready-to-go hardware solution. It is prescriptive in that it specifies the size and quantity of CPU, memory, network, I/O controller, HDD, and SSD resources required to run a VDI or Server workload.

For a detailed list of available Ready Nodes from OEM vendors, please refer to the Virtual SAN Ready Node document.

But what if I want to choose my own hardware components for Virtual SAN?

Sure, you can do that using the Build Your Own option on the VMware Virtual SAN Compatibility Guide. Using this option, you can pick any certified server, I/O controller, SSD, and HDD from your vendor of choice, decide on the quantity of each component, and build out your own Virtual SAN solution.

Alternatively, if you are interested in a preconfigured and ready-to-go solution which can be procured faster using a single SKU/Reference ID, go for the Virtual SAN Ready Node!

Virtual SAN Ready Nodes are also prescriptive and are classified under different solution profiles for VDI and Server use cases, so we have made it easy for you to pick the Ready Node that best matches your workload profile requirement.

What are the different solution profiles under which Ready Nodes are classified?

Virtual SAN Ready Nodes are classified into Low, Medium and High profiles for Server workloads and Full Clone & Linked Clone profiles for VDI workloads. The solution profiles provide prescriptive hardware recommendations to meet different levels of workload requirements based on the maximum number of VMs (assuming an average instance size for each VM) that can be run per host.

For more details on the infrastructure sizing assumptions and design considerations that were made to define the sample Ready Node configurations categorized into these solution profiles, please refer to the Virtual SAN Hardware Quick Reference Guide.

So how do I choose the right Ready Node for my Virtual SAN?

Visit the VMware Virtual SAN Compatibility Guide website and follow this simple process:

1. Determine your Virtual SAN workload profile requirement for the VDI or Server use case.

2. Refer to the node profiles and guidance in the Virtual SAN Hardware Quick Reference Guide to determine the approximate configuration that meets your needs.

3. Refer to the Virtual SAN Ready Nodes document to identify preconfigured and ready-to-go Virtual SAN Ready Nodes from OEM server vendors.

The server I want is not on the Ready Node list. Will it be supported with Virtual SAN?

As long as the server is certified in the VMware vSphere Compatibility Guide, it will work with Virtual SAN and can be selected as part of the Build Your Own option to build out your Virtual SAN, even if it is not one of the standard Virtual SAN Ready Node offerings. This is also true for any certified component, such as the I/O controllers, HDDs, and SSDs listed in the Virtual SAN Compatibility Guide.

How do I quote/order the Virtual SAN Ready Node from my vendor of choice?

Please contact your OEM sales representative and use the SKU/Reference ID listed for each Ready Node to quote/order the Ready Node from your vendor’s procurement system.

Note: For some of the vendors, the SKUs/Reference IDs are still in the works, and we expect to have them finalized soon.

Are there more Virtual SAN Ready Nodes from other server vendors to choose from?

Yes, stay tuned. We have more Virtual SAN Ready Nodes from other server vendors coming over the next few weeks.

Watch this space for more details!