
VMware vSphere 7.0 U2 and vSphere HA for SAP HANA with DRAM and Intel® Optane™ PMem in App-Direct Mode

vSphere 7.0 U2 is an important release for SAP HANA because it adds vSphere HA support for SAP HANA VMs that use Intel® Optane™ Persistent Memory (PMem).

A full list of changes can be found in the blog “What’s New in vSphere 7 Update 2”.

As of SAP note 2937606, vSphere 7.0 U2 is supported for SAP HANA based on the initial vSphere 7.0 support statement. This limits the SAP HANA VM configuration to 4-socket wide VMs with a maximum of 256 vCPUs and up to 6 TB of vRAM. The validation of 8-socket wide SAP HANA VMs is not yet finished, and we are working with SAP to complete it as soon as possible.
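The limits above can be written down as a small pre-deployment check. This is only an illustrative sketch: the function name `check_hana_vm` and its parameters are hypothetical, and the three limits are simply the ones quoted in the paragraph above. Always confirm the current values in the SAP note.

```python
from typing import List

# Limits quoted in the support statement above (vSphere 7.0 U2, SAP HANA).
MAX_SOCKETS = 4
MAX_VCPUS = 256
MAX_VRAM_TB = 6.0


def check_hana_vm(sockets: int, vcpus: int, vram_tb: float) -> List[str]:
    """Return a list of violations; an empty list means the VM fits the limits."""
    problems = []
    if sockets > MAX_SOCKETS:
        problems.append(f"{sockets} sockets exceeds the limit of {MAX_SOCKETS}")
    if vcpus > MAX_VCPUS:
        problems.append(f"{vcpus} vCPUs exceeds the limit of {MAX_VCPUS}")
    if vram_tb > MAX_VRAM_TB:
        problems.append(f"{vram_tb} TB vRAM exceeds the limit of {MAX_VRAM_TB} TB")
    return problems


if __name__ == "__main__":
    # An 8-socket wide VM is outside the currently validated configuration.
    for problem in check_hana_vm(sockets=8, vcpus=448, vram_tb=12.0):
        print(problem)
```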

Back to the new VMware HA support for PMem enabled SAP HANA VMs: what is so special and important about this feature?

First, you must understand that any High Availability (HA) solution requires shared storage for application or VM data. By using shared storage, VMware HA can, for instance in the case of a failure, automatically move and start a VM on another physical ESXi host. If the data is stored on devices which are not shared, then the HA solution is not able to start the application, or in the VMware HA case the VM, on another physical host.

[Figure: SAP HANA with PMem, no VMware HA support]

The picture above illustrates that VMware HA can only fail over and restart VMs that use shared devices. Since PMem in App-Direct mode is treated as a local, non-sharable datastore, VMware HA is not able to fail over and restart PMem enabled HANA VMs.

Because of this requirement, VMware HA was not supported for PMem enabled VMs before the vSphere 7.0 U2 release. Now VMware HA can support the failover and restart of PMem enabled VMs. The requirement is that the applications using PMem maintain data persistence not only on PMem, but also on shared disks.

SAP HANA is one of the applications that provide data persistence on disk. Because of this, VMware HA can use the data on the shared disks to initiate a failover of PMem enabled SAP HANA VMs to another PMem host. VMware HA will automatically re-create the VM's NVDIMM configuration, but it has no control over OS or application specific configuration steps after the VM failover, such as the required re-creation of the SAP HANA DAX device configuration. This must be done manually or via a script, which is provided by neither VMware nor SAP. Review the Configuration Guide: Intel® Optane™ DC Persistent Memory and SAP HANA® Platform Configuration for details on how to configure PMem for SAP HANA.

The picture below illustrates the failover of a PMem enabled SAP HANA VM via vSphere 7.0 U2 VMware HA and highlights that the PMem NVDIMM configuration is automatically re-created as part of the VM failover process. Once the DAX device has been configured inside the OS, SAP HANA can be started and will automatically load the data from disk into the new PMem regions assigned to this VM.
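The OS-level steps after a failover can be sketched roughly as follows. This is a minimal illustration and not a script provided by VMware or SAP: it only builds the command lines and does not execute them. The region name `region0`, the device `/dev/pmem0`, and the mount point `/hana/pmem/nvmem0` are assumptions for the example; the authoritative procedure is the one in the Intel configuration guide.

```python
from typing import List


def dax_recreate_commands(region: str, device: str, mount_point: str) -> List[List[str]]:
    """Build the commands that would re-create and mount one fsdax namespace.

    All three arguments are environment specific; the values used in the
    example below are assumptions, not values taken from this article.
    """
    return [
        # Create a filesystem-DAX namespace on the new, empty NVDIMM region.
        ["ndctl", "create-namespace", "--mode=fsdax", f"--region={region}"],
        # Create a fresh XFS file system on the resulting pmem block device.
        ["mkfs.xfs", "-f", device],
        # Make sure the mount point exists.
        ["mkdir", "-p", mount_point],
        # Mount with the dax option so SAP HANA can access PMem directly.
        ["mount", "-o", "dax", device, mount_point],
    ]


if __name__ == "__main__":
    # Dry run: print the plan instead of executing it.
    for cmd in dax_recreate_commands("region0", "/dev/pmem0", "/hana/pmem/nvmem0"):
        print(" ".join(cmd))
```

Keeping the sketch as a dry run makes it safe to review on any machine; running the commands for real requires root privileges inside the failed-over VM and must happen before SAP HANA is started.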

[Figure: SAP HANA with PMem, with VMware HA support]

After a successful failover of this PMem enabled VM, a garbage collector process will identify failed-over VMs and free up the PMem resources previously used by this VM on the initial host. On the host where this VM now runs, the PMem will be blocked and reserved for the lifetime of this VM (as long as it is not migrated or deleted from the host).

vSphere HA Admission Control PMem Reservations:

Before configuring vSphere HA for PMem enabled VMs, ensure that vSphere HA Admission Control PMem Reservations are set correctly for the cluster.

Admission control is a policy used by vSphere HA to ensure failover capacity within a cluster.

Under Edit Cluster Settings you can select Admission Control to specify the number of host failures the cluster will tolerate.

If you select CPU/Memory reservation defined by:

  • Cluster resource percentage, some amount of persistent memory capacity in the cluster is dedicated for failover purpose even if the virtual machines in the cluster are not using persistent memory currently. This percentage can either be specified through an override, or it is automatically calculated according to the host failures to tolerate setting. When PMem admission control is enabled, PMem capacity is reserved across the cluster even if there are VMs using PMem as disks.
  • Slot Policy (powered-on VMs), persistent memory admission control overrides the Slot Policy with the Cluster Resource Percentage policy, for persistent memory resources only. The percentage value is automatically calculated from the Host failures cluster tolerates setting and cannot be overridden.
  • Dedicated failover hosts, the persistent memory of the dedicated failover hosts is dedicated for failover purpose and you cannot provision virtual machines with persistent memory on these hosts.

After you select an admission control policy, you must also click the Reserve Persistent Memory failover capacity checkbox to enable PMem admission control.

Note: If you use PMem for your SAP HANA VMs and you want to use VMware HA to protect these VMs, then you should have an HA node with exactly the same PMem configuration as the source host, or a larger one. Otherwise, hardware configuration differences may cause performance problems.
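The sizing rule in this note can be expressed as a simple check. The sketch below is a simplification that compares total PMem capacity only; the `HostPMem` type, the host names, and the capacities are hypothetical, and a real comparison should also look at the DIMM population of each host.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostPMem:
    name: str
    pmem_gib: int  # total PMem capacity in App-Direct mode, in GiB


def suitable_failover_hosts(source: HostPMem, candidates: List[HostPMem]) -> List[str]:
    """Return the names of hosts with the same or larger PMem capacity than the source."""
    return [
        host.name
        for host in candidates
        if host.name != source.name and host.pmem_gib >= source.pmem_gib
    ]


if __name__ == "__main__":
    source = HostPMem("esx01", 3072)
    cluster = [source, HostPMem("esx02", 3072), HostPMem("esx03", 1536), HostPMem("esx04", 6144)]
    print(suitable_failover_hosts(source, cluster))
```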

How to enable VMware HA for SAP HANA VMs?

You can configure vSphere HA for PMem VMs in write-through mode (persistence on disk and PMem), so that when a host fails VMs can be restored on another functioning host.

Prerequisites

  • You must select Hardware version 19.
  • PMem VMs with vPMemDisks are not supported.

Procedure

  1. When creating a new VM in the New Virtual Machine wizard, select Customize hardware.
    1. Click ADD NEW DEVICE and select Add NVDIMM from the drop-down menu.
    2. Click the checkbox Allow failover on another host for all NVDIMM devices.
    3. Click NEXT and complete the New Virtual Machine wizard.

On host failure, NVDIMM PMem data cannot be recovered. By default, HA will not attempt to restart this virtual machine on another host. Allowing HA to fail over the virtual machine on host failure restarts it on another host with a new, empty NVDIMM.

  2. To enable HA on an existing VM, browse to the VM.
    1. Under VM Hardware, click EDIT.
    2. Select the NVDIMM.
    3. Click the checkbox Allow failover on another host for all NVDIMM devices.
    4. Click OK.

On host failure, HA will restart this virtual machine on another host with new, empty NVDIMMs.

See the blog SAP HANA with Intel Optane Persistent Memory on VMware vSphere for an example of how to enable PMem HA.

VMware HA customers can also use SAP HANA System Replication, or can combine both solutions. For details, please check out the following blog.

List of VMware vSphere 7.0 U2 features supported for PMem enabled VMs:

[Figure: vSphere features supported for PMem enabled SAP HANA VMs]

*On host failure, VMware HA will restart the PMem enabled VM on another host with new, empty NVDIMMs. OS and application specific configuration steps after the VM failover, such as the required re-creation of the SAP HANA DAX device configuration, must be done manually or via a script, which is provided by neither VMware nor SAP.

Operation Best Practices:

It is not recommended to reboot PMem enabled VMs (independent of SAP HANA). Instead, VMware recommends powering off the VM and then powering it on again.
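As a rough sketch, this recommendation could be automated with the vSphere API. The example assumes `vm` is a pyVmomi `vim.VirtualMachine` object and that `wait_for_task` is a callable that blocks until a task completes (for instance `pyVim.task.WaitForTask`); both are assumptions about your tooling, and the helper name `power_cycle` is hypothetical. SAP HANA and the guest OS should be shut down cleanly before the power-off.

```python
def power_cycle(vm, wait_for_task) -> None:
    """Power a VM off and on again instead of rebooting the guest.

    `vm` is expected to behave like a pyVmomi vim.VirtualMachine;
    `wait_for_task` must block until the given task has finished.
    """
    # Skip the power-off if the guest was already shut down cleanly.
    if vm.runtime.powerState != "poweredOff":
        wait_for_task(vm.PowerOffVM_Task())
    # A full power-on lets the VM start with a freshly initialized NVDIMM setup.
    wait_for_task(vm.PowerOnVM_Task())
```

Passing the wait function in as a parameter keeps the sketch free of a hard dependency on a specific client library and makes it easy to exercise with a stand-in object.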

Conclusion:

By providing VMware HA support for PMem enabled SAP HANA VMs, alongside the other benefits of running SAP HANA on vSphere, VMware provides a way to streamline operations and to standardize the HA concept on VMware HA for all SAP HANA VMs, regardless of whether they use DRAM or PMem.

This reduces not only cost and time, but also complexity. It also decreases the failure recovery time and therefore increases the uptime of PMem enabled SAP HANA systems.

We look forward to hearing how you use Intel® Optane™ PMem with SAP HANA in your environment.

Appendix

Below is a link to an example script for the automatic re-creation of the DAX device configuration on the OS level. The script has been developed by the Intel SAP Solution Engineering team and the Intel VMware Center of Excellence.

This script must be executed after the failover and restart of the VM, prior to the restart of the SAP HANA database. It is advised to run this script automatically as part of the OS start procedure, for example as a custom service.
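One possible way to set up such a custom service is a systemd oneshot unit that is ordered before the service that starts SAP HANA. The sketch below only renders the unit file text and does not install anything. The script path `/usr/local/sbin/recreate-dax.sh` and the unit name `sapinit.service` are assumptions for illustration; use the names that apply to your system.

```python
def render_dax_unit(script_path: str, before_unit: str) -> str:
    """Render a systemd unit that runs the DAX re-creation script once at boot."""
    return "\n".join([
        "[Unit]",
        "Description=Re-create DAX devices for SAP HANA after VM failover",
        # Order this unit ahead of the service that starts SAP HANA.
        f"Before={before_unit}",
        "",
        "[Service]",
        # Run the script once and treat the unit as active afterwards.
        "Type=oneshot",
        "RemainAfterExit=yes",
        f"ExecStart={script_path}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


if __name__ == "__main__":
    # Both arguments are placeholders for this example.
    print(render_dax_unit("/usr/local/sbin/recreate-dax.sh", "sapinit.service"))
```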

The script can be used as a template to create your own script that fits your unique environment.

Note: This script is neither maintained nor supported by VMware, SAP, or Intel. Any usage of this script is at your own responsibility.

References: