Business-critical Oracle workloads have stringent I/O requirements, and enabling, sustaining, and ensuring the highest possible performance along with continued application availability is a major goal for all mission-critical Oracle applications, in order to meet demanding business SLAs all the way from on-premises deployments to VMware Hybrid Clouds.
Oracle RAC provides high availability and scalability by having multiple instances access a single database, which prevents any single server from being a single point of failure. Oracle RAC enables you to combine smaller commodity servers into a cluster to create scalable environments that support mission-critical business applications.
Key points to take away from this blog
This blog showcases the validation of vMotion for an Oracle RAC cluster on the VMware vSphere platform, which can be performed seamlessly, without any issues, when the appropriate settings are in place:
- with Change Block Tracking (CBT) enabled for non-shared disk
- with multi-writer attribute set for the shared vmdk’s
Oracle Real Application Clusters (RAC) on VMware vSphere
Oracle Clusterware is portable cluster software that provides comprehensive multi-tiered high availability and resource management for consolidated environments. It supports clustering of independent servers so that they cooperate as a single system.
Oracle Clusterware is the integrated foundation for Oracle Real Application Clusters (RAC), and the high-availability and resource-management framework for all applications on any major platform.
There are two key requirements for Oracle RAC:
• Shared storage
• Multicast Layer 2 networking
These requirements are fully addressed when running Oracle RAC on VMware vSphere, as both shared storage and Layer 2 networking are natively supported.
VMware vSphere HA clusters enable a collection of VMware ESXi hosts to work together so that, as a group, they provide higher infrastructure-level availability for VMs than each ESXi host can provide individually. VMware vSphere HA provides high availability for VMs by pooling the VMs and the hosts on which they reside into a cluster. Hosts in the cluster are monitored and in the event of a failure, the VMs on a failed host are restarted on alternate hosts
Oracle RAC and VMware HA solutions are complementary to each other. Running Oracle RAC on a VMware platform provides the application-level HA enabled by Oracle RAC, in addition to the infrastructure-level HA enabled by VMware vSphere.
More information on Oracle RAC on VMware vSphere can be found at Oracle VMware Hybrid Cloud High Availability Guide – REFERENCE ARCHITECTURE
VMware multi-writer attribute for shared vmdk’s
VMFS is a clustered file system that disables (by default) multiple VMs from opening and writing to the same virtual disk (.vmdk file). This prevents more than one VM from inadvertently accessing the same .vmdk file. The multi-writer option allows VMFS-backed disks to be shared and written to by multiple VMs. An Oracle RAC cluster using shared storage is a typical use case.
By default, this multi-writer “protection” is enabled for all .vmdk files, i.e., every VM has exclusive access to its own vmdk files. So, in order for all of the VM’s to access the shared vmdk’s simultaneously, the multi-writer protection needs to be disabled.
KB 1034165 provides more details on how to set the multi-writer option to allow VM’s to share vmdk’s.
In the case of VMware vSphere on VMFS (non-vSAN storage), vVols (beginning with ESXi 6.5) and NFS datastores, using the multi-writer attribute to share VMDKs for Oracle RAC requires:
- SCSI bus sharing to be set to None
- VMDKs to be Eager Zero Thick (EZT); thick-provision lazy-zeroed or thin-provisioned formats are not allowed
- VMDK’s need not be set to Independent-Persistent
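For illustration, the requirements above translate into .vmx entries like the hypothetical fragment below for one shared disk (the file name and SCSI slot are placeholders, not taken from this setup; confirm the exact keys against KB 1034165):

```
scsi1:0.fileName = "orac19c_shared_1.vmdk"
scsi1:0.sharing = "multi-writer"
scsi1:0.mode = "independent-persistent"
```

Here `sharing = "multi-writer"` disables the default exclusive lock on the vmdk, and the optional Independent-Persistent mode keeps the shared disk out of VM snapshots.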
The table below describes the various virtual machine disk modes.
While Independent-Persistent disk mode is not a hard requirement for enabling the multi-writer option, the default Dependent disk mode causes a “cannot snapshot shared disk” error when a VM snapshot is taken. Using Independent-Persistent disk mode allows taking a snapshot of the OS disk, while the shared disk must be backed up separately by other software, e.g., Oracle RMAN.
Supported and Unsupported Actions or Features with Multi-Writer Flag
*** Important ***
- SCSI bus sharing is left at its default and not touched at all when using shared vmdk’s – leave it alone for RAC with shared vmdk’s.
- SCSI bus sharing is used only for RAC clusters with RDMs (Raw Device Mappings) as shared disks.
VMware recommends using shared VMDK(s) with the multi-writer setting for provisioning shared storage for ALL Oracle RAC environments (KB 1034165).
Test Use case
This test case showcases the validation of vMotion for an Oracle RAC cluster on the VMware vSphere platform, which can be performed seamlessly, without any issues, when the appropriate settings are in place:
- with Change Block Tracking (CBT) enabled for non-shared disk
- with multi-writer attribute set for the shared vmdk’s
The validation is performed for this infrastructure setup below
- VMware ESXi 8.0.0 build 20513097
- Oracle RAC 19.15 with Grid Infrastructure, ASM Storage and ASMLIB on Oracle Enterprise Linux (OEL) 8.5
- Pure X50 FlashArray
Test Bed
The test bed is a 2-node vSphere cluster with 2 ESXi servers, both running ESXi 8.0.0 build 20513097.
The server names are ‘sc2esx64.vslab.local’ and ‘sc2esx65.vslab.local’.
The 2 ESXi servers are Super Micro SYS-2049U-TR4 servers; each server has 4 sockets, 24 cores per socket, Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz, with 1.5 TB RAM. The vCenter version was 8.0.0 build 20519528.
Storage for the Oracle RAC VM’s is provided by the Pure X50 FlashArray (Purity//FA 6.3.3). Details of the VMFS datastore backed by Pure Storage are shown below.
Oracle RAC VM’s ‘orac19c1’ and ‘orac19c2’ details are shown below. Each VM has 8 vCPU’s and 64GB vRAM. RAC database ‘orac19c’ was created with the multi-tenant option and provisioned with Oracle Grid Infrastructure (ASM) and Database version 19.15 on O/S OEL 8.5 UEK.
Oracle ASM was the storage platform with Oracle ASMLIB for device persistence. Oracle SGA & PGA set to 32G and 6G, respectively. All Oracle on VMware best practices were followed.
The Public and Private Adapter IP address information is shown as below.
The vmdk’s for the VM ‘orac19c1’ & ‘orac19c2’ are shown as below –
- Hard Disk 1 (SCSI 0:0) – 80G for OS (/)
- Hard Disk 2 (SCSI 0:1) – 80G for Oracle Grid Infrastructure and RDBMS binaries
- Hard Disk 3 (SCSI 1:0) – 500G for Oracle RAC (GIMR, CRS, VOTE, DATA)
Note – We used 1 VMDK for the entire RAC cluster, as the intention here was to show that vMotion of an Oracle RAC cluster on the VMware vSphere platform, with Change Block Tracking (CBT) enabled for the non-shared disks, can be achieved seamlessly without any issues when the appropriate settings are in place.
Recommendation is to follow the RAC deployment Guide on VMware for Best Practices with respect to the RAC layout – Oracle VMware Hybrid Cloud High Availability Guide – REFERENCE ARCHITECTURE
The Change Block Tracking (CBT)-specific settings for the RAC VM’s are shown below.
The CBT settings for the vmdk’s for VM ‘orac19c1’ & ‘orac19c2’ are shown as below –
- Hard Disk 1 (SCSI 0:0) – 80G for OS (/) – CBT enabled
- Hard Disk 2 (SCSI 0:1) – 80G for Oracle Grid Infrastructure and RDBMS binaries – CBT enabled
- Hard Disk 3 (SCSI 1:0) – 500G for Oracle RAC (GIMR, CRS, VOTE, DATA) – CBT disabled
In addition, CBT is disabled at the global level (ctkEnabled=FALSE) in both RAC VM’s.
The .vmx file entries for CBT are shown below for RAC VM’s ‘orac19c1’ and ‘orac19c2’.
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c1] cat orac19c1.vmx | grep -i ctk
scsi1:0.ctkEnabled = "FALSE"
scsi0:1.ctkEnabled = "TRUE"
scsi0:0.ctkEnabled = "TRUE"
ctkEnabled = "FALSE"
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c1]
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c2] cat orac19c2.vmx | grep -i ctk
scsi1:0.ctkEnabled = "FALSE"
scsi0:1.ctkEnabled = "TRUE"
scsi0:0.ctkEnabled = "TRUE"
ctkEnabled = "FALSE"
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c2]
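The expected CBT layout can also be verified automatically before a vMotion. Below is a minimal, hypothetical shell sketch (the `check_cbt` helper name is our own; the disk slots and expected values are taken from this setup):

```shell
#!/bin/sh
# Check that a RAC VM's .vmx matches the intended CBT layout:
# CBT enabled on the OS and binaries disks, disabled on the shared disk and globally.
check_cbt() {
  vmx="$1"
  grep -qF 'scsi0:0.ctkEnabled = "TRUE"'  "$vmx" || { echo "OS disk: CBT not enabled"; return 1; }
  grep -qF 'scsi0:1.ctkEnabled = "TRUE"'  "$vmx" || { echo "binaries disk: CBT not enabled"; return 1; }
  grep -qF 'scsi1:0.ctkEnabled = "FALSE"' "$vmx" || { echo "shared disk: CBT not disabled"; return 1; }
  grep -q  '^ctkEnabled = "FALSE"'        "$vmx" || { echo "global ctkEnabled not FALSE"; return 1; }
  echo "CBT layout OK: $vmx"
}
```

For example, run `check_cbt /vmfs/volumes/<datastore>/orac19c1/orac19c1.vmx` on the ESXi host for each RAC VM before migrating.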
Test Steps
The above section shows the configuration details of Oracle RAC 19.15 with Grid Infrastructure, ASM Storage and ASMLIB on VMware ESXi 8.0.0 build 20513097 using a Pure X50 FlashArray.
The O/S details are shown as below
[root@orac19c1 ~]# cat /etc/oracle-release
Oracle Linux Server release 8.5
[root@orac19c1 ~]#
[root@orac19c1 ~]# uname -a
Linux orac19c1.vslab.local 5.4.17-2136.302.7.2.1.el8uek.x86_64 #2 SMP Tue Jan 18 12:11:34 PST 2022 x86_64 x86_64 x86_64 GNU/Linux
[root@orac19c1 ~]#
The Cluster services are up as shown below.
We will perform the validation of the vMotion steps for the 2 test cases as shown below.
Test Case 1 – with Global ‘ctkEnabled=FALSE’ and keeping individual disks CBT entries the same as shown below
- vMotion VM ‘orac19c1’ from host ‘sc2esx64’ to host ‘sc2esx65’
- vMotion VM ‘orac19c2’ from host ‘sc2esx65’ to host ‘sc2esx64’
- scsi1:0.ctkEnabled = “FALSE”
- scsi0:1.ctkEnabled = “TRUE”
- scsi0:0.ctkEnabled = “TRUE”
- ctkEnabled = “FALSE”
Test Case 2 – without setting Global ‘ctkEnabled=FALSE’ and keeping individual disks CBT entries the same as shown below
- vMotion VM ‘orac19c1’ from host ‘sc2esx64’ to host ‘sc2esx65’
- vMotion VM ‘orac19c2’ from host ‘sc2esx65’ to host ‘sc2esx64’
- scsi1:0.ctkEnabled = “FALSE”
- scsi0:1.ctkEnabled = “TRUE”
- scsi0:0.ctkEnabled = “TRUE”
Baseline (before vMotion) – The RAC VM’s and their ESXi hosts are shown as below
- VM ‘orac19c1’ is on ESXi Host ‘sc2esx64’
- VM ‘orac19c2’ is on ESXi Host ‘sc2esx65’
Test Case 1 – with Global ‘ctkEnabled=FALSE’ and keeping individual disks CBT entries the same as shown below
- vMotion VM ‘orac19c1’ from host ‘sc2esx64’ to host ‘sc2esx65’
- vMotion VM ‘orac19c2’ from host ‘sc2esx65’ to host ‘sc2esx64’
- scsi1:0.ctkEnabled = “FALSE”
- scsi0:1.ctkEnabled = “TRUE”
- scsi0:0.ctkEnabled = “TRUE”
- ctkEnabled = “FALSE”
Select the destination ESXi server ‘sc2esx65’ and the corresponding VM networks.
Select vMotion priority and start vMotion
vMotion of RAC VM ‘orac19c1’ from host ‘sc2esx64’ to host ‘sc2esx65’ completed successfully.
Search for vMotion event in VM ‘orac19c1’ vmware.log file –
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c1]
…..
2023-01-26T03:00:44.570Z In(05) vmx - MigrateVMXdrToSpec: type: 1 srcIp=<172.16.32.164> dstIp=<172.16.32.165> mid=9e61492a022a895 uuid=00000000-0000-0000-0000-0cc47aff62fc pr
2023-01-26T03:00:44.570Z In(05) vmx - MigrateVMXdrToSpec: encryptedVMotion: 1
2023-01-26T03:00:44.570Z In(05) vmx - MigrateVMXdrToSpec: type 1 unsharedSwap 0 memMinToTransfer 0 cpuMinToTransfer 0 numDisks 0 numStreamIps 1 numFtStreamIps 0
2023-01-26T03:00:44.570Z In(05) vmx - Received migrate 'from' request for mid id 713280210969208981, src ip <172.16.32.164>.
2023-01-26T03:00:44.570Z In(05) vmx - MigrateSetInfo: state=8 srcIp=<172.16.32.164> dstIp=<172.16.32.165> mid=713280210969208981 uuid=00000000-0000-0000-0000-0cc47aff62fc pri
2023-01-26T03:00:44.570Z In(05) vmx - MigrateSetState: Transitioning from state 0 to 8.
….
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c1]
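The vMotion entries above were pulled from the VM’s vmware.log; a simple filter like the hypothetical sketch below (the `show_vmotion_events` helper name is our own, and the event names are taken from the log excerpt above) surfaces the migration state transitions:

```shell
#!/bin/sh
# Print migration-related events from a VM's vmware.log.
show_vmotion_events() {
  log="$1"
  grep -iE 'MigrateSetState|MigrateSetInfo|MigrateVMXdrToSpec' "$log"
}
```

For example, run `show_vmotion_events /vmfs/volumes/<datastore>/orac19c1/vmware.log` on the ESXi host after the migration completes.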
Similar results were obtained when we vMotioned RAC VM ‘orac19c2’ from host ‘sc2esx65’ to host ‘sc2esx64’; it completed successfully.
Test Case 2 – without setting Global ‘ctkEnabled=FALSE’ and keeping individual disks CBT entries the same as shown below
- vMotion VM ‘orac19c1’ from host ‘sc2esx64’ to host ‘sc2esx65’
- vMotion VM ‘orac19c2’ from host ‘sc2esx65’ to host ‘sc2esx64’
- scsi1:0.ctkEnabled = “FALSE”
- scsi0:1.ctkEnabled = “TRUE”
- scsi0:0.ctkEnabled = “TRUE”
The Change Block Tracking (CBT)-specific settings for the RAC VM’s for this test case are shown below.
The CBT settings for the vmdk’s for VM ‘orac19c1’ & ‘orac19c2’ are shown as below –
- Hard Disk 1 (SCSI 0:0) – 80G for OS (/) – CBT enabled
- Hard Disk 2 (SCSI 0:1) – 80G for Oracle Grid Infrastructure and RDBMS binaries – CBT enabled
- Hard Disk 3 (SCSI 1:0) – 500G for Oracle RAC (GIMR, CRS, VOTE, DATA) – CBT disabled
The .vmx file entries for CBT are shown below for RAC VM’s ‘orac19c1’ and ‘orac19c2’.
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c1] cat orac19c1.vmx | grep -i ctk
scsi1:0.ctkEnabled = "FALSE"
scsi0:1.ctkEnabled = "TRUE"
scsi0:0.ctkEnabled = "TRUE"
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c1]
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c2] cat orac19c2.vmx | grep -i ctk
scsi1:0.ctkEnabled = "FALSE"
scsi0:1.ctkEnabled = "TRUE"
scsi0:0.ctkEnabled = "TRUE"
[root@sc2esx65:/vmfs/volumes/60ea3b49-f557ea8e-7a1d-e4434b2d2ca8/orac19c2]
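With the VM powered off, removing the global entry for Test Case 2 amounts to deleting one line from the .vmx. A hedged sketch is shown below (the `remove_global_ctk` helper name and editing method are our own illustration; the same change can be made via the vSphere Client advanced VM parameters, followed by reloading the VM configuration):

```shell
#!/bin/sh
# Remove the global ctkEnabled entry from a powered-off VM's .vmx file,
# leaving the per-disk scsiX:Y.ctkEnabled entries untouched.
remove_global_ctk() {
  vmx="$1"
  sed -i '/^ctkEnabled = /d' "$vmx"
}
```

For example: `remove_global_ctk /vmfs/volumes/<datastore>/orac19c1/orac19c1.vmx`.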
Start vMotion of RAC VM ‘orac19c1’ from ‘sc2esx64’ to ‘sc2esx65’.
vMotion of RAC VM ‘orac19c1’ from ‘sc2esx64’ to ‘sc2esx65’ completed successfully without any errors
Similar results were observed during vMotion of RAC VM ‘orac19c2’ from ‘sc2esx65’ to ‘sc2esx64’, which completed successfully without any errors.
Summary
This blog showcases the validation of vMotion for an Oracle RAC cluster on the VMware vSphere platform, which can be performed seamlessly, without any issues, when the appropriate settings are in place:
- with Change Block Tracking (CBT) enabled for non-shared disk
- with multi-writer attribute set for the shared vmdk’s
The validation is performed for this infrastructure setup below
- VMware ESXi 8.0.0 build 20513097
- Oracle RAC 19.15 with Grid Infrastructure, ASM Storage and ASMLIB on Oracle Enterprise Linux (OEL) 8.5
- Pure X50 FlashArray
We performed the validation of the vMotion steps for 2 test cases and observed similar results –
- Test Case 1 – with Global ‘ctkEnabled=FALSE’ and keeping individual disks CBT entries
- Test Case 2 – without setting Global ‘ctkEnabled=FALSE’ and keeping individual disks CBT entries
Acknowledgements
This blog was authored by Sudhir Balasubramanian, Senior Staff Solution Architect & Global Oracle Lead – VMware.
Conclusion
Business-critical Oracle workloads have stringent I/O requirements, and enabling, sustaining, and ensuring the highest possible performance along with continued application availability is a major goal for all mission-critical Oracle applications, in order to meet demanding business SLAs all the way from on-premises deployments to VMware Hybrid Clouds.
Oracle RAC provides high availability and scalability by having multiple instances access a single database, which prevents any single server from being a single point of failure. Oracle RAC enables you to combine smaller commodity servers into a cluster to create scalable environments that support mission-critical business applications.
All Oracle on VMware vSphere collateral can be found at the URL below.
Oracle on VMware Collateral – One Stop Shop
https://blogs.vmware.com/apps/2017/01/oracle-vmware-collateral-one-stop-shop.html