Business-critical Oracle workloads have stringent IO requirements. Enabling, sustaining, and ensuring the highest possible performance, along with continued application availability, is a major goal for all mission-critical Oracle applications that must meet demanding business SLAs on VMware Cloud Foundation (VCF).
VMware Virtual Volumes (vVols) has been the primary focus of VMware storage engineering for the last few releases, and vSphere 8.0 is no different. The biggest announcement in vSphere 8.0 core storage is the addition of vVols support for NVMeoF.
The reason for adding vVols support for NVMeoF is that many array vendors, and the industry at large, are moving toward using, or at least adding, NVMeoF support for better performance and throughput. Accordingly, VMware is making sure vVols remains current with the latest storage technologies.
Key points to take away from this blog
The previous blog, Oracle workloads on VMware Virtual Volumes (vVOLS) using Pure Storage FlashArray X50 and Broadcom LPe36000 Fibre Channel Adapter – better performance, showcased the performance improvements we observed in our lab when deploying Oracle workloads on VMware Virtual Volumes (vVols) on ESXi 7.0.3 using Pure Storage FlashArray X50 and the Broadcom LPe36000 Fibre Channel Adapter, compared against SCSI/FCP LUNs.
This blog is an exercise to showcase the performance improvements of NVMeoF vVols over SCSI vVols for business-critical production Oracle workloads.
This blog also uses the synthetic workload generator SLOB to compare NVMeoF vVols against SCSI vVols and gauge the overall improvement in both guest OS (GOS) and workload throughput and queueing.
Disclaimer
- This blog is not meant to be a performance benchmarking-oriented blog in any way
- The test results published below are NOT official VMware Performance Engineering test results and in no way reflect real-world workload performance
Remember
- Any performance data is a result of the combination of hardware configuration, software configuration, test methodology, test tool, and workload profile used in the testing
- So, the performance improvement I got with the synthetic workload generator in my lab is in NO way representative of any real production Customer workload
- This means the performance improvements for real production Customer workloads will likely be far better than what I got in my lab using a synthetic workload generator
NVMeoF vVols
vSphere 8.0 vVols supports the NVMe-FC (NVMe over Fibre Channel) protocol in the data path. This means vVols is capable of supporting all the commonly used data protocols, whether SCSI, iSCSI, NFS, or, now, NVMe-FC.
This support has been added through a new VASA specification, VASA version 4, which details how the NVMe protocol is supported with vVols.
The vVols implementation of the NVMe protocol extracts the best of both worlds: the VASA control path and the NVMe IO path.
More information on NVMeoF vVols can be found in the VMware vSphere core storage documentation.
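For a quick sanity check of both the NVMe-FC data path and the vVols control path, the ESXi command line can be used as sketched below (these commands are a suggested check, not part of the original lab procedure; output varies by ESXi build and array).

# List the NVMe adapters on the ESXi host; NVMe over FC adapters report FC as the transport type
esxcli nvme adapter list

# List the NVMe controllers discovered through those adapters (one or more per array subsystem path)
esxcli nvme controller list

# Confirm the array's VASA provider is registered and online; vVols (SCSI or NVMeoF) depends on a healthy VASA control path
esxcli storage vvol vasaprovider list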
Test Bed
The test bed is a 3-node vSphere cluster with the setup as shown below –
- vCenter version was 8.0.2 build 22385739.
- 3 INSPUR NF5280M6 servers, 2 sockets, 32 cores per socket, Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz, 512GB RAM
- VMware ESXi 8.0.2, build 22380479
- Pure Storage X50R2 with Purity version 6.6.2
- Oracle 21.13 with Grid Infrastructure, ASM Storage and Linux udev (8k database block size)
- OEL UEK 8.9
Each INSPUR NF5280M6 server has 2 sockets, 32 cores per socket, Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz, 512GB RAM as shown below.
INSPUR NF5280M6 server ‘sc2esx67.vslab.local’ HBA & SCSI vVol PE details are as shown below.
INSPUR NF5280M6 server ‘sc2esx67.vslab.local’ HBA & NVMeoF vVol PE details are as shown below.
INSPUR NF5280M6 server ‘sc2esx67.vslab.local’ SCSI vVol PE and NVMeoF vVol PE details are shown below.
Mapping of the SCSI vVol PE, NVMeoF vVol PE and the SCSI / NVMeoF vVol datastores is shown below.
SCSI & NVMeoF vVol datastore details are shown below.
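The same protocol endpoint and storage container information can also be pulled from the ESXi command line; a minimal sketch (suggested for reference, not taken from the original lab steps) is shown below.

# List the vVol protocol endpoints visible to this host; both SCSI PEs and NVMeoF PEs appear here
esxcli storage vvol protocolendpoint list

# List the vVol storage containers backing the SCSI and NVMeoF vVol datastores mounted on this host
esxcli storage vvol storagecontainer list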
Pure Array X50R2 SCSI vVol PE and host associations are shown below.
Pure Array X50R2 NVMeoF vVol PE and host associations are shown below.
Oracle VM ‘Oracle21C-OL8-DB_vNVME’ details are shown below.
The VM has 32 vCPUs and 224 GB RAM. The single-instance database ‘ORA21C’ was created with the multi-tenant option and provisioned with Oracle Grid Infrastructure (ASM) and Database version 21.13 on OEL UEK 8.9.
Oracle ASM was the storage platform, with Linux udev for device persistence. Oracle SGA and PGA were set to 64G and 10G respectively.
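Device persistence for the ASM disks was handled by Linux udev; the rule below is an illustrative sketch only (the serial number and symlink name are placeholders, not the exact values used in the lab).

# Query a guest NVMe namespace for a stable identifier to key the udev rule on
udevadm info --query=property --name=/dev/nvme1n1 | grep ID_SERIAL

# Append an example rule to /etc/udev/rules.d/99-oracle-asmdevices.rules creating a persistent symlink owned by grid:asmadmin (serial and symlink name are placeholders)
echo 'KERNEL=="nvme*", ENV{ID_SERIAL}=="VMware_Virtual_NVMe_Disk_<serial>", SYMLINK+="oracleasm/slob_disk01", OWNER="grid", GROUP="asmadmin", MODE="0660"' >> /etc/udev/rules.d/99-oracle-asmdevices.rules

# Reload and trigger udev so the symlink appears without a reboot
udevadm control --reload-rules && udevadm trigger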
The database was deliberately set to NOARCHIVELOG mode as the intention was to drive maximum amount of IO without stressing ARCHIVELOG.
All Oracle on VMware platform best practices were followed.
Oracle VM ‘Oracle21C-OL8-DB_vNVME’ has 4 vNVMe Controllers for added performance.
The vNVMe Controller & vmdk assignments for Oracle VM ‘Oracle21C-OL8-DB_vNVME’ are shown below (a guest-side view of this layout is sketched after the list):
- NVME 0:0 – 80G for OS (/)
- NVME 0:1 – 80G for Oracle Grid Infrastructure & RDBMS binaries
- NVME 0:2 – 100G for GRID_DG
- NVME 0:3 – 200G for DATA_DG
- NVME 0:4 – 100G for REDO_DG
- NVME 0:5 – 200G for DATA_DG
- NVME 1:0 – 750G for SLOB_DG
- NVME 2:0 – 750G for SLOB_DG
- NVME 3:0 – 750G for SLOB_DG
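Inside the guest, each vNVMe controller surfaces as a separate NVMe controller (/dev/nvme0 through /dev/nvme3) with its vmdks as namespaces; a quick way to confirm the layout is sketched below (assumes the nvme-cli package is installed; not part of the original test steps).

# List all NVMe namespaces the guest sees, with model, serial and size; each vNVMe controller appears as /dev/nvmeN and its vmdks as /dev/nvmeNnM
nvme list

# Alternative view of the block device tree if nvme-cli is not installed
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT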
Oracle ASM Disk Group details are as below.
grid@oracle21c-ol8-nvme:+ASM:/home/grid> asmcmd lsdg
State    Type    Rebal  Sector  Logical_Sector  Block  AU       Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N      512     512             4096   4194304  409600    31480    0                31480           0              N             DATA_DG/
MOUNTED  EXTERN  N      512     512             4096   4194304  102400    102296   0                102296          0              N             GRID_DG/
MOUNTED  EXTERN  N      512     512             4096   1048576  102400    38235    0                38235           0              N             REDO_DG/
MOUNTED  EXTERN  N      512     512             4096   1048576  2304000   125088   0                125088          0              N             SLOB_DG/
grid@oracle21c-ol8-nvme:+ASM:/home/grid>
ASM disk group ‘SLOB_DG’ has 3 vmdks, each 750G in size, for a total of 2250GB (AU=1MB).
grid@oracle21c-ol8-nvme:+ASM:/home/grid> asmcmd lsdsk -k
Total_MB Free_MB OS_MB Name Failgroup Site_Name Site_GUID Site_Status Failgroup_Type Library Label Failgroup_Label Site_Label UDID Product Redund Path
204800 15740 204800 DATA_DISK02 DATA_DISK02 00000000000000000000000000000000 REGULAR System UNKNOWN /dev/oracleasm/data_disk02
204800 15740 204800 DATA_DISK03 DATA_DISK03 00000000000000000000000000000000 REGULAR System UNKNOWN /dev/oracleasm/data_disk03
102400 102296 102400 GRID_DISK02 GRID_DISK02 00000000000000000000000000000000 REGULAR System UNKNOWN /dev/oracleasm/grid_disk02
102400 38235 102400 REDO_DISK02 REDO_DISK02 00000000000000000000000000000000 REGULAR System UNKNOWN /dev/oracleasm/redo_disk02
768000 41725 768000 SLOB_DISK01 SLOB_DISK01 00000000000000000000000000000000 REGULAR System UNKNOWN /dev/oracleasm/slob_disk01
768000 41711 768000 SLOB_DISK02 SLOB_DISK02 00000000000000000000000000000000 REGULAR System UNKNOWN /dev/oracleasm/slob_disk02
768000 41652 768000 SLOB_DISK03 SLOB_DISK03 00000000000000000000000000000000 REGULAR System UNKNOWN /dev/oracleasm/slob_disk03
grid@oracle21c-ol8-nvme:+ASM:/home/grid>
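For reference, a disk group like SLOB_DG can be created from the grid user with DDL along the lines of the sketch below (illustrative only, using the device paths shown above; the exact DDL and attributes used in the lab may differ).

# Create an external-redundancy ASM disk group with a 1MB allocation unit across the three udev-managed SLOB devices (illustrative DDL only)
sqlplus -s / as sysasm <<'EOF'
CREATE DISKGROUP SLOB_DG EXTERNAL REDUNDANCY
  DISK '/dev/oracleasm/slob_disk01',
       '/dev/oracleasm/slob_disk02',
       '/dev/oracleasm/slob_disk03'
  ATTRIBUTE 'au_size' = '1M', 'compatible.asm' = '21.0';
EOF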
Test Use case
This blog is an exercise to showcase the performance improvements of NVMeoF vVols over SCSI vVols for business-critical production Oracle workloads.
This blog also uses the synthetic workload generator SLOB to compare NVMeoF vVols against SCSI vVols and gauge the overall improvement in both GOS and workload throughput and queueing.
The SLOB workload generator was run against a 2 TB SLOB tablespace (85 datafiles, each 25GB in size, total size 2125G).
SLOB 2.5.4.0 was chosen as the load generator for this exercise, with the following SLOB parameters set as below:
- UPDATE_PCT=0
- RUN_TIME=300
- SCALE=25G
- WORK_UNIT=1024
Multiple SLOB schemas (80 schemas, 25G each, total size 2000GB) were used to simulate a multi-schema workload model, and the number of threads per schema was set to 20.
The Work Unit size was set to 1024 to drive the maximum amount of IO without stressing REDO, in order to study the differences between NVMeoF vVols and SCSI vVols in both GOS and workload throughput and queueing.
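The corresponding slob.conf entries and run invocation look roughly like the sketch below (SLOB 2.5.4 syntax; the tablespace name IOPS and the unchanged defaults are assumptions, not taken from the lab configuration).

# Relevant slob.conf settings for a read-only, large work-unit test (excerpt; remaining parameters left at SLOB defaults)
UPDATE_PCT=0
RUN_TIME=300
SCALE=25G
WORK_UNIT=1024
THREADS_PER_SCHEMA=20

# Load 80 schemas of 25G each into the SLOB tablespace (named IOPS here for illustration), then run the workload against all 80 schemas
./setup.sh IOPS 80
./runit.sh -s 80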
We ran multiple SLOB runs for NVMeoF vVols and SCSI vVols and compared the Oracle metrics as shown in the table below.
We can see that the NVMeoF vVols setup shows a performance improvement over the SCSI vVols setup.
Diving deeper into the core database load profiles, we see that NVMeoF vVols show a performance improvement over SCSI vVols:
- Executes (SQL) / second and Transactions / second each show an 8% improvement.
- Read IO requests / second shows an 8% improvement.
- Read IO (MB) / second shows an 8% improvement.
Diving deeper into core database wait events, we see ‘db file parallel read’ and ‘db file sequential read’ average wait times have improved for the NVMeoF vVol runs.
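The average wait times were compared across runs; they can also be cross-checked interactively with a query along these lines (a sketch only; V$SYSTEM_EVENT is cumulative since instance startup, so per-run report deltas remain the authoritative numbers).

# Cross-check average waits for the two read events directly from V$SYSTEM_EVENT
sqlplus -s / as sysdba <<'EOF'
SET LINESIZE 120
COLUMN event FORMAT A30
SELECT event,
       total_waits,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_wait_ms
FROM   v$system_event
WHERE  event IN ('db file parallel read', 'db file sequential read');
EOF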
Looking at GOS statistics, we can see that NVMeoF vVols show a performance improvement over SCSI vVols, with wait IO (WIO) reducing in the NVMeoF vVol runs.
Diving deeper into GOS statistics, we see significant performance improvements of NVMeoF vVols over SCSI vVols:
- %iowait reduced by around 20%, indicating the GOS spent fewer CPU cycles waiting on IO to process the same workload.
- %idle increased by over 20%, indicating the GOS has more CPU headroom to process additional work.
From a GOS device utilization perspective, we see significant performance improvements of NVMeoF vVols over SCSI vVols (the collection commands are sketched after this list):
- %util of the devices reduced by 20%, indicating NVMe results in fewer CPU cycles spent, improved bandwidth, and lower latency compared to SCSI.
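GOS CPU and device statistics such as %iowait, %idle and %util can be captured during each run with the standard sysstat tools; the commands below are a minimal collection sketch (a suggestion, not necessarily the exact collection method used in the lab).

# CPU utilization breakdown (%user, %system, %iowait, %idle) sampled every 10 seconds across the 300-second run
sar -u 10 30 > cpu_stats.txt

# Extended per-device statistics (%util, await, throughput) for the guest NVMe devices, sampled every 10 seconds
iostat -xm 10 30 > device_stats.txt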
Disclaimer
- This blog is not meant to be a performance benchmarking-oriented blog in any way
- The test results published below are NOT official VMware Performance Engineering test results and in no way reflect real-world workload performance
Remember
- Any performance data is a result of the combination of hardware configuration, software configuration, test methodology, test tool, and workload profile used in the testing
- So, the performance improvement I got with the synthetic workload generator in my lab is in NO way representative of any real production Customer workload
- This means the performance improvements for real production Customer workloads will likely be far better than what I got in my lab using a synthetic workload generator
In conclusion, based on the lab results above, we can see that the NVMeoF vVols setup shows a performance improvement over the SCSI vVols setup.
Summary
This blog is an exercise to showcase the performance improvements of NVMeoF vVols over SCSI vVols for business-critical production Oracle workloads.
This blog also uses the synthetic workload generator SLOB to compare NVMeoF vVols against SCSI vVols and gauge the overall improvement in both GOS and workload throughput and queueing.
Disclaimer
- This blog is not meant to be a performance benchmarking-oriented blog in any way
- The test results published below are NOT official VMware Performance Engineering test results and in no way reflect real-world workload performance
Remember
- Any performance data is a result of the combination of hardware configuration, software configuration, test methodology, test tool, and workload profile used in the testing
- So, the performance improvement I got with the synthetic workload generator in my lab is in NO way representative of any real production Customer workload
- This means the performance improvements for real production Customer workloads will likely be far better than what I got in my lab using a synthetic workload generator
Diving deeper into the core database load profiles, we see that NVMeoF vVols show a performance improvement over SCSI vVols:
- Executes (SQL) / second and Transactions / second each show an 8% improvement.
- Read IO requests / second shows an 8% improvement.
- Read IO (MB) / second shows an 8% improvement.
Diving deeper into core database wait events, we see ‘db file parallel read’ and ‘db file sequential read’ average wait times have improved for the NVMeoF vVol runs.
Looking at GOS statistics, we can see that NVMeoF vVols show a performance improvement over SCSI vVols, with wait IO (WIO) reducing in the NVMeoF vVol runs.
Diving deeper into GOS statistics, we see significant performance improvements of NVMeoF vVols over SCSI vVols:
- %iowait reduced by around 20%, indicating the GOS spent fewer CPU cycles waiting on IO to process the same workload.
- %idle increased by over 20%, indicating the GOS has more CPU headroom to process additional work.
From a GOS device utilization perspective, we see significant performance improvements of NVMeoF vVols over SCSI vVols:
- %util of the devices reduced by 20%, indicating NVMe results in fewer CPU cycles spent, improved bandwidth, and lower latency compared to SCSI.
In conclusion, based on the lab results above, we can see that the NVMeoF vVols setup shows a performance improvement over the SCSI vVols setup.
Acknowledgements
This blog was authored by Sudhir Balasubramanian, Senior Staff Solution Architect & Global Oracle Lead – VMware Cloud Foundation (VCF), Broadcom
Conclusion
Business-critical Oracle workloads have stringent IO requirements. Enabling, sustaining, and ensuring the highest possible performance, along with continued application availability, is a major goal for all mission-critical Oracle applications that must meet demanding business SLAs on VMware Cloud Foundation (VCF).
VMware Virtual Volumes (vVols) has been the primary focus of VMware storage engineering for the last few releases, and vSphere 8.0 is no different. The biggest announcement in vSphere 8.0 core storage is the addition of vVols support for NVMeoF.
The reason for adding vVols support for NVMeoF is that many array vendors, and the industry at large, are moving toward using, or at least adding, NVMeoF support for better performance and throughput. Accordingly, VMware is making sure vVols remains current with the latest storage technologies.
All Oracle on VMware vSphere collateral can be found at the URL below.
Oracle on VMware Collateral – One Stop Shop
https://blogs.vmware.com/apps/2017/01/oracle-vmware-collateral-one-stop-shop.html