
Announcing VMware vSAN Max Storage for Business-Critical Oracle Workloads

Storage is one of the most important aspects of any IO-intensive workload. Oracle workloads typically fit this bill, and we all know how Tier 2 storage often leads to database performance issues, regardless of the architecture on which the database is hosted.

Enabling, sustaining, and ensuring the highest possible performance, along with continued application availability, is a major goal for all mission-critical Oracle applications in order to meet demanding business SLAs. This can be readily achieved with the vSAN Max Storage offering for business-critical workloads.

Key points to take away from this blog

This blog showcases the advantages of using vSAN Max Storage for business-critical production Oracle workloads. vSAN Max addresses the regular use cases served by traditional storage arrays while providing the incremental scalability, performance, and management capabilities associated with HCI – simplicity with performance, scalability, and centralized management.

Disclaimer – This blog is not meant to be a performance benchmarking blog in any way. The test results published below are NOT official VMware Performance Engineering test results and in no way reflect real-world workload metrics.

Remember, any performance data is the result of the combination of hardware configuration, software configuration, test methodology, test tool, and workload profile used in the testing below. The performance improvement I got with the synthetic workload generator in my lab is in NO way representative of any real production customer workload; real production customer workloads may well see even better improvements.

vSAN Max Storage

vSAN Max is a distributed scale-out storage system for vSphere clusters.  It is powered by the vSAN ESA, so it offers the capabilities that are a part of the ESA but serves as a storage-only cluster.  It uses vSAN’s native protocol and data path for cross-cluster communication, which preserves the management experience and provides the highest levels of performance and flexibility for a distributed storage system.

vSAN Max is an ideal shared storage solution for any environment running vSphere clusters.  It can help address all of the common use cases that you may be serving using storage arrays, yet it provides the incremental scalability, performance, and management capabilities that are associated with HCI.

The vSAN Express Storage Architecture introduced an all-new way to process and store data. vSAN Max uses the ESA to provide a fully distributed, elastic, shared storage solution for all your vSphere clusters. vSAN HCI clusters and vSAN Max disaggregated storage clusters are an incredibly powerful combination designed to meet all your needs in the data center and beyond.

For each cluster serving your VMware Cloud Foundation environment, vSAN can be provisioned in one of two deployment options:

  • Aggregated clusters, known as “vSAN HCI,” or
  • Disaggregated storage using vSAN Max, providing storage for vSphere clusters.

A “vSAN HCI” deployment aggregates compute and storage resources into the same hosts that comprise the vSAN cluster, while a “vSAN Max” deployment disaggregates storage resources from compute resources by providing a dedicated storage cluster that serves as centralized shared storage for your vSphere clusters. Both are built using vSAN ESA.

A new document, “vSAN HCI or vSAN Max – Which Deployment Option is Right for You?” details some of the technical considerations that may help you determine which deployment option may be best for your environment.

More information on vSAN Max Storage can be found in the VMware vSAN Max documentation.

Test Bed

The test bed is an 8-node vSAN Max Storage cluster with the setup shown below –

  • vCenter version 8.0.2, build 22385739
  • 8 Lenovo ThinkSystem SR665 vSAN Max Ready Node servers: 2 sockets, 28 cores per socket, AMD EPYC 7453 28-Core Processor @ 2.75GHz, 1TB RAM [2 physical NUMA nodes, each with 28 cores and 512GB]
  • VMware ESXi 8.0.2, build 22380479, with all-NVMe storage
  • Oracle 21.13 with Grid Infrastructure, ASM storage, and Linux udev (8K database block size)
  • OEL UEK 8.9

Details of the vSAN Max Storage cluster ‘env173’ with vSAN Max Storage datastore ‘vsanDatastore_max_amd’ are shown below. The cluster has 8 Lenovo ThinkSystem SR665 vSAN Max Ready Node servers.

Each Lenovo ThinkSystem SR665 vSAN Max Ready Node server has 2 sockets, 28 cores per socket, AMD EPYC 7453 28-Core Processor @2.75GHz, 1TB RAM as shown below.

Each Lenovo ThinkSystem SR665 vSAN Max Ready Node server has 6 internal NVMe drives; every server in the vSAN Max cluster contributes its 6 internal NVMe drives to the vSAN Max datastore capacity as shown below.
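
Putting the node specifications together, the cluster's aggregate resources can be tallied with simple arithmetic (a quick sketch of the numbers stated above):

```python
# Aggregate resources of the 8-node vSAN Max cluster described above.
nodes = 8
cores_per_node = 2 * 28        # 2 sockets x 28 cores per socket
total_cores = nodes * cores_per_node
total_ram_tb = nodes * 1       # 1TB RAM per node
total_nvme_drives = nodes * 6  # 6 internal NVMe drives per node

print(total_cores, total_ram_tb, total_nvme_drives)  # 448 8 48
```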

Lenovo ThinkSystem SR665 vSAN Max Ready Node server ‘env173-node1.pse.lab’ HBA and internal NVMe drive details are shown below.

Lenovo ThinkSystem SR665 vSAN Max Ready Node server ‘env173-node1.pse.lab’ networking details are shown below. Every node in this vSAN Max Storage cluster has 2 x 25Gb NICs (management, vMotion, VM network, etc.) and 1 x 100Gb NIC for internal vSAN traffic.

Lenovo ThinkSystem SR665 vSAN Max Ready Node server ‘env173-node1.pse.lab’ VMkernel adapter details are shown below.

The vSAN Max Storage cluster is mounted over a 100Gb network by a vSAN client compute cluster as shown below.

Details of the vSAN client compute cluster are shown below. The client compute cluster has 8 Lenovo ThinkAgile VX7531 servers.

Each Lenovo ThinkAgile VX7531 server has 2 sockets, 28 cores per socket, Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz, 1TB RAM as shown below.

vSAN Express Storage Architecture (ESA) cluster storage policies are shown below.

The baseline Oracle policy ‘Oracle ESA – FTT0 – NoRAID’ was created solely to capture baseline metrics for comparison purposes; no true production workload is ever set up with FTT=0 (Failures to Tolerate) and no RAID (no protection).

  • Oracle ESA – FTT0 – NoRAID
  • Oracle ESA – FTT1 – R5
  • Oracle ESA – FTT2 – R6
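
The capacity cost of each policy can be sketched with simple erasure-coding arithmetic. This is a minimal sketch under assumptions about vSAN ESA's default stripe widths: RAID-5 as a 4+1 stripe (used on clusters of six or more hosts, as with this 8-node cluster) and RAID-6 as a 4+2 stripe; actual layouts depend on cluster size and vSAN version.

```python
# Raw-to-usable capacity multipliers for the three storage policies.
# Stripe widths below are assumptions based on vSAN ESA defaults:
# RAID-5 as 4 data + 1 parity on clusters of 6+ hosts, RAID-6 as 4 + 2.

def raw_per_usable(data_stripes: int, parity_stripes: int) -> float:
    """Raw capacity consumed per unit of usable capacity."""
    return (data_stripes + parity_stripes) / data_stripes

policies = {
    "Oracle ESA - FTT0 - NoRAID": raw_per_usable(1, 0),  # no protection
    "Oracle ESA - FTT1 - R5":     raw_per_usable(4, 1),
    "Oracle ESA - FTT2 - R6":     raw_per_usable(4, 2),
}

for name, multiplier in policies.items():
    print(f"{name}: {multiplier:.2f}x raw per usable GB")
```

Under these assumptions, FTT=2 protection via RAID-6 costs only 1.5x raw capacity, far less than the 3x a RAID-1 FTT=2 mirror would consume.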

Oracle VM ‘Oracle21C-OL8-DB_ESA’ details are shown below.

The VM has 28 vCPUs and 256 GB RAM. The single-instance database ‘ORA21C’ was created with the multi-tenant option and provisioned with Oracle Grid Infrastructure (ASM) and database version 21.13 on OEL UEK 8.9.

Oracle ASM was the storage platform, with Linux udev for device persistence. Oracle SGA and PGA were set to 64G and 10G respectively.
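
The actual udev rules are not shown in this post; for illustration, a device-persistence rule for an ASM disk on a vNVMe device might look like the fragment below. The serial number, symlink name, and grid:asmadmin ownership are placeholders, not the lab's actual configuration.

```
# /etc/udev/rules.d/99-oracle-asm.rules -- illustrative only; the serial
# number, symlink name, and ownership below are placeholders.
KERNEL=="nvme*n1", ENV{DEVTYPE}=="disk", ATTRS{serial}=="EXAMPLE_SERIAL_0001", \
  SYMLINK+="oracleasm/data01", OWNER="grid", GROUP="asmadmin", MODE="0660"
```

A rule like this gives each ASM disk a stable `/dev/oracleasm/*` name with the ownership and permissions the Grid Infrastructure stack expects, regardless of the kernel's device enumeration order at boot.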

The database was deliberately set to NOARCHIVELOG mode, as the intention was to drive the maximum amount of IO without the overhead of archived redo log generation.

All Oracle on VMware platform best practices were followed.

Oracle VM ‘Oracle21C-OL8-DB_ESA’ has 4 vNVMe Controllers for added performance.

58 vmdk’s are assigned to the Oracle VM ‘Oracle21C-OL8-DB_ESA’, as shown below –

  • NVME 0:0 – 80G for OS (/)
  • NVME 0:1 – 80G for Oracle Grid Infrastructure & RDBMS binaries
  • NVME 0:2 – 100G for GRID_DG
  • NVME 0:3 – 200G for DATA_DG
  • NVME 0:4 – 200G for DATA_DG
  • NVME 0:5 – NVME 0:12 – 8 vmdk’s, each 25G – REDO_DG
  • NVME 1:0 – 1:14 – 15 vmdk’s, each 50GB – SLOB_DG
  • NVME 2:0 – 2:14 – 15 vmdk’s, each 50GB – SLOB_DG
  • NVME 3:0 – 3:14 – 15 vmdk’s, each 50GB – SLOB_DG
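
As a quick sanity check, the per-range counts in the list above add up to the 58 vmdk's assigned to the VM:

```python
# Sanity check of the VM's vNVMe disk layout described above: the counts
# should sum to 58 devices across the 4 vNVMe controllers.

layout = [
    # (count, size_gb, purpose)
    (1,  80, "OS"),
    (1,  80, "Grid Infrastructure & RDBMS binaries"),
    (1, 100, "GRID_DG"),
    (2, 200, "DATA_DG"),
    (8,  25, "REDO_DG"),
    (45, 50, "SLOB_DG"),  # NVME 1:0-1:14, 2:0-2:14, 3:0-3:14
]

total_vmdks = sum(count for count, size_gb, use in layout)
slob_gb = sum(count * size_gb for count, size_gb, use in layout if use == "SLOB_DG")
print(total_vmdks, slob_gb)  # 58 2250
```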

Oracle VM ‘Oracle21C-OL8-DB_ESA’ vmdk storage policy details are as below:

  • Test 1 – all vmdk’s with storage policy ‘Oracle ESA – FTT0 – NoRAID’
  • Test 2 – all vmdk’s with storage policy ‘Oracle ESA – FTT1 – R5’
  • Test 3 – all vmdk’s with storage policy ‘Oracle ESA – FTT2 – R6’

Oracle ASM Disk Group details are as below:

grid@oracle21c-ol8-nvme:+ASM:/home/grid> asmcmd lsdg
State    Type    Rebal  Sector  Logical_Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512             512   4096  4194304    409600   117916                0          117916              0             N  DATA_DG/
MOUNTED  EXTERN  N         512             512   4096  4194304    102400   102296                0          102296              0             N  GRID_DG/
MOUNTED  EXTERN  N         512             512   4096  1048576    204800   140614                0          140614              0             N  REDO_DG/
MOUNTED  EXTERN  N         512             512   4096  1048576   2304000   124998                0          124998              0             N  SLOB_DG/
grid@oracle21c-ol8-nvme:+ASM:/home/grid>

ASM disk group ‘SLOB_DG’ has 45 vmdk’s, each 50GB in size, for a total of 2250GB (AU=1MB).
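
The Total_MB figures in the `asmcmd lsdg` output above can be converted to GB to confirm the disk group sizes. A small sketch, with the output rows pasted in as a string (whitespace collapsed, columns as printed by asmcmd above):

```python
# Convert the Total_MB column of the asmcmd lsdg output into GB per
# disk group. Field 7 is Total_MB; the last field is the group name.

lsdg = """\
MOUNTED EXTERN N 512 512 4096 4194304 409600 117916 0 117916 0 N DATA_DG/
MOUNTED EXTERN N 512 512 4096 4194304 102400 102296 0 102296 0 N GRID_DG/
MOUNTED EXTERN N 512 512 4096 1048576 204800 140614 0 140614 0 N REDO_DG/
MOUNTED EXTERN N 512 512 4096 1048576 2304000 124998 0 124998 0 N SLOB_DG/
"""

capacity_gb = {}
for line in lsdg.splitlines():
    fields = line.split()
    name, total_mb = fields[-1].rstrip("/"), int(fields[7])
    capacity_gb[name] = total_mb / 1024

print(capacity_gb)  # DATA_DG 400.0, GRID_DG 100.0, REDO_DG 200.0, SLOB_DG 2250.0
```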

Test Use Case

This exercise showcases the advantages of using vSAN Max Storage for business-critical production Oracle workloads, combining the regular use cases served by traditional storage arrays with the scalability, performance, and management capabilities of HCI.

The SLOB workload generator was run against a 2 TB SLOB tablespace (85 datafiles, each 25GB, total size 2125GB) with 3 different storage policies:

  • Oracle ESA – FTT0 – NoRAID
  • Oracle ESA – FTT1 – R5
  • Oracle ESA – FTT2 – R6

SLOB 2.5.4.0 was chosen as the load generator for this exercise, with the following SLOB parameters set:

  • UPDATE_PCT=30
  • RUN_TIME=300
  • SCALE=25G
  • WORK_UNIT=1024

Multiple SLOB schemas (80 schemas, 25G each, total size 2000GB) were used to simulate a multi-schema workload model, and the number of threads per schema was set to 20.

The work unit size was set to 1024 to drive the maximum amount of IO without stressing REDO, in order to study the differences in performance metrics between the 3 storage policies on vSAN ESA.
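
The aggregate workload these settings produce is easy to tally (a sketch of the numbers stated above):

```python
# Aggregate size and concurrency of the SLOB workload described above:
# 80 schemas at SCALE=25G each, driven with 20 threads per schema.

schemas = 80
scale_gb = 25
threads_per_schema = 20

working_set_gb = schemas * scale_gb              # active data set in GB
concurrent_threads = schemas * threads_per_schema  # total SLOB sessions

print(working_set_gb, concurrent_threads)  # 2000 1600
```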

We ran multiple SLOB runs for the above use case and compared the Oracle metrics as shown in the table below.

As a reminder, the baseline Oracle policy ‘Oracle ESA – FTT0 – NoRAID’ was created solely to capture baseline metrics for comparison purposes; no true production workload is ever set up with FTT=0 (Failures to Tolerate) and no RAID (no protection).

We can see that database metrics with the vSAN Max Storage cluster’s new high-performance RAID 5/6 erasure coding are comparable to the baseline.

As mentioned earlier, customers can expect RAID-5/6 to perform on par with RAID-1 with vSAN ESA. The new architecture delivers optimal levels of resilience that are also space-efficient while offering the highest levels of performance. All of this can be accomplished using RAID-6 with the assistance of the new log-structured filesystem and object format.

Diving deeper into the core database load profiles, we see similar database metrics for executes (SQL), transactions, read/write IOPS, and read/write MB/sec across the storage policy runs:

  • Executes (SQL) / second & Transactions / second are comparable across all 3 runs
  • Read IO requests / second & Write IO requests / second are comparable across all 3 runs
  • Read IO (MB) / second and Write IO (MB) / second are comparable across all 3 runs

In conclusion, we were able to get comparable performance metrics, from an overall database perspective, for both the ‘Oracle ESA – FTT1 – R5’ and ‘Oracle ESA – FTT2 – R6’ storage policy runs on vSAN Max Storage.

Summary

This blog showcased the advantages of using vSAN Max Storage for business-critical production Oracle workloads: it addresses the regular use cases served by traditional storage arrays while adding the incremental scalability, performance, and management capabilities associated with HCI – simplicity with performance, scalability, and centralized management.

As a reminder, this is not a performance benchmarking blog. The test results above are NOT official VMware Performance Engineering results; any performance data is the product of the specific hardware, software, test methodology, test tool, and workload profile used here, and is not representative of real production customer workloads.

The baseline policy ‘Oracle ESA – FTT0 – NoRAID’ was used purely to capture comparison metrics; no true production workload is ever set up with FTT=0 and no RAID. Diving into the core database load profiles, executes (SQL) per second, transactions per second, read/write IO requests per second, and read/write IO (MB) per second were all comparable across the three runs, showing that vSAN Max’s new high-performance RAID 5/6 erasure coding delivers database metrics comparable to the baseline, in line with the expectation that RAID-5/6 performs on par with RAID-1 on vSAN ESA.

In conclusion, we were able to get comparable performance metrics, from an overall database perspective, for both the ‘Oracle ESA – FTT1 – R5’ and ‘Oracle ESA – FTT2 – R6’ storage policy runs on vSAN Max Storage.

Acknowledgements

This blog was authored by Sudhir Balasubramanian, Senior Staff Solution Architect & Global Oracle Lead – VMware.

Thanks to Pete Koehler, vSAN Storage TMM for his feedback and review.

Conclusion

Storage is one of the most important aspects of any IO-intensive workload. Oracle workloads typically fit this bill, and we all know how Tier 2 storage often leads to database performance issues, regardless of the architecture on which the database is hosted.

Enabling, sustaining, and ensuring the highest possible performance, along with continued application availability, is a major goal for all mission-critical Oracle applications in order to meet demanding business SLAs. This can be readily achieved with the vSAN Max Storage offering for business-critical workloads.

All Oracle on VMware vSphere collateral can be found at the URL below.

Oracle on VMware Collateral – One Stop Shop
https://blogs.vmware.com/apps/2017/01/oracle-vmware-collateral-one-stop-shop.html