

Oracle RAC Performance on vSphere 4.1

Oracle Real Application Clusters (RAC) is used to run critical databases with stringent performance requirements. A series of tests was recently run in the VMware performance lab to determine how an Oracle RAC database performs when running on vSphere. The results showed that the virtualized database performed within 11 to 13 percent of the same configuration running on physical hardware.

Configuration

Two servers were used for both the physical and virtual tests: Dell PowerEdge R710s, each with two Intel Xeon X5680 six-core processors and 96GB of RAM, connected via Fibre Channel to a NetApp FAS6030 array. The servers were dual-booted between Red Hat Enterprise Linux 5.5 and vSphere ESXi 4.1. Each server was connected via three gigabit Ethernet NICs to a shared switch; one NIC was used for the public network and the other two were used for the interconnect and cluster traffic.

The NetApp storage array had a total of 112 10K RPM 274GB Fibre Channel disks. Two 200GB LUNs, backed by a total of 80 disks, were used to create a data volume in Oracle ASM. Each data LUN was backed by a 40-disk RAID-DP aggregate on the storage array. A 100GB log LUN was created on another volume, backed by a 26-disk RAID-DP aggregate. An additional small 2GB LUN was created to be used as the voting disk for the RAC cluster.
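For reference, the two data LUNs are combined into a single ASM disk group. The following is a minimal sketch of how such a disk group can be created; the disk group name and device paths are illustrative assumptions, not the exact values used in the lab.

    # Run as the Grid Infrastructure owner. The disk group name and ASMLib device
    # paths below are illustrative assumptions, not the lab's actual values.
    # External redundancy is used because the RAID-DP aggregates on the array
    # already protect the LUNs.
    $ sqlplus / as sysasm

    SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY
           DISK '/dev/oracleasm/disks/DATA1',
                '/dev/oracleasm/disks/DATA2';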

[Table: server and LUN configuration]

Each VM was configured with 32GB of RAM, three VMXNET3 virtual NICs, and a PVSCSI adapter for all of the LUNs except the OS disk. In order for the VMs to share disks with the physical hosts, it was necessary to map the disks as RDMs and put the virtual SCSI adapter into physical compatibility mode. Additionally, to achieve the best performance for the Oracle RAC interconnect, the VMXNET3 NICs were configured with ethernetX.intrmode = 1 in the vmx file. This option is a workaround for an ESX performance bug that is specific to RHEL 5.5 VMs and to extremely latency-sensitive workloads. The extra configuration option is no longer needed starting with ESX 4.1u1, where the bug is fixed.
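As a rough sketch of what this looks like in practice (the device identifier, datastore path, VM name, and NIC indices below are placeholders, not the exact values from the lab setup), a shared LUN is mapped as a pass-through RDM with vmkfstools, and the interrupt-mode workaround is added to each VM's .vmx file:

    # Placeholder values throughout -- the NAA ID, datastore, and VM name are not the lab's actual ones.
    # Create a physical compatibility (pass-through) RDM pointer for a shared data LUN:
    vmkfstools -z /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx \
               /vmfs/volumes/datastore1/rac-node1/data1-rdm.vmdk

    # Interrupt-mode workaround added to the .vmx file for each of the three VMXNET3 NICs
    # (only needed on ESX 4.1 prior to update 1):
    ethernet0.intrmode = "1"
    ethernet1.intrmode = "1"
    ethernet2.intrmode = "1"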

[Table: VM configuration]

A four-node Oracle RAC cluster was created with two virtual nodes and two physical nodes. RHEL 5.5 x64 and Oracle 11gR2 were installed on all nodes. The two virtual nodes were hosted on a third server, so they remained available when the two test servers were booted into the native RHEL environment. For each test the two test servers were booted either to native RHEL (physical tests) or to ESXi (virtual tests), which meant that only the two physical nodes or only the two virtual nodes were powered on during a given test. The diagrams below show the same test environment when set up for the two-node physical and virtual tests.

Physical Test Diagram:

[Diagram: two-node physical RAC test configuration]

Virtual Test Diagram:

[Diagram: two-node virtual RAC test configuration]

Testing

The servers used in testing have a total of 12 physical cores and 24 logical threads with hyperthreading enabled. The maximum number of vCPUs per VM supported by ESXi 4.1 is eight, so it was necessary to limit the physical server to a smaller number of cores to enable a performance comparison. Using the server BIOS settings, hyperthreading was disabled and the number of cores was limited to two and then four per socket. This produced four-core and eight-core physical server configurations that were compared with VM configurations of four and eight vCPUs. Limiting the physical server configurations was done only to enable a direct performance comparison; it is obviously not how a system would normally be configured for performance.

The open source DVD Store 2.1 benchmark was used as the workload for the tests. DVD Store is an OLTP database workload that simulates customers logging on, browsing, and purchasing DVDs from an online store. It includes database build scripts, load files, and driver programs. For these tests, the database driver was used to load the database directly, with no need to install the Web tier. Using the new DVD Store 2.1 custom-size capability, two 50GB databases, each with a 12GB SGA, were created as two separate instances named DS2 and DS2B. Both instances ran on both nodes of the cluster and were accessed equally on each node.
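For illustration, one driver process is launched per instance per node. A launch along the following lines is a sketch only: the host and service names are made up here, and the exact parameter names and values should be checked against the DVD Store 2.1 driver readme.

    # Sketch only -- host/service names are invented and parameter names/values should
    # be verified against the DVD Store 2.1 driver readme before use.
    # 20 threads against the DS2 instance on node one (the four-CPU test case);
    # matching processes target DS2B on node one and both instances on node two.
    ./ds2oracledriver --target=racnode1/DS2 --n_threads=20 --db_size=50GB \
                      --run_time=30 --warmup_time=2 --think_time=0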

Results

An equal amount of load was run against each instance on each node in both the four-CPU and eight-CPU test cases. The DS2 and DS2B instances spanned all nodes and were actively used on all nodes, with an equal number of threads connected to each instance on each node. The amount of work was scaled up with the number of processors: twice as many DVD Store driver threads were used in the eight-CPU case as in the four-CPU case. For example, in the four-CPU test a total of 40 threads ran against node one, with 20 accessing DS2 and 20 accessing DS2B, while another 40 threads accessed DS2 and DS2B on node two at the same time. CPU utilization of the physical hosts and VMs was above 95% in all tests. Results are reported in terms of Orders Per Minute (OPM) and Average Response Time (RT) in milliseconds.

[Graph: virtual vs. native RAC performance (OPM and response time)]

In both the OPM and RT measurements, virtual RAC performance was within 11 to 13 percent of physical RAC performance. Even in an intensive Oracle RAC test that heavily utilized CPU, disk, and network, virtual performance was close to native. This result removes a barrier to virtualizing one of the more performance-intensive tier-one applications in the datacenter.


9 thoughts on “Oracle RAC Performance on vSphere 4.1”

  1. Jay Weinshenker

    Thanks for posting this – been looking forward to some RAC performance stuff for quite a while.
    Couple of Qs though
    1) Where can I find more info on the ethernetX.intrmode =1 issue? All the links I’m turning up point to this article
    2) The DVD Store 2.1 link goes to your article on how you need to set a goal for your comparison – was that the intent?

    Reply
  2. Todd Muirhead

    A little bit more on why the intrmode setting was used. There was a bug in RHEL that was fixed in 5.5. This exposed a bug in ESX that is fixed in 4.1 update 1. Setting intrmode to 1 puts the vmxnet adapter into a legacy mode which causes it to behave like a very simple NIC. We have tested the performance of vmxnet3 in a wide variety of scenarios and in most cases it performs better without this setting. The RAC interconnect, however, is extremely sensitive to latency, and the slight improvement in latency from this setting also resulted in better RAC performance overall.
    I fixed the link for DVD Store – thanks for spotting it.
    Todd

    Reply
  3. Josh Cole

    Has Oracle added VMware to their approved solutions list? As an MSP/HSP we would love to offer this type of solution to our clients; however, many of them have hesitated due to VMware being excluded in the past.

    Reply
  4. Thomas

    If Oracle really supports 11gR2 RAC on VMware, this will boost Oracle databases in virtualization!
    We need to know exactly what the difference is between support and certification for Oracle products.
    http://www.oracle.com/…?
    Thomas

    Reply
  5. Paulo

    In Oracle, a certified configuration => a supported configuration.
    Actually, no Oracle product running over VMware is certified, so it isn't supported in that strict sense. Allegedly, Oracle will do its best effort to solve the SR, based on the guest OS (GOS).
    The change is that they'll now accept RAC SRs on 11gR2 with the same “best effort” commitment. This is a big change…
    Finally, the recommendation is easy. If you have an issue with Oracle and they stop the SR due to VMware, open an SR with VMware, include the Oracle SR number, and ask to link both SRs through TSANet (http://www.tsanet.org/). TSANet allows Oracle and VMware to talk to each other directly, sparing the customer the “ping-pong” problem between the two vendors. By the way, TSANet works with almost all the big vendors.
    Cheers,
    Paulo.

    Reply
  6. Iwan 'e1' Rahabok

    Thanks for sharing.
    What’s the IOPS achieved? Would be glad to have the IOPS chart, so we can see peak and sustained IOPS too.
    Is more technical detail available internally?
    Thanks!
    e1

    Reply
  7. Sam

    The point of virtualizing RAC would be to run multiple VMs on the same ESX hosts. This test does not include that. Moreover, if you indeed aren’t going to run multiple VMs on the same ESX hosts, why would you run virtual vs. physical? I understand the benefits of virtualizing, but if you’re not running multiple node VMs from the same host, then it defeats the purpose or the value.

    Reply
