
Monitoring RAC using Blue Medora’s vCOPs Management Pack for Oracle EM

Relisted from a Blue Medora blog.

Blue Medora’s 3.4 release of the vC Ops Management Pack for Oracle EM has a new feature designed to make it easier to create dashboards for Oracle RAC clusters.

When looking at a RAC cluster, you might want to aggregate performance metrics about the databases in it.  So far so good: we can write a super metric that does that, because the databases are direct children of the RAC cluster.  But what if we want to look at performance metrics about the VMs that the databases reside on?  Now we have a problem, because the VMs show up as grandparents of the databases, so there’s no direct relationship between the VMs and the RAC cluster.

 

 

Enter Blue Medora’s 3.4 release of the vC Ops Management Pack for Oracle EM.  Whenever the management pack detects that you are monitoring RAC clusters, it creates a container resource for each RAC cluster.  This container contains the RAC cluster resource itself, plus all VMs that the databases reside on, plus all ESXi Hosts that the VMs reside on.  Now, by attaching super metrics to the container resource, we can combine performance data from VMs, RAC clusters, Oracle Databases, and ESXi Hosts.
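As a rough illustration of why the container helps, the following Python sketch models the relationships described above as a plain tree. It is not the management pack's code: the `Resource` class, the `descendants_at_depth` helper, and the VM, host, and instance names are all hypothetical. Only the TESTDB cluster name and the 'OEM Entity Status' kind come from this post.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a monitored resource and its children."""
    name: str
    kind: str
    children: list = field(default_factory=list)


def descendants_at_depth(resource, kind, depth):
    """Return resources of `kind` exactly `depth` levels below `resource`."""
    if depth == 0:
        return [resource] if resource.kind == kind else []
    found = []
    for child in resource.children:
        found.extend(descendants_at_depth(child, kind, depth - 1))
    return found


# Database instances are direct children of the RAC cluster.
db1 = Resource("TESTDB_1", "Database Instance")
db2 = Resource("TESTDB_2", "Database Instance")
rac = Resource("TESTDB", "RAC Cluster", [db1, db2])

# The container holds the RAC cluster plus the related VMs and ESXi hosts.
vm1 = Resource("vm-rac-01", "Virtual Machine")
vm2 = Resource("vm-rac-02", "Virtual Machine")
host = Resource("esxi-01", "ESXi Host")
container = Resource("TESTDB container", "OEM Entity Status", [rac, vm1, vm2, host])

# From the container, VMs are one level down and database instances two.
vm_names = [r.name for r in descendants_at_depth(container, "Virtual Machine", 1)]
db_names = [r.name for r in descendants_at_depth(container, "Database Instance", 2)]
```

In this model, a metric attached to the container can reach the VMs at depth 1 and the database instances at depth 2, which is the kind of reach a super metric attached to the container relies on.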

 

 

Let’s look at an example dashboard, and see how it was created.

On the left is a list of all the RAC container resources (we only have one RAC cluster right now, called TESTDB).  Each container resource includes the name of the RAC cluster in its name.  Selecting a RAC container resource populates the scoreboard on the right, which is composed entirely of super metrics.  The scoreboard in turn drives the distribution analysis widget below it, so that clicking any metric in the scoreboard shows the distribution of its values over the last 24 hours.

To recreate this, we’ll start by creating some super metrics.  Here are a couple of examples:

Highest DB Memory Delta above average (MB):

maxN(OEM – Database Instance: Memory Usage|Total Memory Usage (MB),2)-avgN(OEM – Database Instance: Memory Usage|Total Memory Usage (MB),2)

This takes the maximum Total Memory Usage value across all the database instances that are two levels below the resource the super metric is attached to, and subtracts the average Total Memory Usage across those same instances.  The result is the largest amount by which any one instance exceeds the average memory usage.
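The arithmetic behind this super metric can be sketched in a few lines of Python. This only mirrors the calculation; it does not talk to vC Ops, and the function name and sample values are made up for illustration.

```python
from statistics import mean


def highest_delta_above_average(values):
    """Return max(values) - mean(values), the largest excess over the average."""
    if not values:
        raise ValueError("at least one value is required")
    return max(values) - mean(values)


# Hypothetical Total Memory Usage (MB) for three database instances.
instance_memory_mb = [2048.0, 3072.0, 2560.0]
delta = highest_delta_above_average(instance_memory_mb)
print(delta)  # 512.0
```

With these sample values the average is 2560 MB and the busiest instance uses 3072 MB, so the delta is 512 MB. A value near zero would suggest that memory usage is evenly spread across the instances.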

IOPS remaining:

sum(Virtual Machine: Disk|I/O Usage Capacity)-sum(Virtual Machine: Disk|Write Requests)-sum(Virtual Machine: Disk|Read Requests)

This super metric is something we could not build previously: it gives the total number of IOPS remaining on the VMs that host the RAC database instances.
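To make the calculation concrete, here is a small Python sketch of the same sum-and-subtract logic. The dictionary keys and the numbers are hypothetical placeholders for the three VM disk metrics named in the formula.

```python
def iops_remaining(vms):
    """Total I/O capacity minus total write and read requests across VMs."""
    capacity = sum(vm["io_capacity"] for vm in vms)
    writes = sum(vm["write_requests"] for vm in vms)
    reads = sum(vm["read_requests"] for vm in vms)
    return capacity - writes - reads


# Hypothetical samples for two VMs hosting RAC database instances.
vms = [
    {"io_capacity": 15000, "write_requests": 1200, "read_requests": 3000},
    {"io_capacity": 15000, "write_requests": 800, "read_requests": 2500},
]
remaining = iops_remaining(vms)
print(remaining)  # 22500
```

Because each term is summed across all VMs before subtracting, the result describes the headroom of the VM group as a whole rather than of any single VM.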

Here are the super metrics that we used in the above dashboard, without commentary:

CPU Usage: sum(Virtual Machine: CPU Usage|Usage (%))

Memory Usage (%): avg(Virtual Machine: Memory|Usage (%))

Storage Usage (%): avg(Virtual Machine: Guest File System |Guest File System (%))

Storage Usage (IOPS): sum(Virtual Machine: Disk|Reads per second)+sum(Virtual Machine: Disk|Writes per second)

CPU Remaining (GHz): (sum(Virtual Machine: CPU Usage|Provisioned Capacity (MHz))-sum(Virtual Machine: CPU Usage|Usage (MHz)))/1024

Memory Remaining (GB): (sum(Virtual Machine: Memory|Usage (KB))/(avg(Virtual Machine: Memory|Usage (%))/100)-sum(Virtual Machine: Memory|Usage (KB)))/(1024*1024)

Storage Remaining (GB): sum(Virtual Machine: Guest File System |Guest File System Free (GB))

Storage Remaining (IOPS): sum(Virtual Machine: Disk|I/O Usage Capacity)-sum(Virtual Machine: Disk|Write Requests)-sum(Virtual Machine: Disk|Read Requests)
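The two "remaining" formulas for memory and CPU are less obvious than the others, so here is a hedged Python sketch of their arithmetic with made-up inputs. Memory Remaining estimates total capacity by dividing the memory in use by the average usage fraction, then subtracts the memory in use. That estimate is exact when every VM has the same usage percentage and approximate otherwise. CPU Remaining divides MHz by 1024, as in the formula above; a strictly decimal conversion to GHz would divide by 1000.

```python
def memory_remaining_gb(usage_kb, usage_pct):
    """(sum(KB used) / (avg(% used) / 100) - sum(KB used)) / (1024 * 1024)."""
    used_kb = sum(usage_kb)
    avg_fraction = (sum(usage_pct) / len(usage_pct)) / 100
    estimated_total_kb = used_kb / avg_fraction
    return (estimated_total_kb - used_kb) / (1024 * 1024)


def cpu_remaining_ghz(provisioned_mhz, usage_mhz):
    """(sum(provisioned MHz) - sum(used MHz)) / 1024, as in the formula above."""
    return (sum(provisioned_mhz) - sum(usage_mhz)) / 1024


# Hypothetical samples: two VMs, each using 8 GB (in KB) at 50% usage.
mem_gb = memory_remaining_gb([8388608, 8388608], [50.0, 50.0])

# Hypothetical samples: two VMs with 4096 MHz provisioned each.
cpu_ghz = cpu_remaining_ghz([4096, 4096], [1024, 2048])

print(mem_gb, cpu_ghz)  # 16.0 5.0
```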

 

Once the super metrics are set up, add them to a super metric package, and attach the package to the RAC container resources.

(For more information about creating super metrics, these are good introductions: https://www.batchworks.de/using-super-metrics-to-monitor-cpu-ready-part1/ and https://www.batchworks.de/using-super-metrics-to-monitor-cpu-ready-part2/)

The next step is to create an XML file listing which metrics should display on the scoreboard (and on some other widgets as well).  These files are located in

$ALIVE_BASE/tomcat-enterprise/webapps/vcops-custom/WEB-INF/classes/resources/reskndmetrics/

Here is what the file looks like on our system.  We called it ‘racContainerScoreboard.xml’.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AdapterKinds>
  <AdapterKind adapterKindKey="OEM_ADAPTER">
    <ResourceKind resourceKindKey="OEM Entity Status">
      <Metric attrkey="Super Metric|sm_29" label="CPU Usage" unit="%" yellow="80.0" orange="90.0" red="95.0"/>
      <Metric attrkey="Super Metric|sm_31" label="Memory Usage" unit="%" yellow="80.0" orange="90.0" red="95.0"/>
      <Metric attrkey="Super Metric|sm_32" label="Storage Usage" unit="%" yellow="80.0" orange="90.0" red="95.0"/>
      <Metric attrkey="Super Metric|sm_33" label="Storage Usage" unit="IOPS" yellow="-1.0" orange="-2.0" red="-3.0"/>
      <Metric attrkey="Super Metric|sm_30" label="CPU Remaining" unit="GHz" yellow="-1.0" orange="-2.0" red="-3.0"/>
      <Metric attrkey="Super Metric|sm_36" label="Memory Remaining" unit="GB" yellow="5.0" orange="2.0" red="1.0"/>
      <Metric attrkey="Super Metric|sm_35" label="Storage Remaining" unit="GB" yellow="20.0" orange="10.0" red="5.0"/>
      <Metric attrkey="Super Metric|sm_37" label="Storage Remaining" unit="IOPS" yellow="20000.0" orange="10000.0" red="5000.0"/>
    </ResourceKind>
  </AdapterKind>
</AdapterKinds>

Note the format of the super metric attribute keys.  The super metric attribute key is a sequential id preceded by ‘sm_’.  Each time a new super metric is created, the id increments.  This means that keys are system specific!  For this reason, while we can include super metrics in our Management Packs, we cannot ship any dashboards that contain super metrics.  If you are following along, you will have to change each attrkey in the XML file to reflect the id of the corresponding super metric on your system.  In the Super Metrics… dialog, the ID column will give you the id for each super metric.
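If you have many entries to adjust, the attrkey rewrite can be scripted. The following Python sketch is one possible approach, not a tool that ships with the management pack: it uses the standard library's xml.etree.ElementTree to replace each attrkey based on a hand-built mapping from (label, unit) to the id shown in your own Super Metrics… dialog. The function name, the shortened sample file, and the ids 12 and 17 are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened sample of the scoreboard file, using the ids from our system.
SCOREBOARD_XML = """<AdapterKinds>
  <AdapterKind adapterKindKey="OEM_ADAPTER">
    <ResourceKind resourceKindKey="OEM Entity Status">
      <Metric attrkey="Super Metric|sm_29" label="CPU Usage" unit="%" yellow="80.0" orange="90.0" red="95.0"/>
      <Metric attrkey="Super Metric|sm_36" label="Memory Remaining" unit="GB" yellow="5.0" orange="2.0" red="1.0"/>
    </ResourceKind>
  </AdapterKind>
</AdapterKinds>"""


def remap_attrkeys(xml_text, local_ids):
    """Rewrite attrkey on each Metric whose (label, unit) appears in local_ids."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    for metric in root.iter("Metric"):
        key = (metric.get("label"), metric.get("unit"))
        if key in local_ids:
            metric.set("attrkey", f"Super Metric|sm_{local_ids[key]}")
    return ET.tostring(root, encoding="unicode")


# Hypothetical ids read from the ID column on the target system.
local_ids = {("CPU Usage", "%"): 12, ("Memory Remaining", "GB"): 17}
remapped = remap_attrkeys(SCOREBOARD_XML, local_ids)
print(remapped)
```

The mapping is keyed on both label and unit because some labels, such as Storage Usage, appear twice in the file with different units.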

Finally, create the dashboard.  Here, we used a Resources widget, a Scoreboard widget, and a Data Distribution Analysis widget.  We have an interaction from the Resources widget to the Scoreboard widget, which has an interaction to the Data Distribution Analysis widget.

Set the Resources widget to display resource kinds of ‘OEM Entity Status’ (which is the type of the RAC Containers).

Set the Scoreboard widget’s Res. Interaction Mode to the XML file we created, and turn self provider off.

Finally, configure the Data Distribution Analysis in whatever way makes sense to you.

And you are done.  You can now use this process to invent new super metrics and dashboards that show exactly what you want to monitor in your environment.