
An example of how to monitor a host system for disk and memory utilisation using the vRealize Operations Management Pack for SNMP

 

This example looks at a host identified as “VMware Virtual Platform”. The host exposes disk and memory information through HOST-RESOURCES-MIB, and the IP address of the device is 10.160.17.170.

Prerequisites

  1. Enable SNMP and configure SNMP V3 credentials on the device, using the SHA authentication protocol and the AES privacy protocol.
  2. The descriptions for storage and memory are available at hrStorageDescr (1.3.6.1.2.1.25.2.3.1.3). Perform an SNMP walk on this OID and identify the indexes of the RAM and the datastore to monitor.
    hrStorageDescr.1 = STRING: /vmfs/volumes/9e104ab6-ab4856cc-85bd-b33d6976c737
    hrStorageDescr.2 = STRING: /vmfs/volumes/80ea0aba-1f42b539-f992-c2ccd67319e0
    hrStorageDescr.3 = STRING: /vmfs/volumes/5e7a0d65-88ab0d3f-30e6-0200df29bf4
    hrStorageDescr.4 = STRING: /vmfs/volumes/5e7a0d66-f85ec7f4-c3ca-0200df29bf47
    hrStorageDescr.5 = STRING: /vmfs/volumes/5e7b2d73-edffa744-11b6-020001135d08
    hrStorageDescr.6 = STRING: Real Memory
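Finding the right index by eye works for six rows, but the lookup can also be scripted. The following is a minimal sketch, assuming the walk output has been saved as text; the helper names `parse_descr` and `find_index` are hypothetical and are not part of the management pack. Only three of the walk lines are reproduced to keep the sample short.

```python
import re

# Sample lines copied from the hrStorageDescr walk output.
WALK_OUTPUT = """\
hrStorageDescr.1 = STRING: /vmfs/volumes/9e104ab6-ab4856cc-85bd-b33d6976c737
hrStorageDescr.5 = STRING: /vmfs/volumes/5e7b2d73-edffa744-11b6-020001135d08
hrStorageDescr.6 = STRING: Real Memory
"""

# Matches lines with or without a "HOST-RESOURCES-MIB::" prefix.
LINE_RE = re.compile(r"hrStorageDescr\.(\d+) = STRING: (.*)")


def parse_descr(text):
    """Return {index: description} for every hrStorageDescr line in text."""
    table = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            table[int(match.group(1))] = match.group(2).strip()
    return table


def find_index(table, needle):
    """Return the lowest index whose description contains needle, else None."""
    for index, descr in sorted(table.items()):
        if needle in descr:
            return index
    return None


descr = parse_descr(WALK_OUTPUT)
ram_index = find_index(descr, "Real Memory")
disk_index = find_index(descr, "5e7b2d73")  # volume ID of the datastore to monitor
print("RAM index:", ram_index, "disk index:", disk_index)
```

With the sample lines, the script reports index 6 for Real Memory and index 5 for the selected volume.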

Let’s take datastore-01 at index 5 and RAM (Real Memory) at index 6 to monitor storage and memory usage.

  • To get the disk and RAM capacity in bytes, use the formula “Capacity = (hrStorageAllocationUnits * hrStorageSize)”.
  • Using this formula, the Datastore-01 capacity is approximately 449 GB (1048576 * 428544 bytes) and the total memory is approximately 142 GB (1024 * 138411444 bytes), in decimal gigabytes.

hrStorageAllocationUnits (1.3.6.1.2.1.25.2.3.1.4) gives the size of one allocation unit in bytes:
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.5 = INTEGER: 1048576 Bytes
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.6 = INTEGER: 1024 Bytes

hrStorageSize (1.3.6.1.2.1.25.2.3.1.5) gives the storage size in allocation units:
    HOST-RESOURCES-MIB::hrStorageSize.5 = INTEGER: 428544
    HOST-RESOURCES-MIB::hrStorageSize.6 = INTEGER: 138411444

  • To get the disk and RAM usage in bytes, use the formula “Usage = (hrStorageAllocationUnits * hrStorageUsed)”.
  • Using this formula, the Datastore-01 space usage is approximately 93 GB (1048576 * 88993 bytes) and the memory usage is approximately 27 GB (1024 * 26515616 bytes).

hrStorageUsed (1.3.6.1.2.1.25.2.3.1.6) gives the storage in use, in allocation units:
    HOST-RESOURCES-MIB::hrStorageUsed.5 = INTEGER: 88993
    HOST-RESOURCES-MIB::hrStorageUsed.6 = INTEGER: 26515616
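The capacity and usage arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the dictionaries hold the values from the walk output, and the function name `summarize` is a hypothetical helper, not something vROps provides.

```python
# Values copied from the SNMP walk output, keyed by hrStorage index.
ALLOCATION_UNITS = {5: 1048576, 6: 1024}   # hrStorageAllocationUnits (bytes)
STORAGE_SIZE = {5: 428544, 6: 138411444}   # hrStorageSize (allocation units)
STORAGE_USED = {5: 88993, 6: 26515616}     # hrStorageUsed (allocation units)


def summarize(index):
    """Return capacity and usage in bytes, plus the percentage used."""
    unit = ALLOCATION_UNITS[index]
    capacity = unit * STORAGE_SIZE[index]
    used = unit * STORAGE_USED[index]
    return {
        "capacity_bytes": capacity,
        "used_bytes": used,
        "used_percent": round(100 * used / capacity, 2),
    }


disk = summarize(5)
memory = summarize(6)
for name, stats in (("datastore-01", disk), ("Real Memory", memory)):
    print(
        f"{name}: capacity {stats['capacity_bytes'] / 1e9:.1f} GB, "
        f"used {stats['used_bytes'] / 1e9:.1f} GB "
        f"({stats['used_percent']}%)"
    )
```

Because the allocation unit cancels out, the percentage used depends only on hrStorageUsed and hrStorageSize: about 20.77% for the datastore and about 19.16% for memory.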

Let’s proceed by monitoring disk usage: create a symptom and attach it to an alert definition. Disk usage is available at OID 1.3.6.1.2.1.25.2.3.1.6, index 5. Memory usage is monitored the same way, with a symptom attached to an alert definition; it is available at OID 1.3.6.1.2.1.25.2.3.1.6, index 6.

Step 1: Create the SNMP adapter instance

  1. Add the SNMP Adapter in Solutions -> Other Accounts.
  2. Give the adapter a friendly name and a description, and provide the Start IP address and End IP address that vROps should connect to.
  3. Set the SNMP port to 161.
  4. Select HOST-RESOURCES-MIB from the dropdown.
  5. Configure the credentials using SNMP V3.
  6. Select the vROps collector group that will collect data from the SNMP device, and click Validate Connection.
  7. The connection is validated against the Start IP only. In our case the Start and End IPs are the same.
  8. Lastly, click the Add button, and vROps will start collecting data for 10.160.17.170.

The adapter instance is created, and the device is discovered as an object within it, with HOST-RESOURCES-MIB as the base object type.

Step 2: Enable the metrics from vROps policies

Enable all three of the following metrics:

    hrStorageAllocationUnits (1.3.6.1.2.1.25.2.3.1.4) gives the size of one allocation unit
    hrStorageSize (1.3.6.1.2.1.25.2.3.1.5) gives the storage size
    hrStorageUsed (1.3.6.1.2.1.25.2.3.1.6) gives the storage usage

  • Navigate through Administration -> Policies -> Edit Monitoring Policy -> “5. Collect Metrics and Properties”.
  • Select “Object type” as HOST-RESOURCES-MIB and filter for OID 1.3.6.1.2.1.25.2.3.1.
  • Enable the OIDs by setting “Status” to “Local” and save the changes.

Step 3: Validate the metric creation and data polling

  • Validate that the metrics are created and that data is being polled. Wait for a couple of collection cycles for the data to appear. If network latency is minimal, each collection takes approximately 5-10 minutes; if latency is more than 10 seconds, collection takes longer.
  • Navigate through Environment -> All Objects -> Expand SNMP Adapter -> HOST-RESOURCES-MIB and select the device 10.160.17.170.
  • Alternatively, go to Administration -> Inventory -> Adapter Instance -> SNMP Adapter Instance, select the adapter instance created earlier, select the object 10.160.17.170 from the inventory, and click the “show detail” icon.

All three metrics are created for every available index; we are interested only in the data at indexes 5 and 6.

Example 1: Create symptom for Disk usage and generate alert

Let’s create an alert with criticality Info that triggers when the disk usage exceeds 20%.

Navigate through Alerts -> Configuration -> Alert Definition -> ADD

Alert Name: Disk Utilisation is more than 20% of allocated size

Base Object Type: HOST-RESOURCES-MIB

Create a symptom for the component at index 5 of OID 1.3.6.1.2.1.25.2.3.1.6 (hrStorageUsed) and add the symptom to the alert definition.

  • Symptom Name: Storage usage of Disk at index 5 is more than 20%
  • OID: 1.3.6.1.2.1.25.2.3.1.6
  • Index: 5
  • Symptom Condition: Is Greater Than 88000 (raw allocation units, about 20.5% of the hrStorageSize value of 428544 at index 5)
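The symptom condition compares the raw hrStorageUsed value, not a percentage, so the percentage has to be converted into allocation units by hand. A minimal sketch of that conversion follows; the function names `used_threshold` and `percent_of_size` are hypothetical, and whole-number percentages are assumed.

```python
def used_threshold(size_units, percent):
    """Raw hrStorageUsed value at `percent` of hrStorageSize, rounded up.

    Assumes `percent` is a whole number so integer arithmetic stays exact.
    """
    return (size_units * percent + 99) // 100


def percent_of_size(used_units, size_units):
    """Percentage of hrStorageSize that a raw hrStorageUsed value represents."""
    return round(100 * used_units / size_units, 2)


SIZE_INDEX_5 = 428544   # hrStorageSize.5 from the walk output
USED_INDEX_5 = 88993    # hrStorageUsed.5 from the walk output

strict_20_percent = used_threshold(SIZE_INDEX_5, 20)
chosen_percent = percent_of_size(88000, SIZE_INDEX_5)
fires_now = USED_INDEX_5 > 88000

print("20% of capacity in allocation units:", strict_20_percent)
print("88000 units as a percentage of capacity:", chosen_percent)
print("condition currently matched:", fires_now)
```

An exact 20% threshold works out to 85709 units. The value 88000 used in this example sits slightly above it and is still below the current usage of 88993, so the symptom matches.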

Apply the alert to the policy in which the metric is enabled, then create the alert.

Once the symptom condition is matched, the alert is generated after a couple of collection cycles.

Example 2: Create symptom for Memory usage and generate alert

Create an alert with criticality Warning that triggers when the available memory is less than 75% of the total.

Create a new alert by navigating through Alerts -> Configuration -> Alert Definition -> ADD

Alert Name: Memory available is less than 75%

Base Object Type: HOST-RESOURCES-MIB

Create Symptom for component at Index 6 of OID 1.3.6.1.2.1.25.2.3.1.6 (hrStorageUsed)

Symptom Name: Memory availability at index 6 is less than 75% of total

OID: 1.3.6.1.2.1.25.2.3.1.6

Index: 6

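Symptom Condition: Is Greater Than 26000000

The condition is on hrStorageUsed, so a higher value means less memory is available. Note that 26000000 units is about 18.8% of the 138411444-unit total, so this threshold already fires while roughly 81% of memory is still available; a strict 75% threshold corresponds to 34602861 units.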
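Because the symptom is defined on used memory while the alert is worded in terms of available memory, the threshold has to be inverted. The sketch below shows the conversion under the assumption of whole-number percentages; `used_threshold_for_min_available` and `available_percent` are hypothetical helper names.

```python
def used_threshold_for_min_available(size_units, min_available_percent):
    """hrStorageUsed value above which available memory drops below the minimum.

    available < P% of total  is the same as  used > (100 - P)% of total.
    """
    return size_units * (100 - min_available_percent) // 100


def available_percent(used_units, size_units):
    """Percentage of hrStorageSize that is still free for a given usage."""
    return round(100 * (size_units - used_units) / size_units, 1)


SIZE_INDEX_6 = 138411444   # hrStorageSize.6 from the walk output
USED_INDEX_6 = 26515616    # hrStorageUsed.6 from the walk output

strict_threshold = used_threshold_for_min_available(SIZE_INDEX_6, 75)
available_at_chosen = available_percent(26000000, SIZE_INDEX_6)
available_now = available_percent(USED_INDEX_6, SIZE_INDEX_6)

print("used-units threshold for 75% available:", strict_threshold)
print("available % at the 26000000 threshold:", available_at_chosen)
print("available % right now:", available_now)
```

With a strict threshold of 34602861 the alert would not fire on this host yet, since about 80.8% of memory is currently available. The lower value of 26000000 makes the symptom match for demonstration purposes.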

Add the symptom to the alert definition and reduce the wait cycle to one collection.

Apply the alert to the policy in which the metric is enabled, then create the alert.

Once the symptom condition is matched, the alert is generated after a couple of collection cycles.