
vRealize Management: Automated Application Monitoring

If you have both vRealize Automation and vRealize Operations, you have probably wanted to automate the installation of the Telegraf agent used by vRealize Operations for application monitoring. Since agent lifecycle management was limited to the user interface, this was not possible … until vRealize Operations 8.1, that is.

With new APIs for lifecycle management of the Telegraf agent used by vRealize Operations Application Monitoring, you can now automate the installation of the agent on new deployments, configure plugins, and perform maintenance and upgrades on the agents. Let’s explore these new APIs and then look at an example I created that uses Event Broker in vRealize Automation to install agents as part of a blueprint deployment.

In this blog I will also leverage the new multi-language feature of vRealize Orchestrator 8.1 by using Python scripts in the workflow actions. It is nice to be able to create workflows in Python, PowerShell, or Node.js depending on my needs, and I was excited to use this new feature for this project.

Also, to keep the blog as short and interesting as possible, I am not providing some of the more basic details on workflow creation in vRealize Orchestrator, application monitoring in vRealize Operations, or using Event Broker in vRealize Automation. I will only cover the details particular to my project.

APIs for Application Monitoring

New APIs for vRealize Operations 8.1 for Application Monitoring provide everything you need for agent lifecycle management. You can find the new APIs documented in the usual place on any vRealize Operations virtual appliance. Just browse to your appliance at https://<applianceIp>/suite-api to view them. The new APIs are found under the /api/applications endpoint.

As you can see, they are comprehensive. Any action you can perform in the UI is provided through these APIs. That includes the administrative tasks of enabling Application Monitoring and adding a vCenter to the application monitoring adapter.

In fact, these APIs also support the new physical OS monitoring feature that is available today in vRealize Operations Cloud (coming to on-premises in a future release). As a side note, this adds the capability to monitor a Windows or Linux OS that is not hosted on VMware, whether it is installed on bare metal or on another hypervisor.

Installing the Agent

Instead of drilling into each and every API, I will walk through the APIs you would need to use for agent installation to help you get started.

As you know, to install the agent we need account credentials for the guest OS. Those requirements are well documented, so be sure to reference them. In addition, you will need to know the resourceId of the VM object in vRealize Operations. The workflow looks like this:

    • Authenticate to vRealize Operations
    • Find the resourceId of the VM object
    • Initiate agent installation using the OS credentials
    • Verify agent installation

The APIs used in my example are available in my Postman collection for vRealize Operations. Specifically, I’m going to use the following APIs:

    • acquireToken (POST /api/auth/token/acquire)
      • We need this to authenticate to vRealize Operations for the following requests
    • getMatchingResources (POST /api/resources/query)
      • This will find the resource by looking for the VM UUID. I prefer this to searching by name as it is a unique identifier and I should only get back one resource. This provides the resourceId I will need in the next request.
    • installAgent (POST /api/applications/agents)
      • Kicks off the install and returns a taskId that we can check if we want to confirm the install was completed.

Now let me show you how this looks in vRealize Orchestrator.

Python Actions for Agent Installation

In vRealize Orchestrator 8.1, we introduced the ability to use languages other than JavaScript for writing workflow scripts and actions (Python, PowerShell, and Node.js are supported). I tend to use mostly Python these days, so this was a welcome addition that makes life easier and more consistent. This feature allows you to import your script as an action in vRealize Orchestrator.

For this example, that means four scripts. Let me cover each of them in turn, beginning with the acquireToken API request.

Every REST call to vRealize Operations requires authentication, which is accomplished using an OAuth 2.0 bearer token. As a side note, vRealize Operations 8.1 and higher no longer support basic authentication.

This action requires five inputs.

  • username: A vRealize Operations account user name.
  • password: Of course!
  • authSource: The authentication source for this user. For example, the “admin” user is a “local” user. For an AD user, this would be the name of the AD authSource you created in the Administration > Access > Authentication Sources screen.
  • vropsHost: The IP or FQDN of your vROps node
  • verify: By default, the SSL certificate will be validated and if it cannot be validated then the REST request will fail. This can be a problem in labs where you may have self-signed certs. In that case, add this input and set it to False.
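As a rough sketch of what such an action might look like, the snippet below takes the five inputs described above and posts them to the acquireToken endpoint. It is not the script from my package: it uses only the Python standard library rather than the requests package, the helper names build_token_request and ssl_context are made up for illustration, and it assumes the response is JSON with a "token" field. The handler(context, inputs) entry point follows the convention vRealize Orchestrator uses for Python actions.

```python
import json
import ssl
import urllib.request


def build_token_request(inputs):
    """Return (url, headers, body) for the acquireToken call."""
    url = "https://" + inputs["vropsHost"] + "/suite-api/api/auth/token/acquire"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    body = {
        "username": inputs["username"],
        "password": inputs["password"],
        "authSource": inputs["authSource"],
    }
    return url, headers, body


def ssl_context(verify):
    """Default certificate checks, or none when verify is False (labs only)."""
    ctx = ssl.create_default_context()
    if not verify:
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def handler(context, inputs):
    """Entry point: returns the token string from the response."""
    url, headers, body = build_token_request(inputs)
    req = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
    ctx = ssl_context(inputs.get("verify", True))
    with urllib.request.urlopen(req, context=ctx) as resp:
        # Assumption: the response body is JSON containing a "token" field.
        return json.loads(resp.read())["token"]
```

Keeping the request construction separate from the network call makes it easy to check the URL and body without a live vRealize Operations node.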

The action returns the bearer token. Next, we need the resourceId of the VM object.

Here, I am using the UUID of the virtual machine from vCenter to find the related resource in vRealize Operations. Inputs required are:

  • vmUUID: Where you get this depends on how you are calling this script. In my example, you will see how I am using the payload from Event Broker, where you can find it in the schema under “externalIds.”
  • token: For authentication, from the first action.
  • verify: As discussed previously.

The output is the resourceId, which you will need in the install action. Note that if the request fails (for example, because no resource was found), the resourceId will be empty; you will see why this matters later in the full workflow. With a resourceId in hand, we have everything needed to kick off the agent install.
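To illustrate the "empty when not found" behavior, here is a small sketch of how the lookup action could turn the response into its output. I am assuming the parsed response carries a "resourceList" array whose entries have an "identifier" field; the function name extract_resource_id is hypothetical and the request itself is left out.

```python
def extract_resource_id(response_body):
    """Return the first matching resource's identifier, or "" if none matched.

    Assumption: response_body is the parsed JSON of the resource lookup,
    shaped like {"resourceList": [{"identifier": "..."}, ...]}.
    """
    resources = response_body.get("resourceList") or []
    if not resources:
        # Nothing found yet, e.g. vRealize Operations has not discovered the VM.
        return ""
    return resources[0].get("identifier", "")
```

Returning an empty string instead of raising lets the workflow decide whether to wait and try again.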

Next up is the agent installation, or “bootstrap” process. This requires guest OS credentials to push and run an initial script from the Application Remote Collector in vRealize Operations.

Inputs required are:

  • resourceId: From the previous action.
  • user: This is the guest OS username for the bootstrap install.
  • passwd: The user’s password, which is stored as a secureString in vRealize Orchestrator. More on this later.
  • token: The vRealize Operations bearer token from the first action.
  • vropsHost: The IP or FQDN of the vRealize Operations node.
  • verify: As discussed previously.

The output is an object with the response code and the response payload. You can use these in the workflow to verify the request was successful and to grab the taskId to poll the status of the installation.
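The sketch below shows one way the action's output object could be assembled from the HTTP response. It deliberately leaves out the installAgent request body, since the exact schema should come from the suite-api documentation on your appliance. The helper names, the "vRealizeOpsToken" authorization scheme, and the "taskId" key in the usage example are my assumptions, not something taken from the packaged script.

```python
import json


def auth_headers(token):
    """Headers for an authenticated call (authorization scheme is an assumption)."""
    return {
        "Authorization": "vRealizeOpsToken " + token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }


def to_action_result(status_code, raw_body):
    """Bundle the response code and parsed payload into one output object."""
    try:
        payload = json.loads(raw_body) if raw_body else {}
    except ValueError:
        # Not JSON: keep the raw text so the workflow can still log it.
        payload = {"raw": raw_body}
    return {"code": status_code, "payload": payload}


def request_succeeded(result):
    """True for any 2xx response code."""
    return 200 <= result["code"] < 300
```

In a workflow, something like result = to_action_result(200, '{"taskId": "42"}') followed by request_succeeded(result) would gate the step that reads the task ID for polling.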

Finally, there is the action that checks the status of the installation. The previous request only initiates the installation. For validation, the workflow continues to monitor the installation status until it succeeds (or fails, in which case an exception is thrown).

Inputs required are:

  • vropsHost
  • token
  • taskId: From the previous call to bootstrap the virtual machine.
  • verify

Now let’s see how this comes together in vRealize Orchestrator.

Workflow Schema

Here’s the workflow schema with the four actions added. To learn how to add the Python scripts into vRealize Orchestrator, check out the documentation as well as this blog.

The workflow including the four actions mentioned in this blog post.

The workflow expects one input from vRealize Automation, which is the inputProperties provided by Event Broker. From there, the workflow will parse the properties for the virtual machine’s vCenter UUID (from the “externalIds” array as discussed earlier).

Getting the UUID of the virtual machine from the inputProperties.
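Expressed in Python, the parsing step might look like the sketch below. It assumes inputProperties arrives as a dictionary, that "externalIds" is a list of strings, and that the first entry is the vCenter UUID of the machine; the function name get_vm_uuid is hypothetical.

```python
def get_vm_uuid(input_properties):
    """Pull the VM's vCenter UUID out of the Event Broker payload.

    Assumption: the UUID is the first entry of the "externalIds" array.
    """
    external_ids = input_properties.get("externalIds") or []
    if not external_ids:
        raise ValueError("inputProperties contained no externalIds")
    return external_ids[0]
```

Failing loudly when the array is missing keeps a malformed event from silently triggering a lookup with an empty UUID.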

There are a couple of branches in the workflow that handle delays in completing or starting tasks. For example, this branch will initiate a 5-minute delay and retry if the virtual machine resource ID is not found in vRealize Operations. This is due to an expected delay between when the virtual machine is provisioned on a vCenter and when vRealize Operations discovers it. Since I don’t want this to iterate forever (for example, the deployment may fail and the virtual machine might be deleted) I have a counter set to check no more than 2 times. So, this loop will only delay the workflow for a maximum of 10 minutes.

Delay branch for the action to look up the vRealize Operations resource ID.
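The same bounded-retry logic can be sketched in plain Python. This is my reading of the branch (one initial lookup, then up to two retries that are each preceded by a 5-minute delay), not an export of the workflow. The lookup callable and the injectable sleep parameter are hypothetical conveniences so the loop can be exercised without actually waiting.

```python
import time


def lookup_with_retry(lookup, max_retries=2, delay_seconds=300, sleep=time.sleep):
    """Call lookup() until it returns a non-empty resource ID.

    Waits delay_seconds before each retry and gives up after max_retries,
    so the total added delay is at most max_retries * delay_seconds.
    """
    resource_id = lookup()
    retries = 0
    while not resource_id and retries < max_retries:
        sleep(delay_seconds)
        retries += 1
        resource_id = lookup()
    if not resource_id:
        raise RuntimeError("resource not found after %d retries" % max_retries)
    return resource_id
```

With the defaults, the worst case is two 300-second waits, which matches the 10-minute ceiling described above.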

The other delay branch simply watches the state of the bootstrap process and checks every 5 minutes until a “FINISHED” status is returned (or an error status, in which case an exception is thrown). Here I have used the Switch element in vRealize Orchestrator to check the status and route the workflow appropriately.

Using the switch element in vRealize Orchestrator to monitor the state of the bootstrap and route the workflow appropriately.
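For readers who think better in code, the routing done by the Switch element is roughly equivalent to the sketch below. Only the "FINISHED" status comes from this post; the error status names and the "IN_PROGRESS" value in the usage note are placeholders I made up, so check the task status values your vRealize Operations version actually returns. The get_status callable stands in for the status-check action.

```python
import time

FINISHED = "FINISHED"               # status named in this post
ERROR_STATES = {"ERROR", "FAILED"}  # hypothetical names for failure statuses


def route(status):
    """Mimic the Switch element: 'done', 'wait', or raise on an error status."""
    if status == FINISHED:
        return "done"
    if status in ERROR_STATES:
        raise RuntimeError("agent bootstrap failed with status " + status)
    return "wait"


def wait_for_bootstrap(get_status, delay_seconds=300, sleep=time.sleep):
    """Poll get_status() every delay_seconds until the bootstrap finishes."""
    while route(get_status()) == "wait":
        sleep(delay_seconds)
```

For example, a get_status that returns "IN_PROGRESS" twice and then "FINISHED" would cause two 5-minute waits before the function returns. Unlike the resource lookup, this loop has no retry ceiling, so you may want to add one in your own environment.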

All of this takes some time because of the built-in delays, but the event subscription is non-blocking, so it does not slow down the deployment in vRealize Automation.

For action inputs such as the vRealize Operations FQDN and login credentials for vRealize Operations and the guest OS, I have created a Configuration Element in vRealize Orchestrator to store these securely. In your environment, you will need to update these values with the specific information you require.

Configuration element holds values particular to your environment for reference by your workflows.

How do you know the OS credentials ahead of time? That is up to you, but in my case, I decided to use an account that already exists on the image. You could create the account during deployment I suppose, but I found this to be the most secure way.

The Event Subscription

Now that the workflow is ready, all that is needed is an Event Broker subscription in vRealize Automation.

The event subscription in vRealize Automation, showing the Event Topic.

The event topic I use will execute the workflow after the virtual machine is created. The schema also provides the UUID of the virtual machine from vCenter so I can look up the related resource in vRealize Operations in the workflow.

Filtering the subscription based on a vRealize Automation blueprint custom property.

I will filter events, because I do not want the agent installed on every machine deployed. In the blueprint, I have added a custom property to indicate that the agent should be installed on a virtual machine.

A vRealize Automation blueprint with a custom property.

vRealize Management Working Together

I hope this inspires you to create your own vRealize Automation Event Broker subscriptions and vRealize Orchestrator workflows to leverage vRealize Operations APIs for Application Monitoring. Other interesting use cases include Day 2 actions for agent management or configuration of agent plugins. Below you can see the result of this project, a deployment with OS monitoring as displayed in vRealize Operations.

vRealize Operations can monitor OS and application services on virtual machines.

If you would like to download and try this workflow, you can grab it here on Sample Exchange. This was built using vRealize Management products at version 8.1 and will not work with lower versions of any of these products.

To learn more about vRealize Management, visit our site today.



3 comments have been added so far

  1. Is it possible to change the action to return the resourceId of the VM on this line from:
    r = requests.request('GET','https://vrops-fielddemo.cmbu.local/suite-api/api/resources',params=payload, headers=headers, verify=inputs["verify"])
    print (r.text)

    To this, to match the other actions, and republish the files? The action has this URL hard-coded when the action is used as a zip import. If I change the action to script type and change the URL, the workflow fails. I made sure to set the verify input value to false since it is a lab environment.

    r = requests.request('GET','https://'+inputs["vropsHost"]+'/suite-api/api/resources',params=payload, headers=headers, verify=inputs["verify"])
    print (r.text)

    See error output:
    2020-07-01 15:53:41.000 -07:00 ERROR InsecureRequestWarning,
    2020-07-01 15:53:42.000 -07:00 INFO __item_stack:/item21
    2020-07-01 15:53:42.000 -07:00 ERROR Cannot find module handler
    2020-07-01 15:53:42.000 -07:00 ERROR Error in (Workflow:Install_Telegraf_Agent / Get Resource ID (item21)#12758) Wrapped ch.dunes.scripting.server.polyglot.PolyglotRunner$PolyglotRunnerException: Function execution returned code: 1
    2020-07-01 15:53:42.000 -07:00 ERROR Workflow execution stack:
    item: ‘Install_Telegraf_Agent/item21’, state: ‘failed’, business state: ‘null’, exception: ‘Wrapped ch.dunes.scripting.server.polyglot.PolyglotRunner$PolyglotRunnerException: Function execution returned code: 1 (Workflow:Install_Telegraf_Agent / Get Resource ID (item21)
