
About three years ago I posted a blog on the performance of the vRealize Operations REST APIs. That post still gets referenced, but it is time for an update: overall platform performance has improved with each release, and the original post does not address a specific question that I keep getting from customers.

Arguably the most common REST call made by customers is to export metric data.

The most popular use case for the REST APIs, in my opinion (no scientific poll was conducted), is exporting metric data from vRealize Operations for use with other monitoring systems, service desk solutions, home-grown analytics and reporting, and various other purposes. Because this type of use typically involves a high number of frequent requests and a large amount of response data, there are legitimate concerns about the performance of such requests as well as their impact on the vRealize Operations cluster. Additionally, as I mentioned in the opening, a question of scale has started to come up from customers who want to know how many concurrent requests can be handled (and whether those requests are throttled by default).

Performance Tests of the REST APIs

As before, I reached out to our engineering team to understand how they measure performance of the APIs with each release. I also asked the team to provide insight on the question of concurrent clients. Let’s start with overall performance numbers as of the 8.0 release of vRealize Operations.

First, the test bed is an HA cluster with four medium nodes monitoring around 8550 objects (of which 8000 are VMs). Total metric payload is 3 million configured and 1.3 million collecting.

Engineering used GET /api/resources/stats/query with the request payload specifying the virtual machine metric ‘cpu|demandmhz’ for 1000 virtual machines. Since 1000 is the default pageSize value, and pageSize is typically left at its default for most API calls, this provides a nice unit of measure. Here are the results:

 

| Time Period (begin/end window) | Total Response Time in seconds (uncompressed) | Total Response Time in seconds (compressed) |
|---|---|---|
| 5 minutes | 10 | 8 |
| 1 day | 12 | 9.5 |
| 1 month | 100 | 26 |
| 1 year | 226 | 50 |
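
To make the difference between the two columns easier to compare, here is a small sketch that turns the measured times above into a speedup factor and a percentage of time saved. The numbers are copied from the results; the function name is just for illustration.

```python
# Measured response times from the results above: (window, uncompressed s, compressed s)
measurements = [
    ("5 minutes", 10.0, 8.0),
    ("1 day", 12.0, 9.5),
    ("1 month", 100.0, 26.0),
    ("1 year", 226.0, 50.0),
]


def compression_gain(uncompressed_s, compressed_s):
    """Return (speedup factor, percent of response time saved)."""
    speedup = uncompressed_s / compressed_s
    saved_pct = 100.0 * (uncompressed_s - compressed_s) / uncompressed_s
    return speedup, saved_pct


for window, raw_s, gz_s in measurements:
    speedup, saved_pct = compression_gain(raw_s, gz_s)
    print(f"{window:>9}: {speedup:.2f}x faster, {saved_pct:.0f}% less time")
```

The gain is modest for short windows (about 1.25x for 5 minutes) and grows with the amount of data returned (about 4.5x for 1 year).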

 

The times are client-centric and include processing time for vRealize Operations as well as the response data download time.
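
If you want to take a comparable client-side measurement in your own environment, the sketch below shows one way to assemble a request body for 1000 virtual machines and time the round trip. Treat it as a sketch under assumptions: the body field names (resourceId, statKey, begin, end) should be checked against the API documentation for your release, the VM identifiers are placeholders, and fake_send stands in for the real HTTP call so that nothing here touches the network.

```python
import time
from datetime import datetime, timedelta, timezone

DEFAULT_PAGE_SIZE = 1000  # default pageSize mentioned in this post


def build_stats_query(resource_ids, stat_key, window, end=None):
    """Build a stats query body covering `window` up to `end`.

    Field names are assumptions; verify them against the API documentation.
    Timestamps are expressed in epoch milliseconds.
    """
    end = end or datetime.now(timezone.utc)
    end_ms = int(end.timestamp() * 1000)
    begin_ms = end_ms - int(window.total_seconds() * 1000)
    return {
        "resourceId": list(resource_ids),
        "statKey": [stat_key],
        "begin": begin_ms,
        "end": end_ms,
    }


def timed(send, body):
    """Call send(body) and return (response, elapsed seconds) as seen by the client."""
    start = time.perf_counter()
    response = send(body)
    return response, time.perf_counter() - start


def fake_send(body):
    """Placeholder for the real HTTP call; replace with your own client code."""
    return {"requested": len(body["resourceId"])}


vm_ids = [f"vm-{i:04d}" for i in range(DEFAULT_PAGE_SIZE)]  # placeholder IDs
body = build_stats_query(vm_ids, "cpu|demandmhz", timedelta(days=1))
response, elapsed = timed(fake_send, body)
print(response["requested"], body["end"] - body["begin"], f"{elapsed:.6f}s")
```

Timing the whole call from the client, rather than only server processing, matches how the numbers above were taken: it includes the download of the response data.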

Note that a major improvement was added to the APIs: compression of the response payload. As you can see, it has a significant impact on the overall response time, especially for longer time windows. This capability was added in the 7.5 release and you can read more at this link.
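
To give a feel for why compression helps so much with metric exports, here is a rough illustration using only the Python standard library. The response shape is invented for the demo and is not the actual API format, and the Accept-Encoding header is the usual HTTP way to ask for a gzip response; confirm how your release expects compression to be requested.

```python
import gzip
import json

# Synthetic, repetitive metric data: 100 resources, 288 samples each
# (one day at 5-minute intervals). The structure is hypothetical.
samples = [
    {
        "resourceId": f"vm-{i:04d}",
        "statKey": "cpu|demandmhz",
        "timestamps": [1_600_000_000_000 + 300_000 * t for t in range(288)],
        "data": [float((i + t) % 100) for t in range(288)],
    }
    for i in range(100)
]

raw = json.dumps(samples).encode("utf-8")
packed = gzip.compress(raw)

# Headers a client would typically send to request a compressed response
# (assumption: standard HTTP content negotiation).
headers = {"Accept": "application/json", "Accept-Encoding": "gzip"}

ratio = len(raw) / len(packed)
print(f"raw={len(raw)} bytes, gzip={len(packed)} bytes, ratio={ratio:.1f}x")

# Decompressing returns exactly the original bytes.
assert gzip.decompress(packed) == raw
```

Metric payloads repeat the same keys and similar numeric values many times, which is the kind of data gzip handles well, so less time is spent downloading the response.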

What About Concurrency?

Engineering is still validating maximum concurrent REST calls per node and per cluster. I will be sure to update this blog or post another when that information is made public. However, you should know that for now, there is a configured throttle of 300 concurrent requests with an additional throttle of 50 concurrent requests from a single client (defined as an IP address). With that information, you at least know at which point your requests may begin to be throttled by the cluster.

Thanks to Ashot Avagyan and Arshak Galstyan for their help in writing this blog and for providing the technical information.