VMware Advanced Monitoring for Horizon Powered by ControlUp
In a previous blog, I took a look at ControlUp and the ways VMware customers can purchase ControlUp licenses—to monitor Horizon environments—as add-ons to their Horizon licenses, directly from VMware.
With ControlUp, Horizon customers can monitor their EUC environments in real time and proactively track the experiences of their end users. ControlUp’s guided troubleshooting and built-in actions can quickly remediate problems within their environment. A great start.
But I wanted to dig a little deeper.
Yes, ControlUp has a comprehensive dashboard for real-time monitoring, but it also provides a powerful framework that can do more than gather information. It can be used to proactively avoid or reactively remediate issues, not only with the desktops but also with other technologies such as App Volumes, UAG, Blast, and thin clients. Much of ControlUp’s extensibility is made possible through script actions, which can be downloaded from its script repository or, because ControlUp has such a strong focus on its user community, created and uploaded to the script library by its users.
I wanted to check out the variety of Horizon-related scripts that were available, so I brought up my ControlUp console and entered Horizon in the search box; I found at least two dozen of them. The scripts’ functions ranged from fairly mundane tasks, such as logging off a Horizon user, to more involved analyses, like examining the logon duration for any given user.
To get a better feel for the extensibility of ControlUp, I decided to dive deeper into a few of these scripts.
Consider VMware Unified Access Gateway (UAG). UAG is used with VMware Horizon, Workspace ONE Access, and Workspace ONE UEM to provide secure external access to an organization’s applications, so it’s critical that the UAG(s) are operational and running without any issues. ControlUp’s Get Horizon UAG Health script pulls all health information for the UAGs connected to the pods in a Cloud Pod Architecture (CPA) or the local ones if CPA hasn’t been initialized.
Now that we know that UAG is healthy, let’s take a look at App Volumes, one of the major components of the VMware Just in Time (JIT) composable desktop strategy. If App Volumes has an issue, it must be detected and remediated quickly. ControlUp’s Health Check App Volumes End-Point script action takes care of that in just a couple of clicks. This script reports issues that are impacting users, as well as showing disk mounts and durations for App Volumes mounted in user sessions.
App Volumes does a great job of writing issues to event logs, but combing through these logs manually is time-consuming at best, and the chance of missing an important issue is high. ControlUp automates this process and displays a report of user logon stages and how long each stage took.
There is a caveat, however. When running this script, it’s important to know that some of the data written to the logs is written asynchronously and might not appear for several minutes after a user’s logon has completed.
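If you build your own tooling around log-based logon reports, one way to respect that caveat is to wait for a settle period after logon before trusting the results. The sketch below is a minimal Python illustration and not how ControlUp’s script works; the function names and the five-minute default are assumptions chosen for the example, since the only guidance here is "several minutes."

```python
from datetime import datetime, timedelta

# Hypothetical settle window; the caveat above only says "several minutes".
DEFAULT_SETTLE = timedelta(minutes=5)


def report_is_settled(logon_completed, now, settle=DEFAULT_SETTLE):
    """Return True once `settle` time has passed since the logon completed."""
    return now - logon_completed >= settle


def seconds_until_settled(logon_completed, now, settle=DEFAULT_SETTLE):
    """Return how many seconds to wait before re-running the analysis."""
    remaining = (logon_completed + settle) - now
    return max(0.0, remaining.total_seconds())
```

A wrapper could call seconds_until_settled, sleep for that long, and only then run the report, so that late-arriving log entries are not mistaken for missing stages.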
There are any number of issues that can affect the health of App Volumes; ControlUp’s Health Check App Volumes Endpoint checks more than a dozen of these. The output provides a great deal of information, including which version of App Volumes was running, the App Volumes server, the number of App Volumes sessions, and whether a test connection to the App Volumes server was successful. It also flagged a warning that restart recovery for App Volumes was not set.
Once I confirmed that my App Volumes installation was healthy, I wanted to go deeper and make sure that App Volumes was performing well with my users’ desktops.
Logon duration is among the things that cause IT admins to lose sleep. It’s significant, then, that one of ControlUp’s most popular scripts is Analyze Logon Duration, which has recently been updated to include App Volumes information and now shows how long App Volumes take to instantiate in your environment. This information allows you to detect and investigate whether any of your App Volumes are having issues and causing slow login times for users. It can also rule out App Volumes as a cause when investigating slow login times, so you can concentrate on finding the real root cause.
To see the Analyze Logon Duration script in action, I used VMware’s TestDrive. TestDrive is a sandbox environment set up by VMware where our partners and customers can work with products and take them out for a spin. You can talk to your VMware or VMware partner account executive or system engineer to get access to it.
Following the instructions in the ControlUp Advanced Monitoring for Horizon—App Volumes Health Check for End Points TestDrive walkthrough, I logged into TestDrive, logged on to a Horizon client, then launched the ControlUp Console.
I ran the script against my desktop; it showed how long each of the App Volumes took to start. From the output, I could see that Epic 2014 (a popular healthcare application), WinDbg (a Windows debugger), and Fiddler (a web debugger) were being mounted as App Volumes. For each of these applications, I could see the pre-start, logon, postsvc, and shell start times.
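If you export per-volume timings like these, a small helper can point you at the volume that costs the most at logon. This is a minimal sketch under stated assumptions: the phase names come from the output described above, but the data layout (a dict of volume name to per-phase seconds) and the function names are hypothetical and do not reflect the script’s actual output format.

```python
# Phase names as they appear in the output described above.
PHASES = ("pre-start", "logon", "postsvc", "shell start")


def total_seconds(phase_times):
    """Sum the per-phase times (in seconds) for one App Volume."""
    return sum(phase_times.get(phase, 0.0) for phase in PHASES)


def slowest_volume(report):
    """Return (name, total_seconds) for the volume with the largest total.

    `report` maps a volume name to a dict of phase name -> seconds
    (a hypothetical layout). Returns None when the report is empty.
    """
    if not report:
        return None
    name = max(report, key=lambda volume: total_seconds(report[volume]))
    return name, total_seconds(report[name])
```

Missing phases are treated as zero seconds, which keeps the comparison usable when a volume does not report every phase.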
Horizon + ControlUp: It’s a Blast
Next, I wanted to see how ControlUp worked with VMware Blast Extreme. Blast is VMware’s remote display protocol of choice and can have a profound effect on the overall experience that an end user has with their virtual desktop. I found a couple of scripts that looked interesting: Analyze VMware Blast Session, which provides the statistics for a Blast session, and Reduce Session Bandwidth Consumption, which reduces the maximum frames per second (FPS) sent. Limiting how often a screen is refreshed can save bandwidth, but it can also cause unacceptable video jitter and lag in applications.
To test these two scripts, I connected to a local Horizon desktop. I ran a 1720 x 720 video on it to generate a load on Blast, then ran the Analyze VMware Blast Session script. It reported, among other things, that my round-trip time (RTT) was 1 ms and that the session was refreshing at 22 frames per second; the ControlUp Dashboard showed that it was consuming ~6 Mbps. The video on the desktop played smoothly and the applications were responsive.
Once I had a baseline for the FPS and a feel for the responsiveness of the desktop, I used the Reduce Session Bandwidth Consumption script to set the FPS to 5. Doing this cut my bandwidth usage by roughly two-thirds, from ~6 Mbps to ~2 Mbps. The video was extremely jittery and there was a slight lag when working with documents, but in certain cases it is desirable to sacrifice smoothness to conserve bandwidth. Horizon does have a group policy object (GPO) that allows you to change Blast parameters permanently, but for one-off testing and emergency situations, Reduce Session Bandwidth Consumption definitely comes in handy.
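To put the numbers from this test in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes, purely for illustration, that bandwidth scales linearly with FPS; real display protocols do not behave that simply, so treat the estimate as a rough lower bound and not as a description of how Blast works. The function names are my own.

```python
def linear_bandwidth_estimate(baseline_mbps, baseline_fps, target_fps):
    """Estimate bandwidth at target_fps, assuming it scales linearly with FPS."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return baseline_mbps * target_fps / baseline_fps


def fraction_saved(before_mbps, after_mbps):
    """Return the share of bandwidth saved, as a value between 0 and 1."""
    return (before_mbps - after_mbps) / before_mbps


# Figures from the test above: ~6 Mbps at 22 FPS, ~2 Mbps observed at 5 FPS.
estimate = linear_bandwidth_estimate(6.0, 22, 5)
saved = fraction_saved(6.0, 2.0)
print(f"linear estimate: {estimate:.2f} Mbps, observed saving: {saved:.0%}")
# prints "linear estimate: 1.36 Mbps, observed saving: 67%"
```

The observed ~2 Mbps sits above the naive 1.36 Mbps estimate, which is a reminder that capping FPS does not reduce bandwidth in strict proportion, so it is worth measuring the result in your own environment.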
Monitoring Horizon Sessions, End-to-End
ControlUp can monitor a Horizon session from end to end, literally from a thin client to the processes running in a virtual desktop. The platform’s dashboard can monitor Windows 10 IoT thin clients, as well as those from another VMware partner, IGEL.
By right-clicking on an IGEL thin client in the ControlUp console, you can see script actions that are relevant to that device. These allow you to quickly pinpoint and isolate issues that your users may be having with IGEL thin clients or to see if other issues in your environment are impacting the thin clients. For example, if a user is having an issue, you can use the Shadow terminal script to take a look at what they are seeing on the IGEL device. There are scripts to wake it up, see the machine details, update its configuration, and even perform a graceful reboot of an IGEL device directly from the ControlUp Console.
VMware + ControlUp: A Script for Success
VMware Horizon has proven to be extremely reliable and performant over the past decade, but when issues do come up with it, with one of its features, or with the devices users are using to connect to it, they must be resolved quickly. With ControlUp, you can monitor your entire VDI environment from a single dashboard and when problems arise, you can investigate and solve them from that same dashboard. When script actions are used in conjunction with ControlUp’s triggers, you can start on your journey to having a self-healing environment.
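To make the trigger-plus-action idea concrete, here is a minimal, generic Python sketch of the pattern: watch a metric, and when it stays above a threshold for a few consecutive samples, run a remediation action. This is not ControlUp’s trigger implementation or API; the Trigger class, its fields, and the consecutive-sample rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Trigger:
    """Fire `action` after `required` consecutive samples above `threshold`.

    Hypothetical illustration only; not a ControlUp data structure.
    """
    threshold: float
    required: int
    action: Callable[[], str]
    breaches: int = 0

    def observe(self, value: float):
        """Record one sample; return the action's result if the trigger fires."""
        if value > self.threshold:
            self.breaches += 1
        else:
            self.breaches = 0  # a healthy sample resets the streak
        if self.breaches >= self.required:
            self.breaches = 0  # start counting again after remediation
            return self.action()
        return None
```

Requiring several consecutive breaches before acting is a common design choice because it keeps a single noisy sample from kicking off a remediation that nobody needed.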
You can take a look at the full range of ControlUp script actions in their Script Library. If you’d like to learn about creating your own scripts, they have a thorough blog post that walks you through everything you need to get started.