vRealize Network Insight

Using Machine Learning to Discover Applications

Applications have always been the focus of IT. They provide the value and are what people use day to day; they are what truly matters. When applications go down, organizations can grind to a halt.

That is why an infrastructure department needs to keep track of the applications running in the environment, including which workloads (VMs, physical servers, Kubernetes instances, public cloud instances, etc.) make up each application, so that they can be monitored. This sounds easy; just document the applications when they are deployed, right? Not always.

When an organization reaches a certain size, applications start to grow organically. Developers deploy their applications to production, mergers and acquisitions bring in new applications, and documentation is sometimes fragmented across multiple locations. How do you keep everything organized?

Application Discovery

While vRealize Network Insight is focused on the networking and security aspects of the application, it does an excellent job of converging multiple sources of application documentation. From these sources, an application boundary is established inside vRealize Network Insight, which can be used for security planning and troubleshooting, and for exporting these application definitions to other systems such as CMDBs.

Application Discovery is based on a combination of the following:

  • Workload naming conventions. When the application name is in the VM (vSphere, AWS, Azure, or VMC on AWS) name, a regular expression can pick out the right part of the name.
  • Workload tags. When there are tags on the VM (vSphere, AWS, Azure, or VMC on AWS) or on the Kubernetes workload, the value of the tag can be used as the application name.
  • CMDB (ServiceNow). If the application documentation is inside ServiceNow, Network Insight can pull the applications from there.
  • Security Tags & Groups. When VMware NSX is protecting the applications, the names of Security Tags and Security Groups can contain the application name. The application name can be picked out from these by a regular expression.
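To make the naming-convention and tag methods above concrete, here is a minimal Python sketch. It is not how vRealize Network Insight implements discovery; the naming convention (`<env>-<application>-<tier><index>`), the `application` tag key, and all function names are assumptions made up for illustration.

```python
import re

# Hypothetical naming convention: <env>-<application>-<tier><index>,
# for example "prod-finance-web01". Adjust the pattern to your own convention.
NAME_PATTERN = re.compile(r"^[a-z]+-(?P<app>[a-z0-9]+)-[a-z]+\d+$")


def app_from_name(vm_name):
    """Return the application part of a VM name, or None if it does not match."""
    match = NAME_PATTERN.match(vm_name.lower())
    return match.group("app") if match else None


def app_from_tags(tags, key="application"):
    """Return the value of the (assumed) application tag, or None if absent."""
    return tags.get(key)


def group_by_app(vms):
    """Group VM names per application, preferring a tag over the name pattern."""
    groups = {}
    for vm in vms:
        app = app_from_tags(vm.get("tags", {})) or app_from_name(vm["name"])
        if app:
            groups.setdefault(app, []).append(vm["name"])
    return groups
```

A VM that has neither a matching name nor the tag is simply left out of the result, which is exactly the gap that the flow-based method is meant to close.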

Discovery is an ongoing process. Each time a new application is deployed, application discovery picks it up.

Flow-Based Application Discovery

All the above application discovery methods have something in common: they need some form of metadata to be in place. With Flow-Based Application Discovery in the recently updated vRealize Network Insight Cloud, this is no longer the case.

Using the network traffic flows from either the vSphere Distributed Switch or VMware NSX, this method uses a combination of Machine Learning techniques called Disconnected Component and Outlier Detection to discover application boundaries automatically. First, the web of network communications is analyzed to identify application boundaries around groups of VMs that talk more among themselves than with VMs outside the group.
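The idea of drawing boundaries around VMs that mostly talk to each other can be sketched in a few lines of Python. This is a simplified illustration and not the product's algorithm: it treats rarely seen flows as outliers using an assumed `min_flows` cutoff, and then collects the connected components of what remains. The function name, the input format, and the cutoff value are all hypothetical.

```python
from collections import defaultdict


def discover_groups(flows, min_flows=5):
    """Group VMs that are linked by frequently seen flows.

    flows: iterable of (src_vm, dst_vm, flow_count) tuples.
    min_flows: assumed cutoff; pairs seen less often are treated as outliers.
    VMs that only appear in outlier flows are not assigned to any group.
    """
    neighbours = defaultdict(set)
    for src, dst, count in flows:
        if count >= min_flows:
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        component, stack = set(), [start]
        while stack:
            vm = stack.pop()
            if vm in component:
                continue
            component.add(vm)
            stack.extend(neighbours[vm] - component)
        seen |= component
        groups.append(sorted(component))
    return groups
```

With flows such as `("fin-web", "fin-app", 120)` and a single stray `("fin-web", "mkt-db", 1)`, the stray flow falls below the cutoff, so the finance and marketing VMs end up in separate groups.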

It’s also possible that the defined application is used by many other applications in the communication web. If that’s the case, the application is labeled as a shared service. Like the AD and DNS services below:

Figure 1 – Flow-Based Application Discovery Process
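One simple way to think about shared services is fan-in: a destination that is contacted by several otherwise separate applications is probably shared. The Python sketch below illustrates that heuristic only. It is an assumption for illustration, not the method the product uses, and the function name, input format, and `min_apps` threshold are hypothetical.

```python
def find_shared_services(flows, groups, min_apps=2):
    """Return destinations contacted by at least min_apps different applications.

    flows: iterable of (src_vm, dst_vm) pairs.
    groups: dict mapping an application name to the set of its member VMs.
    min_apps: assumed threshold for calling a destination "shared".
    """
    vm_to_app = {vm: app for app, members in groups.items() for vm in members}
    callers = {}
    for src, dst in flows:
        src_app = vm_to_app.get(src)
        # Only count flows that leave the source's own application.
        if src_app is not None and vm_to_app.get(dst) != src_app:
            callers.setdefault(dst, set()).add(src_app)
    return sorted(dst for dst, apps in callers.items() if len(apps) >= min_apps)
```

In this sketch, a DNS or AD server used by both Finance and Marketing is flagged, while a backup server used by only one of them is not.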

After determining the application boundaries, the process moves on to determine the tiers within the application. Using the network traffic, vRealize Network Insight Cloud groups VMs that exhibit the same traffic patterns and have the same network ports open into a tier.

In the example above (Figure 1), the Machine Learning algorithm first identifies the Finance and Marketing applications by recognizing that their VMs share interconnected network flows (while also using the learned shared services). After the applications are defined, the algorithm further classifies the VMs with similar network behavior into the same tier. In this case, you can see it has detected a three-tier application with a database, application, and web tier.
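Tier detection can be illustrated with an even smaller sketch. The version below groups VMs purely by their set of open ports, which covers only half of what the article describes (it ignores the "same traffic" part), so treat it as a simplification. The function name and the input format are assumptions for illustration.

```python
from collections import defaultdict


def split_into_tiers(open_ports):
    """Group VMs that expose exactly the same set of listening ports.

    open_ports: dict mapping a VM name to an iterable of its open ports.
    Returns a dict mapping a sorted port tuple to the sorted VM names in that tier.
    """
    tiers = defaultdict(list)
    for vm, ports in open_ports.items():
        # frozenset makes the port set hashable and order-independent.
        tiers[frozenset(ports)].append(vm)
    return {tuple(sorted(ports)): sorted(vms) for ports, vms in tiers.items()}
```

For a typical three-tier application, this yields one group for the web servers (ports 80 and 443), one for the application servers, and one for the database servers.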

Results

All you need to get Flow-Based Application Discovery running is to enable flow collection from vSphere Distributed Switch or VMware NSX. Then select a Scope to run the discovery on (you can select a specific cluster, a vCenter, or any other vSphere object to limit the process). Once discovery is complete, you get a list of recommended applications!

The discovery process runs continuously; the results list gets refreshed every six hours.

Figure 2 – Flow-Based Application Discovery Results

From the results page, the recommended applications can be curated and saved to the system. You can filter applications with detected problems, applications that are not protected by NSX firewall rules, or applications with internet communication, and more.

You might wonder where the application name comes from. The application name is derived from the VM names: the process looks for common text patterns in them. This is one of the reasons why only VMs are currently supported, as the process needs the VM names to do so.
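The article does not describe how the common text patterns are found, so the following Python sketch is only one plausible approach: take the longest common prefix of the VM names in a group and trim trailing separators and digits. The function name is hypothetical.

```python
import os
import re


def suggest_app_name(vm_names):
    """Suggest an application name from the common prefix of the VM names.

    Returns None when the names share no usable prefix.
    """
    # os.path.commonprefix compares character by character, not by path component.
    prefix = os.path.commonprefix([name.lower() for name in vm_names])
    # Trim trailing separators and digits, e.g. "fin-" becomes "fin".
    return re.sub(r"[-_.\d]+$", "", prefix) or None
```

A prefix-only approach is deliberately naive: it misses names where the application part sits in the middle (such as `prod-finance-web01` next to `test-finance-db01`), which is where a more general pattern search would be needed.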

Lastly, take note of the Confidence column in Figure 2. The discovery process labels each recommended application with a high, medium, or low confidence level. This is important, as there might be some irregular, sporadic network flows that hit the app, which could lower the confidence the process has that all the VMs are captured in the recommended application. Essentially, if the confidence level is not high, double-check the application and make sure it’s complete. Any apps with a high confidence level can be relied on to be complete.

Demo

If you’d like to get a quick 4-minute overview of Flow-Based Application Discovery in the product, check out the video below!

Conclusion

Flow-Based Application Discovery is a welcome addition to the Application Discovery feature set and makes it possible to start cataloging your applications without any predefined information. Once the apps have been discovered and saved to vRealize Network Insight Cloud, you can work with them like any other application (for example, export them), and the network and security planning and day-2 operations can start!

Try VMware vRealize Network Insight Cloud free for 30 days or learn more.