
Avoid Costly Surprises with AI/ML Based Anomaly Detection

We are excited to announce the general availability of CloudHealth Anomaly Detection!  

In our prior blog announcing the public beta of this feature, we covered how anomaly detection helps you take control of your cloud costs by surfacing unusual or abnormal spend, analyzing spend patterns, and using historical data aligned with industry trends.

Today we want to take a step back so you can fully understand this feature and what it can help you accomplish. 

What is Anomaly Detection?  

Anomaly Detection was built to give cloud owners better control over sudden changes that result in unwanted spend. Because the cloud is so dynamic, customers face constant change: new services and regions are enabled frequently, multiple stakeholders drive usage, prices change, and services scale up rapidly. Keeping up with spend and usage is becoming more complex and difficult, so users are demanding more information and better control. Real-time spend and usage anomaly detection is no longer a nice-to-have but a must-have for every cloud portfolio. The solution must also cater to the multi-cloud portfolios that organizations run, not just a single cloud.

In simple terms, an anomaly is unusual behavior, and anomaly detection aims to catch that behavior as soon as it occurs, using machine learning algorithms to help you avoid costly surprises in the cloud.

What can CloudHealth Anomaly Detection do? 

Anomaly Detection applies a suite of machine learning algorithms to vast datasets to detect anomalies in near real time. It catches even the smallest changes, identifies outliers, interprets trends, and accounts for periodicity and seasonality, drawing on historical data and industry standards.
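
To make the idea of outlier detection concrete, here is a minimal sketch of one very simple technique: a trailing-window z-score over daily cost. This is an illustration only and not CloudHealth's algorithm, which is not described in this post. The function name flag_anomalies, the 7-day window, and the threshold of 3 standard deviations are all assumptions chosen for the example, and the cost figures are made up.

```python
from statistics import mean, stdev


def flag_anomalies(daily_costs, window=7, threshold=3.0):
    """Return the indices of days whose cost deviates sharply from the
    preceding `window` days (a simple trailing z-score check).

    Hypothetical helper for illustration; not part of any CloudHealth API.
    """
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as a deviation.
            if daily_costs[i] != mu:
                anomalies.append(i)
            continue
        z = (daily_costs[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append(i)
    return anomalies


# Made-up daily spend in dollars, with one obvious spike on day index 7.
costs = [100, 102, 98, 101, 99, 103, 100, 250, 101, 99]
print(flag_anomalies(costs))
```

A check like this ignores seasonality and trend, which is one reason a production system would likely combine several models rather than rely on a single threshold.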

At CloudHealth, we listen to our customers and want them to use our features with ease. That’s why the anomaly detection UI supports the filters that matter most, such as service, region, and account, so a user can focus on specific areas of their cloud. It also provides sorting and searching by dollar value change and percentage cost impact, making it easy to narrow down the results. A user can create policies based on their preferred filters and be notified only of the anomalous spend they want to control. Reports can also be downloaded and tracked over time.
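
The filtering and sorting described above can be pictured with a small sketch. This is not the CloudHealth API or data model; the Anomaly class, its field names, and the narrow_down function are hypothetical, and the sample records are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Anomaly:
    # Hypothetical record shape, assumed for this example only.
    service: str
    region: str
    account: str
    expected_cost: float
    actual_cost: float

    @property
    def dollar_change(self) -> float:
        return self.actual_cost - self.expected_cost

    @property
    def percent_impact(self) -> float:
        if self.expected_cost == 0:
            return float("inf")
        return 100.0 * self.dollar_change / self.expected_cost


def narrow_down(anomalies, service=None, region=None, account=None,
                sort_by="dollar_change"):
    """Keep anomalies matching the given filters, largest impact first."""
    if sort_by not in ("dollar_change", "percent_impact"):
        raise ValueError(f"unsupported sort key: {sort_by}")
    matches = [
        a for a in anomalies
        if (service is None or a.service == service)
        and (region is None or a.region == region)
        and (account is None or a.account == account)
    ]
    return sorted(matches, key=lambda a: getattr(a, sort_by), reverse=True)


anomalies = [
    Anomaly("compute", "us-east-1", "prod", 100.0, 400.0),
    Anomaly("storage", "us-east-1", "prod", 10.0, 60.0),
    Anomaly("compute", "eu-west-1", "dev", 200.0, 260.0),
]

for a in narrow_down(anomalies, region="us-east-1", sort_by="percent_impact"):
    print(a.service, a.dollar_change, a.percent_impact)
```

Sorting by percentage impact and by dollar change can produce different orderings, which is why having both views is useful: a small service can show a large percentage jump while contributing little in absolute dollars.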

Often it’s not enough to detect and identify an anomaly; you need to reach the root cause to actually remediate it. That’s where CloudHealth helps users understand the anomaly further, providing data such as usage impact, resource ID, and usage type in conjunction with FlexReports.

While important anomalous spend gets identified, some false positives are to be expected. We aim to reach 99% accuracy; in the meantime, our anomaly detection system marks identified false alarms, or noise, as archived. This ensures users don’t spend time on the noise while still having the right references available, which keeps the system transparent.

Customer Testimonials:

  • “I think overall the Anomaly Detection feature is a great add to the platform. It does everything I expect it to. I like the ability to filter by cost impact. The graphs are nice when clicking into an anomaly and I like the ability to toggle on ‘show other anomalies’.”
  • “Love the use of FlexReports here, would be a great addition help to do root cause analysis” 
  • “Really like the ease of use and the identified anomalies do give a better understanding of cloud portfolio” 
  • “Love Anomaly Detection, extending this to all customer tenants utilizing policies to notify and provide additional info when anomalies occur” 

Customer Use Case:

A customer had been observing an increase in cloud cost due to unusual spend across their cloud portfolio. A dedicated team constantly tracked and worked on identifying that spend, narrowed it down with the help of third-party and cloud-native tools, and then produced an analysis of where the anomalous spend came from. They then investigated what had caused the unusual spend, but by the time they reached the actual reason, the anomaly was no longer valid or had become inactive. The effort invested was huge, and the result, although substantial, was delayed.

With the Anomaly Detection feature, all anomalous spend was identified in near real time, removing the delay. Details of all impacted services became available, giving a bird’s-eye view; the history of past instances was visible to assist with future planning; and data points such as usage type and resource ID were available for taking remedial action. Anomaly detection not only makes it easy to identify unwanted spend but also reduces the indirect costs of running in the cloud.

This is just a start for CloudHealth Anomaly Detection. At CloudHealth we intend to make this journey easier, even faster and ensure there are no costly surprises while running the cloud.