Automate and Scale Public Cloud Security with an Effective Risk Mitigation Strategy

Formulate an effective public cloud risk mitigation strategy and scale security using CloudHealth Secure State.

As of today, CloudHealth Secure State secures over 62 million public cloud assets and has detected more than 29 million security risks caused by cloud misconfigurations. To reduce security risk, we provide intelligence on an organization’s risk posture with near real-time detection of resource misconfigurations. But that’s not all: once you start seeing the security violations in your cloud environment, CloudHealth Secure State also helps you prioritize risks and remediate security findings. This capability has become especially important as rapid public cloud adoption leaves organizations with hundreds or thousands of cloud accounts and thousands to millions of open findings. In this context, we want to address how to scale security efficiently with the least time-to-remediate.

Where to begin?

A good starting point is creating a risk prioritization framework aligned to your organization’s needs, for which CloudHealth Secure State provides a rich toolkit.

You can prioritize your findings based on any of these criteria:

Risk score – CloudHealth Secure State provides a contextual risk score between 10 and 99 for each finding. The score combines the qualitative risk (low, medium, or high) with the risk associated with connected resources in your cloud environment that the finding affects.
Projects, applications, or tags – Your risk appetite may vary by project, where each project is a discrete collection of cloud accounts within the organization with separate authorization. Additionally, some applications handle more sensitive information (e.g., financial information or Personally Identifiable Information (PII)). Based on the risk sensitivity of your application, you can add tags to your resources and then prioritize findings based on these cloud tags.
Exposure of the application – If your application is currently in the development stage, you can assign it a lower priority than a production application.
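
To make the three criteria concrete, here is a minimal sketch of how they could be combined into a single ordering. The Finding record, its field names, the "data-sensitivity" tag, and the ordering itself (production first, then sensitive data, then risk score) are assumptions made for illustration; they are not the CloudHealth Secure State data model or its scoring logic.

```python
from dataclasses import dataclass, field


# Hypothetical finding record; field names are illustrative only.
@dataclass
class Finding:
    rule: str
    risk_score: int                            # contextual score, 10-99
    tags: dict = field(default_factory=dict)   # cloud tags on the resource
    environment: str = "dev"                   # "dev" or "prod"


# Assumed tag values that mark an application as handling sensitive data.
SENSITIVE_TAG_VALUES = {"pii", "financial"}


def priority_key(finding: Finding) -> tuple:
    """Sort key: production first, then sensitive data, then higher risk score."""
    is_prod = finding.environment == "prod"
    sensitivity = finding.tags.get("data-sensitivity", "").lower()
    is_sensitive = sensitivity in SENSITIVE_TAG_VALUES
    # False sorts before True, so negate the flags we want at the top.
    return (not is_prod, not is_sensitive, -finding.risk_score)


def prioritize(findings):
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority_key)


findings = [
    Finding("s3-public-read", 72, {"data-sensitivity": "none"}, "prod"),
    Finding("ec2-open-ssh", 95, {}, "dev"),
    Finding("s3-public-read", 60, {"data-sensitivity": "PII"}, "prod"),
]

for f in prioritize(findings):
    print(f.rule, f.environment, f.risk_score)
```

In this ordering a production finding on a PII-tagged resource outranks a dev finding with a higher raw score. Your own framework may weigh the criteria differently.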

Let us look at an example of a cloud risk prioritization framework (a standard prioritization framework adapted from Watson, Clifford, “Risk Assessment Using The Three Dimensions Of Probability (Likelihood) Severity, And Level Of Control”). In this example, rules are classified by likelihood of occurrence and impact on business.

[Image: CloudHealth Secure State risk prioritization framework]

High impact on business and high likelihood of occurrence – Restrict public access to S3 buckets in projects that handle sensitive information such as PII data, or remediate an EC2 instance that has unrestricted protocol access and is exposed to the internet.

Low impact on business and high likelihood of occurrence – Restrict public access to S3 buckets in projects that handle no sensitive information. Even if the bucket contents are compromised, the impact on business is low.

High impact on business and low likelihood of occurrence – Using the IAM root user or its access keys is not recommended for day-to-day administrative tasks. A root user creating an access key has a low probability of occurrence, but if it happens and the key is compromised, the impact on business is high.

Low impact on business and low likelihood of occurrence – Systems Manager managed instances should be in compliant status. Since the organization in this example does not use Systems Manager managed instances, this rule has a low to zero likelihood of occurrence.
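
The four quadrants above can be sketched as a small lookup. The rule identifiers below are made-up labels for the examples in this post, not actual CloudHealth Secure State rule names, and the impact and likelihood ratings are the ones assumed for the example organization.

```python
from enum import Enum


class Level(Enum):
    LOW = "low"
    HIGH = "high"


def quadrant(impact: Level, likelihood: Level) -> str:
    """Name the framework quadrant for a given impact and likelihood."""
    return (
        f"{impact.value.capitalize()} impact on business and "
        f"{likelihood.value} likelihood of occurrence"
    )


# Hypothetical rule identifiers mapped to (impact, likelihood).
RULE_RATINGS = {
    "s3-public-access-sensitive-project": (Level.HIGH, Level.HIGH),
    "s3-public-access-non-sensitive-project": (Level.LOW, Level.HIGH),
    "iam-root-user-access-key": (Level.HIGH, Level.LOW),
    "ssm-managed-instance-compliance": (Level.LOW, Level.LOW),
}

for rule, (impact, likelihood) in RULE_RATINGS.items():
    print(f"{rule}: {quadrant(impact, likelihood)}")
```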

Similarly, you can create your own risk prioritization framework which caters to the risk appetite, policies and compliance requirements specific to your organization.

Once you have a risk prioritization framework in place, the next step is to apply rules and findings to that framework. When you look at organization-wide findings, there are typically two categories:

1.) Findings you would like to ignore

Let’s first take out the noise. There may be different reasons to ignore a finding: the rule may not apply to your organization’s security posture analysis, or the development team may require a current setup that violates an organization-level security policy. These findings belong to the low impact on business and low likelihood of occurrence quadrant.

To create a workflow for ignoring a rule, you have the following options:

Understand the risk and accept it. Disable the rule. If the rule is an exception for your entire organization, you can disable the rule check, and CloudHealth Secure State stops detecting findings against this rule.

Suppress the rule. If the exception applies only to a project or cloud account, then:

You can create a suppression policy in the context of an organization or project.
As a developer, if you would like your organization’s security admins to ignore a finding, you can request a suppression on that specific finding. Your organization’s security admins can then approve the suppression request and accept the risk for the requested duration.
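
The request-and-approve flow with a time-limited risk acceptance can be modeled roughly as follows. This is a sketch under stated assumptions: the SuppressionRequest class and its fields are hypothetical and do not reflect how CloudHealth Secure State stores suppressions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


# Hypothetical model of a developer's suppression request.
@dataclass
class SuppressionRequest:
    finding_id: str
    requested_by: str
    reason: str
    duration: timedelta                      # how long the risk is accepted
    approved_at: Optional[datetime] = None   # set when a security admin approves

    def approve(self, when: datetime) -> None:
        """Record the security admin's approval time."""
        self.approved_at = when

    def is_active(self, now: datetime) -> bool:
        """A suppression applies only after approval and before it expires."""
        if self.approved_at is None:
            return False
        return self.approved_at <= now < self.approved_at + self.duration


request = SuppressionRequest(
    finding_id="finding-123",
    requested_by="dev@example.com",
    reason="Public bucket hosts static marketing assets",
    duration=timedelta(days=30),
)
request.approve(datetime(2024, 1, 1))

print(request.is_active(datetime(2024, 1, 15)))  # inside the 30-day window
print(request.is_active(datetime(2024, 2, 15)))  # after the window has expired
```

Giving every suppression an expiry means an accepted risk comes back up for review instead of being ignored indefinitely.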


2.) Findings you would like to remediate

Once you have eliminated the noise, you can proceed with remediating the findings in the remaining three quadrants of the risk prioritization framework.

Risk quadrant	Alert	Remediate by operations team	Create auto-remediation
High impact on business and high likelihood of occurrence	Critical alert	Within 24 hours	Yes – Set up auto-remediation
Low impact on business and high likelihood of occurrence	Medium alert	Within 10 business days	Optional
High impact on business and low likelihood of occurrence	High alert	Within 2 business days	Recommended
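
As a rough illustration, the table above can be encoded as data and used to compute a remediation deadline for each finding. The Sla class, the function names, and the business-day rule (Monday to Friday, public holidays ignored) are assumptions for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Sla:
    alert: str
    hours: Optional[int] = None           # calendar-hour deadline
    business_days: Optional[int] = None   # business-day deadline
    auto_remediation: str = "Optional"


# Keys are (impact, likelihood); values mirror the example table.
SLAS = {
    ("high", "high"): Sla("Critical", hours=24, auto_remediation="Yes"),
    ("low", "high"): Sla("Medium", business_days=10, auto_remediation="Optional"),
    ("high", "low"): Sla("High", business_days=2, auto_remediation="Recommended"),
}


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole days, counting only Monday to Friday."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days -= 1
    return current


def remediation_deadline(
    detected: datetime, impact: str, likelihood: str
) -> Optional[datetime]:
    """Return the due date, or None for the ignored low/low quadrant."""
    sla = SLAS.get((impact, likelihood))
    if sla is None:
        return None
    if sla.hours is not None:
        return detected + timedelta(hours=sla.hours)
    return add_business_days(detected, sla.business_days)


detected = datetime(2024, 1, 5, 9, 0)  # a Friday morning
for key in SLAS:
    print(key, SLAS[key].alert, remediation_deadline(detected, *key))
```

A finding detected on a Friday with a two-business-day SLA falls due the following Tuesday, because the weekend is skipped.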

This is an example of creating and using a framework to manage risk. As organizations vary in their risk tolerance, they can customize the risk prioritization framework accordingly. And, once there is a prioritization framework in place, they can progress to acting on those findings based on their risk appetite.

To remediate detected findings, you can create a workflow using the tools provided by CloudHealth Secure State and achieve the lowest mean time to remediate (MTTR). Let us look at some of these tools:

Notify the responsible team
Set up an automated alerting system based on your prioritization framework so that the right team is notified to fix outstanding configuration issues. You can set up different channels for different priorities and direct each team to act within the agreed SLA.


Raise a ticket / an issue with a ticketing system of your choice

Integrate with Jira Cloud to create issues automatically, or use a webhook to integrate with any messaging or ticketing system. You can set up CloudHealth Secure State to create a ticket automatically and collaborate with the team responsible for fixing the issue.
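
On the receiving side of a webhook, a small adapter can turn the incoming finding into a ticket for your own system. The sketch below assumes a JSON body with "id", "rule", "resource", "severity", and "account" fields. That shape is hypothetical and is not the documented CloudHealth Secure State webhook payload, so adjust the field names to whatever your integration actually delivers.

```python
import json


def ticket_from_finding(raw_body: str) -> dict:
    """Convert a (hypothetical) webhook JSON body into a generic ticket dict."""
    finding = json.loads(raw_body)
    severity = finding.get("severity", "unknown").lower()
    return {
        "summary": f"[{severity.upper()}] {finding['rule']} on {finding['resource']}",
        "description": (
            f"Account: {finding.get('account', 'n/a')}\n"
            f"Finding ID: {finding['id']}"
        ),
        # Labels let the ticketing system route by priority.
        "labels": ["cloud-security", f"severity-{severity}"],
    }


# Example body with made-up values.
body = json.dumps({
    "id": "finding-123",
    "rule": "S3 bucket allows public read",
    "resource": "example-bucket",
    "severity": "High",
    "account": "111111111111",
})

ticket = ticket_from_finding(body)
print(ticket["summary"])
print(ticket["labels"])
```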

Remediate findings

For findings with high impact on business and high likelihood of occurrence, your goal as a Security Admin is to remediate these findings with the least mean-time-to-remediate. CloudHealth Secure State provides you with the ability to auto-remediate a rule. Anytime a new violation occurs, a finding will be triggered, and CloudHealth Secure State will direct the remediation worker to fix the violation.


Avoid Alert Fatigue

To ensure your Cloud Security team doesn’t get overwhelmed with alerts, it’s important to follow some best practices:

Ensure that the resources flagged by your rules are truly internet exposed. If you have an EC2 instance allowing unrestricted protocol access, confirm that the instance also has a public IP address associated with it and a route to the internet.


CloudHealth Secure State allows you to build custom rules using explore. You can trigger high-priority alerts using this contextual data, rather than investigating every single finding to understand its network exposure.

Start with fewer projects and scale up as you train your security analysts to investigate, triage, and fix violations.
Most importantly, ensure your notification channels are handled by appropriate teams. You can assign tags, personalize messages, and have separate channels for incidents with different priorities. This allows your security team to focus on high priority and critical issues.
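
The internet-exposure check from the first practice above can be sketched as a single predicate: alert at high priority only when all three conditions hold. The instance dictionary shape below is hypothetical. The "-1" protocol value follows the AWS security group convention for "all protocols", and 0.0.0.0/0 and ::/0 are the open IPv4 and IPv6 ranges.

```python
# CIDR ranges that mean "anywhere" for IPv4 and IPv6.
ANYWHERE = {"0.0.0.0/0", "::/0"}


def has_unrestricted_ingress(instance: dict) -> bool:
    """True if any ingress rule allows all protocols from anywhere."""
    return any(
        rule.get("protocol") == "-1" and rule.get("cidr") in ANYWHERE
        for rule in instance.get("ingress_rules", [])
    )


def is_internet_exposed(instance: dict) -> bool:
    """Require a public IP, a route to the internet, and open ingress."""
    return (
        bool(instance.get("public_ip"))
        and bool(instance.get("has_internet_route"))
        and has_unrestricted_ingress(instance)
    )


# Hypothetical instance records for illustration.
exposed = {
    "public_ip": "203.0.113.10",
    "has_internet_route": True,
    "ingress_rules": [{"protocol": "-1", "cidr": "0.0.0.0/0"}],
}
internal_only = {
    "public_ip": None,
    "has_internet_route": False,
    "ingress_rules": [{"protocol": "-1", "cidr": "0.0.0.0/0"}],
}

print(is_internet_exposed(exposed))
print(is_internet_exposed(internal_only))
```

Both instances violate the same unrestricted-access rule, but only the first is reachable from the internet, so only it warrants a critical alert.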

When you begin your cloud security posture management journey, you may initially be overwhelmed by the number of findings across all your accounts. CloudHealth Secure State enables you to implement a risk prioritization strategy that not only mitigates risk but also optimizes and fine-tunes your security and compliance response to match organizational requirements. If you’d like to speak to a CloudHealth Secure State expert and understand how we can help your organization automate and scale public cloud security, request a free demo today.
