Cloud Security Migration

A cheat sheet to permissions management models in AWS, Azure, & GCP

Welcome to the second part of our blog series exploring the differences between the IAM and security models of the “Big Three” public cloud providers. The first blog focused on the types of principals that exist in each provider, together with a quick overview of how these principals are organized and how they authenticate: in short, the “Who” part of the IAM equation. In this blog, we’ll look at the “How” part of things: what permissions exist in each provider, how these permissions are structured, and how they get assigned to principals. In the third installment of the series, we’ll cover the “What”: resources, their structure and organization, as well as some other related topics.

AWS Permissions

Permissions (also called “Actions” in the context of AWS) are generally related to the available API operations. For example, to create new EC2 instances, one needs to call the RunInstances API (or run `aws ec2 run-instances` if you prefer the AWS CLI), which requires the eponymous `ec2:RunInstances` permission. Permissions always follow this formula: the name of the AWS service in lower case and the API action in camel case (or more precisely InitCaps), separated by a colon. There is also a rudimentary classification of these permissions: most services contain permissions in the List, Read, Write, Tagging, and Permissions management categories, but there isn’t any obvious mapping from permissions to categories. For instance, `ec2:DescribeAccountAttributes` is in the List category, while `ec2:DescribeElasticGpus` is in the Read category. Permissions whose actions have side effects are almost always found in the Write category, though.
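To make the naming formula concrete, here is a minimal Python sketch that splits a permission string into its service and action parts. The regular expression and the `parse_action` helper are our own illustrative assumptions, not an official AWS grammar, and the sketch only handles concrete permission names (no wildcards).

```python
import re
from typing import Tuple

# Assumed shape for this sketch: lowercase service prefix, a colon,
# then an InitCaps action name. This is not an official AWS grammar.
ACTION_RE = re.compile(r"^([a-z0-9-]+):([A-Z][A-Za-z0-9]*)$")


def parse_action(permission: str) -> Tuple[str, str]:
    """Split a permission such as 'ec2:RunInstances' into (service, action)."""
    match = ACTION_RE.match(permission)
    if match is None:
        raise ValueError(f"not a service:Action string: {permission!r}")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(parse_action("ec2:RunInstances"))
    print(parse_action("sts:AssumeRole"))
```

A string that uses any other separator, such as `ec2;RunInstances`, is rejected with a `ValueError`.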

Let’s look at one of the most important AWS permissions (and a source of continued headaches for security practitioners, as it is about as powerful and as tricky to use as an always-on lightsaber): `sts:AssumeRole`. In the last blog we briefly mentioned that AWS roles are unlike any other beast: both Azure and GCP have roles, but they function like permission sets (more on this in the Azure and GCP sections below). AWS roles, however, are principals in their own right (even though they lack API keys or passwords), and assuming these roles is an active API operation that has side effects, requires a Role Session Name, and is subject to auditing. AssumeRole is the API call that allows the caller to receive the permissions of the role. This permission is so dangerous because allowing the wrong user (or AWS resource) to assume a role that was never intended for them can quickly spiral into a complete account takeover.

Things quickly get even hazier, once we get to the most “magical” AWS permission – `iam:PassRole`. This permission is the secret glue that largely holds the AWS platform together – you’ll see it mentioned in the documentation pages of multiple other API calls, and yet it doesn’t have a corresponding API call in the documentation for IAM permissions. The reason for this seemingly weird inconsistency is that the PassRole permission is interwoven in the platform – it provides the ability to allow an AWS resource to assume a customer-created role and to interact with your other AWS resources.

A simple example would entail launching an EC2 instance and attaching an IAM role (in the form of an Instance Profile) to this instance: In order to simply launch the instance, one needs only the `ec2:RunInstances` permission, but in order to attach an IAM role through an Instance Profile to this instance, the `iam:PassRole` permission over this IAM role is necessary.

IAM Policy

The next important piece of the puzzle is the IAM Policy. In AWS there are multiple types of policies, but we’re only interested in IAM policies, which are of two types: Identity-based policies and Resource-based policies. The native format of AWS policies is JSON, but the AWS Console provides a Visual Editor for those policies, which allows easier construction and editing. A policy can contain multiple statements, with each statement specifying one or more permissions (in the “Action” field) and whether these permissions are to be granted or denied (in the “Effect” field, whose values can be “Allow” and “Deny”). Statements can optionally contain the “Condition” field, which provides a rudimentary mechanism for evaluating various contextual factors, such as the current time, more complex resource filters, and others. Unfortunately, conditions are too big a topic to cover here.

Finally, the policy statement needs to specify to what principal (the “Principal” field) or to what resource (the “Resource” field) it applies. Each of those fields except “Effect” can contain wildcards in the form of the asterisk character, which matches any string (and, less well known, the ‘?’ symbol, which matches any single character).
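The two wildcard characters can be illustrated with a small Python sketch that translates a policy-style pattern into a regular expression. The `matches` helper is hypothetical and only models the `*` and `?` semantics described above; treating the comparison as case-sensitive is a simplifying assumption of this sketch, not a statement about how AWS compares each field.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate '*' (any string) and '?' (any single character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Everything else is matched literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern: str, value: str) -> bool:
    """Return True when the whole value is covered by the pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


if __name__ == "__main__":
    print(matches("ec2:Describe*", "ec2:DescribeElasticGpus"))
    print(matches("ec2:Describe?", "ec2:DescribeElasticGpus"))
```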

This is where the differences come into play:

Identity-based policies are applied to principals – users, groups, and roles. There are two subtypes of identity-based policies: inline and managed policies. Inline policies are a part of their respective principal, while managed policies are separate objects in their own right and can be attached to principals. AWS provides some managed policies, but users can create their own managed policies as well. All identity-based policies need to include the “Resource” field in their statements, as it specifies which resources the statement applies to. Going back to our earlier example, in order to launch an EC2 instance with an instance profile containing the role `arn:aws:iam::123456789012:role/EC2Role`, one needs to have an identity-based policy like this one:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:RunInstances",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::123456789012:role/EC2Role"
    }
  ]
}

Resource-based policies, on the other hand, are always a part of the resource to which they are applied: there are no managed policies for resources. Unlike identity-based policies, statements within resource-based policies need to include the “Principal” field, as it specifies which principals are granted or denied the permissions listed in the “Action” field. Resource-based policies might also contain the “Resource” field in their statements, as it can sometimes provide additional context, such as which objects within an S3 bucket are covered by the S3 bucket resource policy.
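For comparison with the identity-based example, the following Python sketch builds what an S3 bucket policy of this kind might look like and prints it as JSON. The bucket name `example-bucket`, the `reports/` prefix, and the user `alice` are hypothetical values chosen purely for illustration.

```python
import json

# Hypothetical resource-based policy: the bucket name, key prefix and
# user ARN below are illustrative assumptions, not values from this post.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Resource-based policies name the principal being granted access.
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/alice"},
            "Action": "s3:GetObject",
            # "Resource" narrows the grant to objects under one prefix.
            "Resource": "arn:aws:s3:::example-bucket/reports/*",
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(bucket_policy, indent=2))
```

Note how the statement carries both a “Principal” and a “Resource”: the former says who is covered, the latter which objects inside the bucket are covered.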

A natural question to ask by now is how do policies and their statements combine? For example, what happens when there is a statement with Effect = Allow and a statement with the same Resource and Action fields, but with Effect = Deny? The answer to this question is that ‘Deny’ trumps ‘Allow’: when evaluating access control decisions, if there are any Deny statements covering the target resource and action, the request is always denied. Of course, if there are no ‘Allow’ statements that cover the target resource and action, the request is still denied.

“Default Deny” is the law of the land in AWS and all other cloud providers; permissions always need to be granted explicitly.
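The combination rules above (an explicit Deny wins, an Allow is required, and everything else is denied by default) can be sketched in a few lines of Python. This is a deliberately simplified model under our own assumptions: the `Statement` class and `evaluate` function are hypothetical, conditions and principals are ignored, and `fnmatchcase` is used as a stand-in matcher (it also accepts bracket expressions, which policy wildcards do not).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable


@dataclass(frozen=True)
class Statement:
    effect: str    # "Allow" or "Deny"
    action: str    # may contain '*' or '?'
    resource: str  # may contain '*' or '?'


def evaluate(statements: Iterable[Statement], action: str, resource: str) -> str:
    """Return 'ExplicitDeny', 'Allow' or 'ImplicitDeny' for one request."""
    decision = "ImplicitDeny"  # default deny: nothing is granted implicitly
    for st in statements:
        applies = fnmatchcase(action, st.action) and fnmatchcase(resource, st.resource)
        if not applies:
            continue
        if st.effect == "Deny":
            return "ExplicitDeny"  # a matching Deny always trumps any Allow
        if st.effect == "Allow":
            decision = "Allow"
    return decision


if __name__ == "__main__":
    policy = [
        Statement("Allow", "ec2:*", "*"),
        Statement("Deny", "ec2:TerminateInstances", "*"),
    ]
    for requested in ("ec2:RunInstances", "ec2:TerminateInstances", "s3:GetObject"):
        print(requested, "->", evaluate(policy, requested, "*"))
```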

There’s one more detail that needs to be ironed out: ‘Deny’ statements attached to an IAM role do not apply to the users or roles that can assume that role. This is to be expected: after all, roles in AWS are not assigned; instead, the permission to assume them is assigned. Therefore, unless a user has assumed a role, the Deny statements attached to this role do not apply to that user.

Finally, it makes sense to wonder whether identity-based or resource-based policies take priority over one another. The answer is that both types of policies are on equal footing: when access control decisions are made, AWS combines all relevant policy statements from both identity and resource policies and evaluates them all together. That means that, within a single account, you only need a single Allow statement to be able to perform an action: either from an identity-based policy attached to the principal, or from a resource-based policy attached to the target resource.

There is one exception: Roles. When using the `sts:AssumeRole` permission, one needs both an identity-based policy that allows assuming the role AND the resource policy of this role (shown as “Trust Relationships” in the AWS console and referred to as “Trust Policy” in CloudHealth Secure State) to allow the assumption of the role by the calling principal.

There’s a lot more nuance involved, and we’ll cover some of the missing details in upcoming blog articles, but when in doubt, you can always consult the AWS Policy Evaluation Logic.

Don’t worry if this feels like a bit too much: AWS easily has the most difficult IAM and security model to wrap your head around. Things get much easier from now on.

Azure Permissions

Thankfully, Microsoft Azure has a much more straightforward approach to permissions management. In Azure, permissions (also called “Actions” in Azure docs) correspond to available API operations, but much more loosely: often multiple API operations are covered by a single permission. Continuing our previous example of launching a VM in the cloud, the respective Azure permission is `Microsoft.Compute/virtualMachines/write`.

The implication of hierarchy isn’t an accident – permissions are (somewhat implicitly) organized in a tree-like structure: the first part is the Azure resource provider (in this case Microsoft.Compute), followed by the service (Virtual Machines, here represented as `virtualMachines`), any additional qualifiers (there are none in this example), and, finally, the operation that is to be executed (in this case `write`).

There’s also a CRUD-like structure for many (but not all) of the permissions:

  • create-like and update-like operations require a permission ending in `/write`
  • read-like operations have their respective permissions ending in `/read`
  • delete-like operations have permissions ending in `/delete`, and
  • any other action that has side-effects will have its permission string end in `/action` (there’s a lot of those available – it seems that cloud operations don’t map that easily to CRUD).

In case you intend to rely on suffix testing, it is good to know about an inconsistency here: some delete permissions end in `/delete`, while some end in `/Delete`.
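If you do rely on suffix testing, a minimal Python sketch could look like the following. The `classify_azure_permission` helper is hypothetical; it lowercases the last path segment so that both `/delete` and `/Delete` land in the same bucket.

```python
def classify_azure_permission(permission: str) -> str:
    """Classify an Azure permission string by its final path segment."""
    # Take the text after the last '/', ignoring case differences such as
    # '/delete' versus '/Delete'.
    suffix = permission.rsplit("/", 1)[-1].lower()
    if suffix in {"read", "write", "delete", "action"}:
        return suffix
    return "other"


if __name__ == "__main__":
    for perm in (
        "Microsoft.Compute/virtualMachines/write",
        "Microsoft.Compute/virtualMachines/login/action",
        "Microsoft.Compute/virtualMachines/Delete",
    ):
        print(perm, "->", classify_azure_permission(perm))
```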

This is a good time to mention that there are two fundamentally different types of operations in Azure, with different types of permissions:

  1. Control actions, which generally affect resources themselves, and
  2. Data actions, that affect the data within those resources.

The official documentation on this topic distinguishes these as control plane actions and data plane actions.  For example, creating a VM is a control plane action, governed by the `Microsoft.Compute/virtualMachines/write` permission, as mentioned above. On the other hand, logging into a VM is a data plane action, as it has effect on the resource data, in this case giving us access to the VM, and is governed by the `Microsoft.Compute/virtualMachines/login/action` permission.

IAM Roles

Analogous to AWS, there is asterisk wildcard support, and combined with the suffix categorization, it results in some exceptionally simple (and admirably so) role definitions: the Reader role includes the permissions that match the `*/read` wildcard, and the Owner role has its wildcard string as just `*`.

We mentioned role definitions in the last paragraph, and now would be a good time to take a closer look at them. A role can be either Built-In or Custom. Built-In roles are provided and maintained by Microsoft and are present in each Azure subscription. As expected, you cannot modify Built-In roles. Custom roles, on the other hand, are created and managed by the customer. Role definitions are, like in AWS, natively stored in JSON format, and can also be edited visually through the Azure Portal.

Within a role definition, the “Actions” field contains a list of all control plane permissions (or, more generally, permission strings that can include wildcards) that will be available to any principal that is assigned the role. Similarly, the “DataActions” field contains a list of all permission strings for data plane permissions that will be available to related principals.

There is no “Effect” field: role assignments in Azure can only add permissions, but not remove them from a principal (there is a mechanism for negative permission grants, namely policies, which we’ll cover in the third part of this blog series). Instead, users are provided with the “NotActions” and “NotDataActions” fields, which are subtracted from the permissions listed in the “Actions” and “DataActions” fields, respectively. Any permissions listed in “NotActions” and “NotDataActions”, however, do not transcend the role definition itself: if a principal receives a permission from role B that is assigned to them, they will have the right to perform its corresponding action, no matter the contents of the “NotActions” or “NotDataActions” of role A.

Example of a Role Definition

Let’s illustrate this with the following example role definition (you can obtain the native JSON representation of a role named “Test” within subscription ID “abcdefgh-ijkl-mnop-qrst-uvwxyz012345” by running `az role definition list --name Test --subscription abcdefgh-ijkl-mnop-qrst-uvwxyz012345`, just as we have done here):

[
  {
    "assignableScopes": [
      "/subscriptions/abcdefgh-ijkl-mnop-qrst-uvwxyz012345"
    ],
    "description": "Illustrative role definition",
    "id": "/subscriptions/abcdefgh-ijkl-mnop-qrst-uvwxyz012345/providers/Microsoft.Authorization/roleDefinitions/ce94b70f-827b-418d-83ee-ea411bf061e5",
    "name": "ce94b70f-827b-418d-83ee-ea411bf061e5",
    "permissions": [
      {
        "actions": [
          "Microsoft.Compute/virtualMachines/write"
        ],
        "dataActions": [],
        "notActions": [
          "*"
        ],
        "notDataActions": []
      }
    ],
    "roleName": "Test",
    "roleType": "CustomRole",
    "type": "Microsoft.Authorization/roleDefinitions"
  }
]

In this example, the “NotActions” field contains a wildcard that matches ALL permissions and therefore cancels out everything listed in the “Actions” field. As discussed above, the result is a role that simply does not provide any permissions; it does not prevent the associated principals from receiving permissions through other role assignments.
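The subtraction semantics can be modelled with a short Python sketch: each role grants whatever matches “Actions” minus whatever matches “NotActions”, and a principal’s effective permissions are the union over all assigned roles. The helper names and the second role (`role_b`) are hypothetical, data actions are left out for brevity, and matching is done case-insensitively with `fnmatchcase` as a simplifying assumption.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

Role = Dict[str, List[str]]


def _match(pattern: str, permission: str) -> bool:
    # Case-insensitive wildcard match; a simplification for this sketch.
    return fnmatchcase(permission.lower(), pattern.lower())


def role_grants(role: Role, permission: str) -> bool:
    """A role grants Actions minus NotActions, nothing more."""
    allowed = any(_match(p, permission) for p in role["actions"])
    excluded = any(_match(p, permission) for p in role["notActions"])
    return allowed and not excluded


def principal_can(roles: Iterable[Role], permission: str) -> bool:
    """NotActions never reach across roles: any single granting role suffices."""
    return any(role_grants(role, permission) for role in roles)


if __name__ == "__main__":
    test_role = {"actions": ["Microsoft.Compute/virtualMachines/write"], "notActions": ["*"]}
    role_b = {"actions": ["Microsoft.Compute/virtualMachines/write"], "notActions": []}
    vm_write = "Microsoft.Compute/virtualMachines/write"
    print(principal_can([test_role], vm_write))
    print(principal_can([test_role, role_b], vm_write))
```

With only the “Test” role assigned, the principal cannot create VMs; adding the hypothetical `role_b` grants the permission despite the wildcard in the first role’s “NotActions”.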

We haven’t mentioned a field that corresponds to the “Resource” field in AWS policies, and indeed there is none: the applicability of roles to resources is determined by the scope at which the role is assigned. Scopes are another topic that we are going to cover in a later installment of this series, but the gist is that resources in Azure are intrinsically organized in a hierarchy and scopes represent “groups” of resources. Thus, the resource to which a role applies its permissions is not intrinsic to the role definition, but rather to the role assignment, which binds together roles and principals. What is present, however, is the `AssignableScopes` field, which specifies the root scopes at which the role can be assigned. It is worth highlighting the ‘root’ qualifier here: there is a hierarchy of scopes, and if a particular scope is listed as an assignable scope, all of its children (in the scope hierarchy) can also be used as the scope of a role assignment. Again, we will take a closer look at scopes in the next installment of this series, but role assignments are surprisingly simple: they consist of the role, the principal, and the scope.
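The ‘root scope’ rule can be sketched as a simple path-prefix check in Python. This is only an approximation under our own assumptions: the helper names are hypothetical, scopes are compared case-insensitively, and the resource group `rg1` in the usage example is made up.

```python
from typing import Iterable


def is_within_scope(scope: str, root: str) -> bool:
    """True when `scope` is `root` itself or one of its descendants."""
    scope = scope.rstrip("/").lower()
    root = root.rstrip("/").lower()
    # Require a '/' boundary so a longer sibling ID does not match by accident.
    return scope == root or scope.startswith(root + "/")


def can_assign_at(scope: str, assignable_scopes: Iterable[str]) -> bool:
    """A role can be assigned at any scope under one of its assignable roots."""
    return any(is_within_scope(scope, root) for root in assignable_scopes)


if __name__ == "__main__":
    roots = ["/subscriptions/abcdefgh-ijkl-mnop-qrst-uvwxyz012345"]
    # 'rg1' is a hypothetical resource group used only for illustration.
    print(can_assign_at(roots[0] + "/resourceGroups/rg1", roots))
    print(can_assign_at("/subscriptions/some-other-subscription", roots))
```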

Finally, the “Condition” field allows specifying conditions for granting the access specified in the “Actions” field. Just like in AWS, conditions are a big topic here and will need a separate discussion.

GCP Permissions

The final spot in our permission management overview belongs to the Google Cloud Platform. In short, GCP sits somewhere in between the powerful (but undeniably dangerous) IAM model of AWS and the relatively straightforward approach of Microsoft’s Azure. In GCP you can very often get by using the simple primitives that are within arm’s reach. However, you are also provided the tools to dig deep (or shoot yourself in the foot) with the intricate toolkit that is available under the surface, such as `iam.serviceAccounts.signBlob` or `iam.serviceAccounts.actAs`.

In GCP, like Azure and AWS, permissions correspond to GCP cloud API methods. And like Azure, GCP has permissions made up of components, joined by a “.” separator.

The components are API name, resource type and action, e.g. the `compute.instances.create` permission grants access to creating new GCE (Google Compute Engine) VM instances and its components are `compute` (API name), `instances` (resource type), and `create` (the action).

In another similarity to Azure, many permissions in GCP follow the CRUD structure, with the following mapping –

  • create actions require a `.create` permission
  • read actions require a `.get` permission
  • update requires the corresponding `.update`, and
  • delete requires the corresponding `.delete` permission.

There are a lot more permissions available, such as `attachDisk` and `detachDisk` for the `compute.instances` resource type.
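The component structure and the CRUD mapping above can be sketched together in Python. The `parse_gcp_permission` and `crud_category` helpers are hypothetical; splitting on the first and last dot is our own assumption for handling any permission whose middle part contains additional dots.

```python
from typing import Tuple

# CRUD mapping as listed above; anything else is treated as "other".
CRUD_BY_ACTION = {
    "create": "create",
    "get": "read",
    "update": "update",
    "delete": "delete",
}


def parse_gcp_permission(permission: str) -> Tuple[str, str, str]:
    """Split 'compute.instances.create' into (API name, resource type, action)."""
    if permission.count(".") < 2:
        raise ValueError(f"expected api.resource.action, got {permission!r}")
    api_name, rest = permission.split(".", 1)
    resource_type, action = rest.rsplit(".", 1)
    return api_name, resource_type, action


def crud_category(permission: str) -> str:
    """Map a permission to its CRUD category, or 'other' when none applies."""
    return CRUD_BY_ACTION.get(parse_gcp_permission(permission)[2], "other")


if __name__ == "__main__":
    for perm in ("compute.instances.create", "compute.instances.get",
                 "compute.instances.attachDisk"):
        print(perm, parse_gcp_permission(perm), crud_category(perm))
```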

Roles also serve a similar function as in Azure: they are positive in nature which means that they can only grant permissions and they are not a principal, but merely a collection of permissions. GCP roles do not have a native representation – in the GCP platform all objects can be represented in either JSON or YAML, and roles are no exception. There are three types of roles in GCP:

  1. Basic roles: Owner, Editor and Viewer. These are legacy roles, which include thousands of permissions, and the official recommendation is to avoid them.
  2. Predefined roles: Google provides these, and there are several hundred of them, some being scoped-down versions of the basic roles for a particular service, such as the “Compute Instance Admin (v1)” role (`roles/compute.instanceAdmin.v1`). Google also maintains these roles and updates them with relevant changes when necessary.
  3. Custom roles: GCP users can create, update, delete these roles, as well as assign them to other GCP users (or themselves for that matter, a clarification that is relevant in privilege escalation attacks).

In GCP, there are no wildcards in permissions, so you must list all permissions by name if you want to provide them to a new role. Probably for similar reasons, there isn’t any equivalent to Azure’s NotActions, nor is there an equivalent to the AWS Effect = Deny policy statements: all role permissions in GCP are given to any users assigned to the role.

There is no AssignableScope/Resource field: all permissions are applied to all resources in the current project (this is an oversimplification we will correct later when discussing resource organization in GCP).

The fields that are there are: `name` (which is more akin to an ID), `title` (which is more akin to a name), `description` (which functions exactly as expected), `stage` (which is meant to note the maturity of the role; the possible values are Alpha, Beta, General Availability, and Disabled), and, finally, `etag`, which is a version control field. This version control field exists to prevent race conditions when updating roles by providing optimistic concurrency control: update requests need to provide this field, and it must equal the etag of the latest version of the role prior to the update. Here’s what a role looks like from the native `gcloud` CLI:

[Image: a GCP role as displayed by the `gcloud` CLI]
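The optimistic concurrency control behind the `etag` field can be illustrated with a small in-memory model in Python. This is not the GCP API: the `RoleStore` class, the `StaleEtagError` exception, the counter-based etag, and the role name used below are all hypothetical, chosen only to show why an update carrying an outdated etag is rejected.

```python
from typing import Dict, List, Tuple


class StaleEtagError(Exception):
    """Raised when an update is based on an outdated version of a role."""


class RoleStore:
    """Toy in-memory role store that versions each role with an etag."""

    def __init__(self) -> None:
        self._roles: Dict[str, Tuple[int, List[str]]] = {}

    def create(self, name: str, permissions: List[str]) -> str:
        self._roles[name] = (1, list(permissions))
        return "1"

    def get(self, name: str) -> Tuple[List[str], str]:
        version, permissions = self._roles[name]
        return list(permissions), str(version)

    def update(self, name: str, permissions: List[str], etag: str) -> str:
        version, _ = self._roles[name]
        # Reject the write if someone else changed the role after it was read.
        if etag != str(version):
            raise StaleEtagError(f"role {name!r} changed since etag {etag}")
        self._roles[name] = (version + 1, list(permissions))
        return str(version + 1)


if __name__ == "__main__":
    store = RoleStore()
    store.create("roles/example", ["compute.instances.get"])
    _, etag = store.get("roles/example")
    store.update("roles/example", ["compute.instances.get", "compute.instances.create"], etag)
    try:
        # A second writer still holding the old etag loses the race.
        store.update("roles/example", [], etag)
    except StaleEtagError as err:
        print("rejected:", err)
```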

You may have noticed that conditions have been absent from our discussion of GCP roles too. Indeed, there is no way to specify conditions in roles, but there is a way to specify conditions when assigning a role to a user. The available conditions are somewhat rudimentary and include only 6 parameters, covering `Time` and `Resource` as categories to provide information, but they can be combined logically, which can result in more complex behavior.

Summary and closing remarks

This feels like a good moment for some retrospection. We took a deep dive into the AWS way of doing permissions, only to discover it is rather complex and riddled with exceptions, asterisks, and surprises, which in the realm of security is almost always a bad thing. We speculated on the reason for this in our last blog – AWS is widely considered to be the pioneer of cloud tech, and as the reigning champion of the previous race (to the cloud), it is now handicapped by the legacy of abstractions that leak or don’t make sense in the current cloud best practices. Azure streamlines this process by entirely removing roles as a type of principal, but provides several conveniences – NotAction, the distinction between Data plane and Control plane actions, the concept of scopes. Finally, there is the GCP IAM model, which forgoes almost every element that can be reasonably omitted: Roles are just a named list of permissions, and when attached to a user, apply the granted permissions to all resources in the project.

As we’ve seen already, permissions management can be quite the hurdle and is very different across the three major cloud providers, necessitating a lot of attention to detail when setting up the IAM aspect of a cloud account. These differences are also a significant barrier to cloud-agnostic access control models, as the intricacies of each provider need to be accounted for to create a secure deployment.

In the next installment of this series, we will shed some light on the way resources themselves are organized and managed across AWS, Azure and GCP.

If you’re looking for even more information, get in touch with us directly. Our team would be happy to answer questions and walk you through the capabilities of the CloudHealth Secure State platform. To speak to an expert from VMware’s CloudHealth Secure State team, or to book a free demo, visit us here.