
Navigating IAM and Security Models in Public Clouds

The ubiquity of cloud services has ushered in an era of accelerating digital innovation, improved efficiency, and scale once considered impossible. But as with all stories of rapid progress, there is a dark side to this tale: we are no longer seeing innovation as a growth factor, but rather as a necessity to stay competitive. The race towards the next “Blue Ocean” has often been at the expense of security, interoperability, or flexibility.

IAM is one of the main areas of misconfiguration

For cloud security practitioners, understanding Identity and Access Management (IAM) and getting enough visibility into it is a challenge, particularly in large enterprises where they need to support many teams using potentially different clouds. The differences and subtleties of the security models between cloud providers can be overwhelming. For example, providers share common terms, like “role” and “policy”, but these terms have deeply distinct meanings in the context of each provider.

To help you navigate the murky waters of cloud IAM, security models and their terminology, we’re providing a series of articles comparing and analyzing the models of the “Big Three” cloud providers that we natively support with our product, CloudHealth Secure State: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). In this series, we will discuss these different elements in separate installments:

  • Part 1: Who (the principals that are granted access)
  • Part 2: How (what privileges and permissions exist, and how their grants work)
  • Part 3: What (the objects that permissions apply to, as well as some bonus topics)

Part 1: Who (Principals)

Let’s start by defining some key terminology to avoid confusion, as different cloud documentation uses multiple terms to describe different concepts. We’ll call “principal” any human or process that can use permissions to interact with cloud resources. This explicitly includes both “normal” users, such as cloud developers, architects, and administrators, as well as processes running on resources, such as EC2 instances, Azure applications, or GKE nodes. We will refer to such processes as “service principals” (which is mostly standard terminology) as they utilize their access to provide some automated service. We will use “permissions” and “privileges” interchangeably to refer to actions that principals can apply to resources. We will only use the terms “role” and “policy” in the context of a particular cloud provider, due to their loaded meaning in the cross-cloud context.

With that in mind, let’s shed some light on how AWS, Azure, and GCP implement principals. Our topics of interest for this part are: what types of principals exist in each platform, how they authenticate, and how they can be managed.

AWS

Amazon Web Services is widely considered the first “real” cloud provider. Because AWS was the innovator in many areas, Azure and GCP got to learn from the (necessary) mistakes of the AWS team. This explains why AWS will consistently be painted as the odd one out in this blog post series–it’s not that AWS did things differently so much as that the other two picked up what worked well and changed what didn’t.

In terms of principals, AWS is certainly the odd one–only in AWS do user accounts belong to a particular cloud account–as opposed to the cloud platform itself.

Let’s break down that point in greater detail: In AWS, when you register a cloud account, you need to specify an email address. This email address becomes the identifier for the root user of this AWS account. This is an important difference–neither of the other two providers has the concept of a “root user” or anything of the sort–instead, they utilize a mostly flat user structure (see table below). The root user is important in other ways, as well: you cannot promote another IAM user to be the new root user. Outside of their email address, password, and access keys, root users are essentially immutable.

Below the root users are the IAM users, groups, and roles–all these belong to the account and can be created, edited, or deleted freely.

In the AWS Console, IAM users log in through an account-specific login URL, providing a username and password combination, as the standard AWS email/password authentication flow is reserved only for root accounts. IAM groups are simply collections of IAM users: they’re used to easily manage and track permission assignments for larger teams as they (unsurprisingly) allow grouping users together. Thus, the permission grant can be towards the IAM group and administrators need only worry about who is in the group, rather than managing individual permissions of every IAM user.

This leaves us with one more pertinent question: what are IAM roles? In AWS, roles are an abstraction that other principals can assume, gaining whatever privileges are attached to the role itself. This is the crux of the AWS security model and the key difference with Azure and GCP–the AssumeRole/PassRole actions, and respective permissions, are central to how AWS services interoperate with your cloud account.

Let’s say you have an EC2 instance and you need it to be able to access an S3 bucket. You could (but really shouldn’t) create an IAM user, add an S3 policy to them, create an API key, and set up the AWS CLI with it inside the EC2 instance. However, there is a much better solution: you can assign the EC2 instance an IAM role that grants S3 privileges and sidestep the issue of key management entirely. In this instance, AWS makes the EC2 instance assume the role you’ve specified.

That is the core purpose of IAM roles. Anytime an AWS service needs to be granted permissions to call AWS APIs on your behalf, this happens through IAM roles. You cannot log in directly to an IAM role–access happens through the AssumeRole API call–and in return you receive a short-lived token, rather than a persistent API key.
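To make the EC2-to-S3 example more concrete, here is a minimal sketch of the two JSON documents such a role typically carries: a trust policy that lets the EC2 service assume the role, and a permissions policy that grants read access to a bucket. The function names and the bucket name are hypothetical and chosen for illustration; the sketch only builds the documents and does not call AWS.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy: who may assume the role (here, the EC2 service)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_read_policy(bucket: str) -> dict:
    """Permissions policy: what the role may do once it has been assumed."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # ListBucket applies to the bucket, GetObject to its objects.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder bucket name.
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(s3_read_policy("example-bucket"), indent=2))
```

Keeping the two documents separate mirrors the split described above: the trust policy answers who can assume the role, while the permissions policy answers what the role can do.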

To summarize: AWS roles are the preferred mechanism by which processes (such as code running on EC2, EKS, or Lambda) can gain access to AWS APIs, but they are not exclusive to services and can be assumed by IAM users as well (though not by the root user, whose credentials cannot be used to call AssumeRole).

Screenshot: A combined view of all AWS principals for a test account.

This is a good moment to mention how authentication works in AWS. There are two general ways of getting access as an AWS root or IAM user: through the web portal using your email (for root) or username (for IAM), your password, and potentially multi-factor authentication (MFA)–or through programmatic means using an API key, which has long validity and no MFA. Service principals, by contrast, do not use any authentication secrets and are provided ephemeral access tokens for their respective role by the AWS cloud platform.

Azure

Microsoft has built a simpler identity model than AWS–all users that have ever had access to any of the Azure services have done so through their Microsoft account. The login page is the same for all Azure users, regardless of the accounts they have access to.

Microsoft, however, distinguishes between personal and work/school accounts–personal accounts are self-managed, while work or school accounts belong to an Azure Active Directory (AD) tenant. This allows the organizational administration to manage user details, such as phone number, backup email, and others.

It is important to note that Azure AD has almost nothing in common with Azure. Azure AD is the centralized authentication provider of Microsoft, while Azure is the cloud platform. Azure AD has a different set of privileges and Azure AD administrators don’t automatically have access to Azure, nor vice versa.

The short explanation for this is the wider application of Azure AD–it’s also used to control users’ work email and productivity suites (Microsoft 365) and Microsoft’s SaaS ERP/CRM (Dynamics 365).

Azure AD also has the concept of groups, including dynamic membership (aptly named “dynamic groups”). However, roles are missing as a type of principal–Azure, too, has a term “role” but it refers directly to a named set of privileges–rather than some abstract subject that can be assumed.

Service principals also exist in Azure. Machines and processes in Azure can perform API calls to the platform through a mechanism called “Managed Identities” (MIs). There are two types: system-assigned and user-assigned.

System-assigned MIs are an inherent property of some Azure resources, such as a virtual machine. These identities are automatically provisioned and their authentication is opaque without requiring any settings. The only option is a single on/off switch to control whether the system-assigned identity is to be activated or not.

The other type of MI is user-assigned. User-assigned MIs are standalone objects created by an Azure user; they can be assigned to resources that support managed identities and otherwise function very similarly to their system-assigned counterparts.
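As an illustration of how code on an Azure VM might use either type of managed identity, the sketch below builds the token request for the Azure Instance Metadata Service (IMDS). It only constructs the URL and headers and performs no network call. The function name and the placeholder client ID are hypothetical; the assumption is that a system-assigned identity needs no extra parameter, while a user-assigned one is selected by passing its client ID.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

# Link-local IMDS endpoint, reachable only from inside an Azure resource.
IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(
    resource: str = "https://management.azure.com/",
    client_id: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a managed identity token request.

    client_id=None -> use the system-assigned identity of the resource.
    client_id=<id> -> use that specific user-assigned identity.
    """
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id is not None:
        params["client_id"] = client_id
    url = f"{IMDS_TOKEN_ENDPOINT}?{urlencode(params)}"
    # IMDS rejects requests that do not carry this header.
    headers = {"Metadata": "true"}
    return url, headers


if __name__ == "__main__":
    print(build_token_request())
    # Placeholder client ID of a hypothetical user-assigned identity.
    print(build_token_request(client_id="00000000-0000-0000-0000-000000000000"))
```

Note that no secret appears anywhere in the request: the platform decides which identity the caller may use based on where the request originates.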

Finally, a few words about the details of authentication. All Azure logins are done through OAuth flows, including those from the Azure CLI. When authenticating to the CLI from a new machine, you will be asked to open a URL containing a code in your browser and approve the login request. This removes the discrepancy we noted in AWS between API keys and passwords as authentication factors. Here, everything is standard OAuth.

GCP

The GCP authentication model is very similar to the Azure one–users do not belong to a particular GCP account, but rather to the Google platform. The login page is the same for all users and the email address is the unique identifier for users, and even for service accounts. They simply have email addresses in a subdomain of gserviceaccount.com.
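Because service accounts are identified by email addresses just like human users, a tool that audits principals often needs to tell the two apart. The following is a minimal sketch that does so by checking the domain and then formats the principal with the “user:” or “serviceAccount:” prefix used in GCP IAM policy bindings. The helper names and sample addresses are hypothetical, and the sketch deliberately ignores groups and other member types.

```python
def is_service_account_email(email: str) -> bool:
    """True if the address lives in gserviceaccount.com or one of its subdomains."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain == "gserviceaccount.com" or domain.endswith(".gserviceaccount.com")


def iam_member(email: str) -> str:
    """Format an email as an IAM policy member string (users and service accounts only)."""
    prefix = "serviceAccount" if is_service_account_email(email) else "user"
    return f"{prefix}:{email}"


if __name__ == "__main__":
    # Both addresses are made-up examples.
    print(iam_member("alice@example.com"))
    print(iam_member("build-bot@my-project.iam.gserviceaccount.com"))
```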

However, many authentication features, such as groups and direct user management (somewhat akin to Azure AD), require a GCP organization, which is the parent object of projects. Like Azure and unlike AWS, roles exist but are not a type of principal.

Service accounts come in three types:

  • User-managed service accounts, that are created and managed by the user
  • Default service accounts that are created for specific resource types, such as VMs
  • Google-managed service accounts, which are created by Google to interact with your GCP resources whenever necessary

Google-managed service accounts are sometimes hidden in the menus and handle important and fundamental aspects of GCP, such as taking part in enabling new APIs when first used. It is important that these service accounts do not lose the permissions they need; otherwise, some GCP services might become inaccessible or unusable within the project. Code running on applicable GCP resources can implicitly authenticate as the default service account of the resource or, with a small tweak, as any other user-managed service account, provided it has the relevant credentials.

Screenshot: A graph view of all service accounts within a GCP account with all related resources added for one service account.

This leads us to authentication, which is OAuth-based and emphasizes two parts–both the Application and the Principal need to be authenticated. Application credentials can be API keys, OAuth 2.0 client credentials, or service account keys. API keys are interesting, since they provide a metered (and therefore, billable) way to access the GCP “public” functionality and will not work for more potent APIs, such as GCE (Compute Engine).

OAuth 2.0 client credentials are usually used when an application calls GCP APIs on behalf of its end user. This could be your application, be it internal or external, or it could be a Google-provided application, such as the “gcloud” GCP CLI.

Finally, service account keys are used when you need to authenticate (usually code) as a service account. As discussed above, credentials for default service accounts are automatically fetched through the metadata service, while keys for user-managed service accounts need to be manually provided.
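The difference between the two paths can be sketched as a small decision function. This is a simplified illustration, not the lookup order of any particular Google client library: it assumes a manually provided key file is pointed to by the GOOGLE_APPLICATION_CREDENTIALS environment variable, and otherwise falls back to the metadata server of the resource. The function name and the returned dictionary shape are hypothetical, and nothing is fetched over the network.

```python
import os
from typing import Mapping, Optional

# Token endpoint of the metadata server for the resource's default service account.
METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)
# The metadata server requires this header on every request.
METADATA_HEADERS = {"Metadata-Flavor": "Google"}


def credential_source(env: Optional[Mapping[str, str]] = None) -> dict:
    """Describe where credentials would come from in this simplified model."""
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        # A user-managed service account key that someone provided manually.
        return {"kind": "service_account_key_file", "path": key_path}
    # No key provided: rely on the default service account of the resource.
    return {
        "kind": "metadata_server",
        "url": METADATA_TOKEN_URL,
        "headers": dict(METADATA_HEADERS),
    }


if __name__ == "__main__":
    print(credential_source({}))
    # "/path/to/key.json" is a placeholder path.
    print(credential_source({"GOOGLE_APPLICATION_CREDENTIALS": "/path/to/key.json"}))
```

The key-file branch is the one that carries operational risk, since the file is a long-lived secret that has to be distributed and rotated by hand.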

Summary and closing remarks

Let’s take a moment to reflect on what we’ve learned so far and compare different types of principals:

IAM is at the very core of cloud security, and the impact of any IAM security misconfiguration can be huge. IAM visibility and manageability are a concern for everyone involved in operational cloud security, potentially all the way up to the CISO. Another point of interest is compliance: user, role, and permission auditing and reporting capabilities play an integral part in checking off various compliance requirements, such as validating that no unnecessary access has been provisioned (the principle of least privilege). For these (and other) reasons, we’re extremely interested in empowering organizations to better manage their cloud IAM.

For this first part of our blog post series, we covered how AWS, Azure, and GCP handle IAM principals, as well as authentication and management tools, such as groups. For the next installment of the series, we will take a look at resources and privileges, as well as how they are organized and how they relate to principals.

If you’re looking for even more information, get in touch with us directly. Our team would be happy to answer questions and walk you through the capabilities of the CloudHealth Secure State platform.