A lot of effort has been put into kubeadm, an open source project focused on providing best-practice fast paths for creating Kubernetes clusters. While the upstream documentation for bootstrapping clusters with kubeadm is very good, it doesn’t cover use cases involving a cloud provider—there are a lot of different variables that would need to be discussed, and with the changes around the way cloud provider integration is implemented, bootstrapping with kubeadm can be complex. This blog post builds upon the upstream documentation to show how to bootstrap a highly available Kubernetes 1.15 cluster with the in-tree AWS cloud provider integration.

Why would you want AWS cloud provider integration? The value of using the AWS cloud provider when running Kubernetes on AWS is that you get automatic creation of Elastic Load Balancers (ELBs) in response to the creation of Service objects of type LoadBalancer. You also get Elastic Block Store (EBS) integration for Persistent Volumes and Persistent Volume Claims. In short, the integration streamlines the use of AWS resources by Kubernetes.

When it comes to using kubeadm to bootstrap a cluster that will use the AWS cloud provider, there are four considerations to keep in mind:

  1. Node hostnames
  2. AWS API permissions through IAM roles and policies
  3. Resource tags
  4. Configuration files for kubeadm

The following sections take a look at each of these four considerations in a bit more detail.

Node Hostnames

Currently, the AWS cloud provider uses the EC2 Private DNS entry as the node name. This means that you will need to ensure that the operating system (OS) running in the instance, such as Ubuntu or CentOS, has its hostname configured to match the EC2 Private DNS entry. Normally, the EC2 Private DNS entry for an instance looks something like ip-10-11-12-13.us-west-2.compute.internal, where 10-11-12-13 is the private IP address of the instance and us-west-2 is the region in which the instance is running. (The us-east-1 region is the exception: there the entry takes the form ip-10-11-12-13.ec2.internal.)

The fastest and easiest way to ensure this is the case is to use the following command:

hostnamectl set-hostname \
$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)

This command sets the local hostname to match the hostname specified in the EC2 instance metadata, which will correspond to the EC2 Private DNS entry.

You must set the hostname appropriately before bootstrapping the cluster with kubeadm. Otherwise, the kubelet will register the node under the wrong name, and the AWS cloud provider will not function properly (the node names expected by the AWS cloud provider won’t match the name of the node actually registered in the cluster).
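Before running kubeadm, it can help to confirm that the OS hostname really matches the name the cloud provider will expect. The following is a minimal sketch, not part of the upstream tooling: the function names are hypothetical, and it assumes the VPC uses the default Amazon-provided DNS naming (a custom DHCP option set would produce a different domain suffix).

```shell
#!/bin/sh
# Build the default EC2 Private DNS name for a private IPv4 address.
# us-east-1 uses the "ec2.internal" suffix; other regions use
# "<region>.compute.internal".
expected_private_dns() {
  ip="$1"
  region="$2"
  host="ip-$(printf '%s' "$ip" | tr '.' '-')"
  if [ "$region" = "us-east-1" ]; then
    printf '%s.ec2.internal\n' "$host"
  else
    printf '%s.%s.compute.internal\n' "$host" "$region"
  fi
}

# Succeed only when the OS hostname already equals the expected name ($1).
hostname_matches() {
  [ "$(hostname -f 2>/dev/null || hostname)" = "$1" ]
}
```

For example, hostname_matches "$(expected_private_dns 10.11.12.13 us-west-2)" could serve as a guard at the top of a bootstrap script.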

AWS API Permissions via IAM Roles and Policies

The function of the AWS cloud provider is to, among other things, automatically configure AWS objects (like ELBs or EBS volumes) in response to events within the Kubernetes cluster. For the cloud provider to be able to automatically configure AWS objects and services, the EC2 instances must have permission to access the AWS API and make changes on behalf of the Kubernetes cluster. Allowing this access is accomplished through the use of IAM roles, policies, and instance profiles:

  1. You must define an IAM role. See the AWS documentation for more information on IAM roles.
  2. You must define an IAM policy that grants permissions, and associate that policy with the IAM role created in the previous step. Again, refer to the official AWS documentation on IAM policies.
  3. Finally, EC2 instances must be assigned an IAM instance profile that allows them to assume the IAM role created in step 1, which in turn grants those instances the permissions in the policy created in step 2. More information on IAM instance profiles is available in the AWS documentation.

The GitHub repository for the AWS cloud provider has full details on the permissions that must be granted with an IAM policy, including the permissions needed for control plane nodes and worker nodes. (Control plane nodes and worker nodes require different permissions.)

Once the IAM policy, IAM role, and IAM instance profile are created and in place, you must be sure to specify the correct IAM instance profile that EC2 instances should use when those instances are created through the console, the CLI, or some other infrastructure-as-code tool.
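The three IAM pieces can be wired together with the AWS CLI. This is a rough sketch under stated assumptions: the function names and the single shared name for role, policy, and instance profile are hypothetical, the permissions policy is a JSON file you supply yourself (taken from the cloud provider repository), and the policy is attached inline with put-role-policy for brevity rather than as a separately managed policy.

```shell
#!/bin/sh
# Print the trust policy that lets EC2 instances assume the role.
render_trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the role, attach the permissions policy inline, and wrap the role
# in an instance profile.
# $1 = name used for role, policy, and instance profile (hypothetical)
# $2 = path to the permissions policy JSON file you supply
create_node_profile() {
  name="$1"
  policy_file="$2"
  trust_file="$(mktemp)"
  render_trust_policy > "$trust_file"
  aws iam create-role --role-name "$name" \
    --assume-role-policy-document "file://$trust_file" &&
  aws iam put-role-policy --role-name "$name" --policy-name "$name" \
    --policy-document "file://$policy_file" &&
  aws iam create-instance-profile --instance-profile-name "$name" &&
  aws iam add-role-to-instance-profile --instance-profile-name "$name" \
    --role-name "$name"
  status=$?
  rm -f "$trust_file"
  return "$status"
}
```

Because control plane nodes and worker nodes need different permissions, you would call create_node_profile once per node type, each time with its own policy file.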

Resource Tags

Resource tags are used by the AWS cloud provider to discover the resources available to a given Kubernetes cluster. The AWS cloud provider has a specific resource tag you should use. This tag should be kubernetes.io/cluster/<cluster-name>, where <cluster-name> is the name of the Kubernetes cluster (that name is set using a kubeadm configuration file, as described in the next section). The value of the tag is immaterial, although the cloud provider itself will use the values “shared” and “owned” for the resources it creates.

In addition to the EC2 instances, you should ensure this tag is applied to a security group (a group of which the EC2 instances are a member), all subnets, and all route tables.

Failure to properly tag resources with this tag will result in some odd failure conditions; for example, in response to the creation of a Service of type LoadBalancer, the AWS cloud provider might create an ELB but fail to properly add the instances behind the ELB. If you encounter such situations, verifying the tags on the AWS resources is a good first step in troubleshooting.
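Applying and checking the tag can be scripted with the AWS CLI. The sketch below is illustrative only: the function names are hypothetical, and any cluster name or resource ID you pass in comes from your own environment.

```shell
#!/bin/sh
# Build the tag key the AWS cloud provider looks for.
cluster_tag_key() {
  printf 'kubernetes.io/cluster/%s\n' "$1"
}

# Apply the cluster tag to one or more resources.
# $1 = cluster name, $2 = tag value, remaining arguments = resource IDs
tag_cluster_resources() {
  key="$(cluster_tag_key "$1")"
  value="$2"
  shift 2
  aws ec2 create-tags --resources "$@" --tags "Key=${key},Value=${value}"
}

# List every resource that carries the cluster tag, for troubleshooting.
list_cluster_resources() {
  aws ec2 describe-tags \
    --filters "Name=key,Values=$(cluster_tag_key "$1")" \
    --query 'Tags[].[ResourceType,ResourceId,Value]' --output text
}
```

Running list_cluster_resources with your cluster name and comparing the result against the instances, security group, subnets, and route tables you expect is a quick way to spot a missing tag.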

Configuration Files for kubeadm

Once all the prerequisites have been met—EC2 instances are launched with the correct IAM instance profile and resource tags, subnets and route tables have been properly tagged, and the OS hostnames have been set correctly—then you are ready to build some kubeadm configuration files for bootstrapping the cluster. Configuration files are how you customize virtually every aspect of how Kubernetes is configured through the kubeadm API.

This section shows how configuration files for kubeadm (leveraging features like the extraArgs capability to add command-line arguments to the control plane components) are used to enable the AWS cloud provider when a cluster is bootstrapped. (If you’re interested in more details on the kubeadm API, visit the documentation for the API.)

The First Control Plane Node

Bootstrapping the first control plane node is the only time in this process where you will use kubeadm init. As such, the kubeadm configuration file for the first control plane node is a bit more complex than the configuration file for the subsequent control plane nodes or the worker nodes.

Here’s an example kubeadm configuration file for bootstrapping the first control plane node:
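A minimal sketch of such a file is shown here as a shell function that prints the YAML, so the values that need changing are explicit parameters. It assumes the kubeadm v1beta2 API that ships with Kubernetes 1.15 and a CNI plug-in that handles Pod routing (hence configure-cloud-routes set to "false"); the function name and any values you pass in are placeholders, not values from a real cluster.

```shell
#!/bin/sh
# Print a kubeadm configuration for the first control plane node.
# $1 = cluster name (must match the kubernetes.io/cluster/<cluster-name> tag)
# $2 = DNS name of the control plane load balancer
# $3 = Pod CIDR (optional; defaults to the Calico default, 192.168.0.0/16)
render_init_config() {
  cluster_name="$1"
  endpoint="$2"
  pod_subnet="${3:-192.168.0.0/16}"
  cat <<EOF
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: ${cluster_name}
controlPlaneEndpoint: ${endpoint}
apiServer:
  extraArgs:
    cloud-provider: aws
controllerManager:
  extraArgs:
    cloud-provider: aws
    configure-cloud-routes: "false"
networking:
  podSubnet: ${pod_subnet}
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: aws
EOF
}
```

Redirecting the output, as in render_init_config blogsample cp-lb.example.com > kubeadm.yaml, produces the file to pass to kubeadm init (both names there are made up for illustration).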

Naturally, this is just an example—you need to change this content to use it. Here are some of the values you should change in order to use this configuration file in your environment:

  • Change the value of clusterName to match the cluster name specified in the AWS resource tags. This enables the AWS cloud provider to look up resources in AWS correctly.
  • Change the value of controlPlaneEndpoint to match the DNS CNAME or DNS entry for the load balancer created for the Kubernetes control plane.
  • If you need a specific CIDR block for Pods, change the value of podSubnet. (The value shown here is for an installation using the Calico CNI plug-in.)

Once this file has been customized appropriately, you can run kubeadm init --config=kubeadm.yaml (changing the filename as needed, of course) to bootstrap the first control plane node. Note that kubeadm only prints the certificate key used for joining additional control plane nodes when the --upload-certs flag is also passed. The output of this command includes some very important information, including the commands to join additional control plane nodes or worker nodes to the cluster. Copy these commands down but do not use them! You will need information from those commands to create the kubeadm configuration files for additional control plane nodes and worker nodes (see the following sections).

After you have verified that the first control plane node is up, you can proceed to install a CNI plug-in (refer to the documentation for that particular CNI plug-in for details).

Additional Control Plane Nodes

Kubernetes releases 1.14 and 1.15 added significant functionality to kubeadm to dramatically streamline the process of adding control plane nodes to an HA cluster once the first control plane node is bootstrapped. This new functionality means that joining control plane nodes is now a straightforward kubeadm join command.

To enable the AWS cloud provider and join additional control plane nodes, you can use a kubeadm configuration file like this one:
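As a sketch under the same assumptions as before (kubeadm v1beta2 API, hypothetical function name, placeholder values only), a join configuration for a control plane node could be generated like this, with every value that differs per cluster or per node passed as a parameter.

```shell
#!/bin/sh
# Print a kubeadm configuration for joining an additional control plane node.
# $1 = bootstrap token            $2 = load balancer endpoint as host:port
# $3 = CA cert hash (sha256:...)  $4 = certificate key
# $5 = FQDN of the joining node   $6 = private IP address of the joining node
render_controlplane_join_config() {
  token="$1"
  endpoint="$2"
  ca_hash="$3"
  cert_key="$4"
  node_fqdn="$5"
  node_ip="$6"
  cat <<EOF
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
discovery:
  bootstrapToken:
    token: ${token}
    apiServerEndpoint: "${endpoint}"
    caCertHashes: ["${ca_hash}"]
nodeRegistration:
  name: ${node_fqdn}
  kubeletExtraArgs:
    cloud-provider: aws
controlPlane:
  localAPIEndpoint:
    advertiseAddress: ${node_ip}
  certificateKey: "${cert_key}"
EOF
}
```

The presence of the controlPlane section is what tells kubeadm join to bring the node up as a control plane member rather than as a worker.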

As before, there are values in this example that must be customized before you can use it in your environment:

  • The value for the token field is contained in the output of the kubeadm init command from the previous step. If you didn’t capture that information or it has been more than 24 hours since the first control plane node was bootstrapped, you can use kubeadm token create to create a new one and specify that value here.
  • The value for apiServerEndpoint is the load balancer for the control plane, as specified in the previous section for controlPlaneEndpoint.
  • The caCertHashes value is also found in the output of the kubeadm init command from earlier. If you didn’t capture that information, this blog post shows how to get it.
  • The certificateKey value is also found in the output of the kubeadm init command from the previous section. If you didn’t capture that information or if it has been more than 2 hours since the first control plane node was bootstrapped, you can run kubeadm init phase upload-certs --upload-certs to generate a new certificate key and specify that value here.
  • The fully qualified domain name (FQDN) and IP address specified are those of the control plane node being added to the cluster. Each control plane node being added to the cluster will need its own configuration file.
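If the original kubeadm init output is gone, the token, CA certificate hash, and certificate key can each be regenerated on an existing control plane node. The helpers below are a sketch with hypothetical function names; the openssl pipeline is the one described in the kubeadm join documentation and assumes an RSA CA key, which is what kubeadm generates by default.

```shell
#!/bin/sh
# Compute the caCertHashes value from a CA certificate file,
# typically /etc/kubernetes/pki/ca.crt on a control plane node.
ca_cert_hash() {
  hash="$(openssl x509 -pubkey -in "$1" |
    openssl rsa -pubin -outform der 2>/dev/null |
    openssl dgst -sha256 -hex | sed 's/^.* //')"
  printf 'sha256:%s\n' "$hash"
}

# Create a fresh bootstrap token (default lifetime: 24 hours).
new_bootstrap_token() {
  kubeadm token create
}

# Re-upload the control plane certificates; the command's output
# includes a new certificate key (valid for 2 hours).
new_certificate_key() {
  kubeadm init phase upload-certs --upload-certs
}
```

The two kubeadm helpers need root privileges on a control plane node; ca_cert_hash only needs read access to the CA certificate.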

Once this file contains the correct information from the output of bootstrapping the first control plane node and for the specific node being added, you just run kubeadm join --config=kubeadm.yaml (using the correct filename).

You can repeat this process to bring up two additional control plane nodes (for a total of three). At this point, an HA control plane has been established, and only worker nodes need to be added to the cluster.

Worker Nodes

Because worker nodes are added to the cluster using kubeadm join, the configuration file for adding a worker node is similar to the configuration file for adding a control plane node.

Here’s an example kubeadm configuration file that you can use to add a worker node to the cluster:
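A minimal sketch of a worker join configuration follows, again assuming the kubeadm v1beta2 API and using a hypothetical function name with placeholder values. It is the control plane join configuration without the controlPlane section.

```shell
#!/bin/sh
# Print a kubeadm configuration for joining a worker node.
# $1 = bootstrap token            $2 = load balancer endpoint as host:port
# $3 = CA cert hash (sha256:...)  $4 = FQDN of the joining worker node
render_worker_join_config() {
  token="$1"
  endpoint="$2"
  ca_hash="$3"
  node_fqdn="$4"
  cat <<EOF
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
discovery:
  bootstrapToken:
    token: ${token}
    apiServerEndpoint: "${endpoint}"
    caCertHashes: ["${ca_hash}"]
nodeRegistration:
  name: ${node_fqdn}
  kubeletExtraArgs:
    cloud-provider: aws
EOF
}
```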

As with the kubeadm configuration file for joining a control plane node, most of the values needed for this file were contained in the output of the kubeadm init command used to bootstrap the first control plane node. Refer back to the notes in the previous section on how to get (or recreate) this information if the output of that first command wasn’t captured. The FQDN specified here should be that of the particular worker node being added (each worker node will need its own configuration file).

Once the configuration file is ready, worker nodes are added to the cluster by running kubeadm join --config=kubeadm.yaml (with the appropriate filename, naturally).

Wrapping Up

Hopefully, walking through this process shows that the functionality added to kubeadm in recent releases of Kubernetes has greatly simplified the process of setting up a highly available Kubernetes cluster. Even considering the steps involved in enabling the AWS cloud provider (setting hostnames, configuring IAM access, and tagging resources), the entire process for bootstrapping a Kubernetes cluster is still relatively straightforward. Of course, you can wrap most (if not all) of these steps into some sort of automation tool to make it easier to create AWS-enabled Kubernetes clusters, and future developments in the Kubernetes community (like Cluster API; see this blog post) will make it even easier.