In the slang of modern computing, the term ‘container’ is extensively used in different contexts, but when it comes to finding a robust definition for it, things get difficult. If we search the internet for such a definition, we will find different versions that typically talk about two loosely coupled aspects: a “form of operating system virtualization” or a “package of software containing all you need to run a program.”
At the same time, if we want to be fair, we have to admit that for many developers, the meaning of container is as simple as ‘this thing that Docker creates.’ At first glance this may sound ridiculous, but as we will see later, such a definition is not so far from the truth. The problem here is that the so-called “container” exploits different capabilities of the Linux kernel, but from the point of view of the kernel, there is no such thing as a “container.” What we mean when we say “container” is a collection of independent kernel features which, only when used together, create in user-space an illusion of isolation (a.k.a. virtualization).
Namespaces and Partitioning of Resources
So let’s determine the major ingredients of a container and whether it’s possible to discover the containers running on a machine using only the instruments provided by the Linux kernel itself.
First of all, we will see why the container is considered a form of virtualization. For this we have to familiarize ourselves with one basic kernel primitive — ‘namespace.’
Here is what we find in Wikipedia about namespaces: “Namespaces provide partitioning of kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources.” There are 8 namespace types available on Linux: Mount, PID, Time, User, UTS, Cgroup, IPC, and Network. To better understand how this partitioning of resources works, let’s use the PID namespaces as an example.
There is one default PID namespace that silently exists on any machine running Linux. This namespace assigns a unique ID number to each process running on the system, namely the PID that we all know very well. The PID numbers are the resource this namespace deals with. But what will happen if we start a process in a new PID namespace that is different from the default one? This process will be alone inside the new namespace, hence it will have PID=1 there, and all its child processes will be created there and will have PID values 2, 3, 4, etc., accordingly. None of the processes inside the new namespace will be able to find a PID number for any process that is outside of the namespace, which makes the outside processes effectively invisible from inside. But the opposite is not true: from outside of the new namespace, all processes inside it look normal and have ordinary PID values (not 1, 2, 3, 4…).
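The two views of the same process can be read back from the kernel. As a minimal sketch, assuming a kernel recent enough (Linux 4.1 or later) to expose the NSpid field in /proc/[pid]/status, the helper below prints the PID of a process in every PID namespace it belongs to, outermost first. The function name nspid_of and the PROC_ROOT override are hypothetical, introduced only for this illustration.

```shell
# Hypothetical override so the sketch can be pointed at another proc mount.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the PIDs of one process as seen from each nested PID namespace,
# starting with the namespace of the caller and ending with the innermost one.
nspid_of() {
    # $1 = PID as seen from the namespace we are running in
    awk '$1 == "NSpid:" { $1 = ""; sub(/^ +/, ""); print }' "$PROC_ROOT/$1/status"
}

# Usage (the PID is an example):
#   nspid_of 12345
# A process that is the first one in a nested PID namespace would show two
# numbers: its ordinary PID on the outside, followed by 1 on the inside.
```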
Identifying Processes in the Namespaces
Now that we know about namespaces, let’s see how we can find out whether a process is inside a container. For this we will use another iconic component of the Linux kernel, the proc pseudo-filesystem. It is commonly mounted at /proc and provides an interface for getting information directly from the kernel. Of particular interest to us is the sub-directory /proc/[pid]/ns/, where we can find pseudo-files (symbolic links) named after the namespace types. For example, if we want to examine the process having PID=42, we can write in a terminal:
sudo readlink /proc/42/ns/pid
which will output something that looks like this:
pid:[4026531836]
Now we know that process 42 is in the namespace 4026531836. Unfortunately, this is not very satisfying information, because this quasi-random number cannot tell us whether this is the default namespace or a new one. To answer this question, we can compare this ‘magic’ number with the number that we get if we do the same for the instance of the shell that we use:
readlink /proc/$$/ns/pid
Imagine that this gives you a different number. Assuming the shell belongs to the default namespace, process 42 must be in a different PID namespace. Does this mean it is a container? Unfortunately, no. If you make such a comparison a bit more systematically for more processes running on your Linux laptop, you will find that, for example, the Chrome web browser creates quite a complicated structure of nested PID namespaces. These are used to isolate the individual tabs running different websites. The same goes for Slack and many other programs.
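One way to make this comparison systematic is to read the PID namespace link of every process and count how many processes share each namespace. The following is only a sketch: the function name count_pid_namespaces and the PROC_ROOT override are hypothetical, and without root privileges the links of other users' processes cannot be read, so those processes are silently skipped.

```shell
# Hypothetical override so the sketch can be pointed at another proc mount.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print one line per distinct PID namespace, prefixed by the number of
# processes found in it, with the most populated namespace first.
count_pid_namespaces() {
    local link
    for link in "$PROC_ROOT"/[0-9]*/ns/pid; do
        # Links we are not allowed to read, or processes that have already
        # exited, produce an error that is discarded here.
        readlink "$link" 2>/dev/null
    done | sort | uniq -c | sort -rn
}

# Usage:
#   count_pid_namespaces          # only the processes we may inspect
#   sudo sh -c '...'              # running the same loop as root sees all of them
```

On a desktop machine the first line is normally the default namespace, and every additional line is a namespace created by some program, whether that is a container engine or a sandboxing browser.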
But let’s imagine you did make such a comparison not only for the PID namespace but for all 8 existing types and found that process 42 lives in its own instances of most of these 8 types. Then the chances are that this is indeed a container. You can find the command name associated with the process that runs inside this container by simply doing:
sudo cat /proc/42/comm
And yes, from the point of view of the kernel, there is nothing really special about this user-space process.
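The comparison across all namespace types can be sketched as a small loop over the entries of /proc/[pid]/ns/. The helper below lists the namespace types in which a process differs from a reference process (by default, the current shell). The name differing_namespaces and the PROC_ROOT override are hypothetical, and the sketch assumes the shell itself runs in the default namespaces. Types that cannot be read, for example because of missing privileges or an older kernel without the time namespace, are skipped rather than reported.

```shell
# Hypothetical override so the sketch can be pointed at another proc mount.
PROC_ROOT="${PROC_ROOT:-/proc}"

# List the namespace types in which process $1 differs from process $2.
differing_namespaces() {
    local target="$1" reference="${2:-$$}" ns a b
    # File names used under /proc/[pid]/ns/ for the 8 namespace types.
    for ns in mnt pid time user uts cgroup ipc net; do
        a=$(readlink "$PROC_ROOT/$target/ns/$ns" 2>/dev/null)
        b=$(readlink "$PROC_ROOT/$reference/ns/$ns" 2>/dev/null)
        # Report a type only when both links were readable and differ.
        if [ -n "$a" ] && [ -n "$b" ] && [ "$a" != "$b" ]; then
            echo "$ns"
        fi
    done
}

# Usage (run as root to inspect processes of other users):
#   differing_namespaces 42
# The more types are listed, the more likely process 42 runs in a container.
```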
What Is a Container, After All?
I have to admit that the picture we drew so far is oversimplified. We barely touched 1 of the 8 namespaces. We haven’t mentioned cgroups, another kernel feature that provides control over the amount of resources that can be used by a given process. We haven’t mentioned the Overlay filesystem either. When used in conjunction with the Mount namespace, this filesystem provides what is usually described as a “package of software” in the standard definitions of container.
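Although cgroups are outside the scope of this text, the proc filesystem exposes them in the same spirit as namespaces: /proc/[pid]/cgroup lists, one per line, the control groups a process belongs to, in the form hierarchy-ID:controller-list:path. As a rough sketch (the name cgroup_paths and the PROC_ROOT override are hypothetical), the path part can be extracted like this:

```shell
# Hypothetical override so the sketch can be pointed at another proc mount.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the cgroup path from every line of /proc/[pid]/cgroup.
# Everything after the second colon is the path, so it is kept whole
# even if the path itself contains a colon.
cgroup_paths() {
    cut -d: -f3- "$PROC_ROOT/$1/cgroup"
}

# Usage (the PID is an example):
#   cgroup_paths 42
```

How a container engine names these paths is up to the engine, so the output is a hint about who started the process rather than a definitive answer.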
There are so many different ways one can use each of the Linux kernel features that are the core of what we all agreed to call a container. And there are so many different ways to combine the usage of these individual features. So coming back to the question: what is a container? Well, it is not a single thing. Or maybe, it is just THE thing that your container engine creates.