Control Groups and Capabilities
Let's look at control groups and capabilities in Linux.
Control groups
If namespaces are about isolation, control groups (cgroups) are about setting limits.
Think of containers as similar to rooms in a hotel. While each room might appear isolated, every room shares a common set of infrastructure resources — things like water supply, electricity supply, shared swimming pool, shared gym, shared breakfast bar, etc. Cgroups let us set limits so that (sticking with the hotel analogy) no single container can use all of the water or eat everything at the breakfast bar.
Outside the hotel analogy, containers are isolated from each other but all share a common set of OS resources: things like CPU, RAM, network bandwidth, and disk I/O. Cgroups let us set limits on each of these so that a single container cannot consume everything and cause a denial of service (DoS) for the other containers on the host.
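As a rough illustration of how such limits are commonly requested, the sketch below assembles a `docker run` command that uses the Docker CLI flags `--memory`, `--cpus`, and `--pids-limit`, which Docker enforces through cgroups. The helper name `docker_run_with_limits`, the default values, and the image name are assumptions chosen for the example, not recommendations from this lesson.

```python
import shlex


def docker_run_with_limits(image, memory="256m", cpus="0.5", pids_limit=100):
    """Build a `docker run` argument list with cgroup-backed resource limits.

    The helper name and default values are illustrative assumptions.
    """
    return [
        "docker", "run", "--rm",
        f"--memory={memory}",          # cap on RAM the container may use
        f"--cpus={cpus}",              # cap on CPU time (fraction of CPUs)
        f"--pids-limit={pids_limit}",  # cap on the number of processes
        image,
    ]


# Example image name is a placeholder.
cmd = docker_run_with_limits("nginx:alpine")
print(shlex.join(cmd))
```

The function only builds the command; passing the list to `subprocess.run` on a host with Docker installed would actually start the container.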
Capabilities
It’s a bad idea to run containers as root: root is all-powerful and therefore very dangerous. However, running containers as unprivileged non-root users can be challenging. For example, on most Linux systems, non-root users tend to be so powerless they’re practically useless. What’s needed is a technology that lets us pick and choose which root powers a container needs in order to run.
Enter capabilities!
Under the hood, the Linux root user is a combination of a long list of capabilities. Some of these capabilities include:
CAP_CHOWN: Lets you change file ownership
CAP_NET_BIND_SERVICE: Lets you bind a socket to low numbered network ports
CAP_SETUID: Lets you elevate the privilege level of a process
CAP_SYS_BOOT: Lets you reboot the system
The full list is much longer.
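To make the idea of root as a bundle of capabilities more concrete: on Linux, the kernel reports a process's capability sets as hexadecimal bitmasks in `/proc/<pid>/status` (for example, the `CapEff` line). The sketch below decodes such a mask for just the four capabilities named above, using their bit numbers from the Linux capability headers. The helper names are assumptions made for this example, and the lookup table is deliberately partial.

```python
# Bit numbers for the four capabilities discussed in this lesson.
# This table is intentionally partial; the kernel defines many more.
CAP_BITS = {
    0: "CAP_CHOWN",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    22: "CAP_SYS_BOOT",
}


def decode_caps(mask_hex):
    """Return the known capability names set in a hex bitmask string."""
    mask = int(mask_hex, 16)
    return [name for bit, name in sorted(CAP_BITS.items()) if mask & (1 << bit)]


def effective_caps_of_current_process(status_path="/proc/self/status"):
    """Decode the CapEff line for this process, or None if unavailable."""
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith("CapEff:"):
                    return decode_caps(line.split()[1])
    except OSError:
        pass  # not on Linux, or /proc is not mounted
    return None


print(decode_caps("0000000000000401"))
print(effective_caps_of_current_process())
```

Run as an ordinary user, the second call will typically return an empty list; run as root, it should list all four names.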
Docker works with capabilities so that you can run containers as root, but strip out all the capabilities you don’t need. For example, if the only root privilege your container needs is the ability to bind to low numbered network ports, you should start a container and drop all root capabilities, then add back just the CAP_NET_BIND_SERVICE capability.
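The drop-everything-then-add-back pattern described above can be sketched with the Docker CLI flags `--cap-drop` and `--cap-add`. The helper name `docker_run_least_privilege` and the image name `example/web:latest` are hypothetical placeholders. The sketch strips the `CAP_` prefix because Docker's documentation writes capability names without it.

```python
import shlex


def docker_run_least_privilege(image, needed_caps):
    """Build a `docker run` command that drops all capabilities,
    then adds back only the ones listed in `needed_caps`.

    The helper name is an illustrative assumption.
    """
    args = ["docker", "run", "--rm", "--cap-drop=ALL"]
    for cap in needed_caps:
        # Accept either "CAP_NET_BIND_SERVICE" or "NET_BIND_SERVICE".
        name = cap[4:] if cap.startswith("CAP_") else cap
        args.append(f"--cap-add={name}")
    args.append(image)
    return args


# Hypothetical image that only needs to bind to a low numbered port.
cmd = docker_run_least_privilege("example/web:latest", ["CAP_NET_BIND_SERVICE"])
print(shlex.join(cmd))
```

Real images often need more than one capability to start, so the list passed in usually has to be worked out by testing the specific workload.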
This is an excellent example of implementing least privilege; you get a container running with only the capabilities required. Docker also imposes restrictions so that containers cannot re-add the dropped capabilities.
While this is great, configuring the correct set of capabilities can be prohibitively complex for many users.