Draining Worker Nodes
In this lesson, we will carry out a chaos experiment that drains everything from a worker node.
The reasoning behind this experiment
We’re going to try to drain everything from a random worker node.
Why might we want to do something like this? One reason is upgrades. Draining a node is the same process we are likely to use when upgrading our Kubernetes cluster.
Upgrading a Kubernetes cluster usually involves a few steps. Typically, we’d drain a node, shut it down, and replace it with an upgraded version of the node. Alternatively, we might upgrade a node in place without shutting it down, but that is more appropriate for bare-metal servers that cannot be destroyed and created at will. We’d then repeat the same steps, one node after another, until the whole cluster is upgraded. The process is often called rolling updates (or rolling upgrades), and it is employed by most Kubernetes distributions.
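To make the sequence of a rolling upgrade more concrete, here is a minimal sketch that only builds the list of commands, one node at a time, without running anything. The node names and the rolling_upgrade_plan function are hypothetical. The kubectl drain flags shown are common choices, not necessarily what your distribution uses, and the replacement step is left as a placeholder because it depends on the provider.

```python
def rolling_upgrade_plan(nodes):
    """Build the ordered list of commands for a rolling upgrade.

    Nothing is executed here; the function only returns the plan.
    """
    plan = []
    for node in nodes:
        # Step 1: drain the node (evict its Pods and mark it unschedulable).
        plan.append([
            "kubectl", "drain", node,
            "--ignore-daemonsets",      # DaemonSet Pods are left in place
            "--delete-emptydir-data",   # allow evicting Pods that use emptyDir
        ])
        # Step 2: shut the node down and create an upgraded replacement.
        # This is provider-specific, so it is only a placeholder (assumption).
        plan.append(["echo", f"replace {node} with an upgraded node"])
    return plan


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    for command in rolling_upgrade_plan(["node-a", "node-b", "node-c"]):
        print(" ".join(command))
```

The important property is the ordering: a node is fully drained and replaced before the next one is touched, so only a fraction of the cluster's capacity is missing at any given moment.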
We want to make sure nothing goes wrong during or after a cluster upgrade. To do that, we’re going to design an experiment that performs the most critical step of the process. It will drain a random node, and we will validate whether our applications are just as healthy as before.
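If you’re not familiar with the expression, draining means removing everything from a node. In practice, the node is marked as unschedulable so that no new Pods land on it, and the Pods already running there are evicted.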
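As a rough illustration of what draining does to a cluster, the sketch below models it with a plain dictionary. It is a toy simulation, not how Kubernetes is implemented: in a real cluster, evicted Pods are recreated by their controllers and placed by the scheduler. The cluster layout, the node and Pod names, and the drain function are all hypothetical.

```python
import random


def drain(cluster, node):
    """Simulate draining a node in a toy cluster model.

    The node is marked unschedulable and its Pods are moved, round-robin,
    to the remaining schedulable nodes. Returns the list of moved Pods.
    """
    cluster[node]["schedulable"] = False
    targets = [name for name, state in cluster.items() if state["schedulable"]]
    if not targets:
        raise RuntimeError("no schedulable nodes left to receive the Pods")
    evicted = cluster[node]["pods"]
    cluster[node]["pods"] = []
    for index, pod in enumerate(evicted):
        cluster[targets[index % len(targets)]]["pods"].append(pod)
    return evicted


if __name__ == "__main__":
    # Hypothetical cluster state, for illustration only.
    cluster = {
        "node-a": {"schedulable": True, "pods": ["api-1", "db-1"]},
        "node-b": {"schedulable": True, "pods": ["api-2"]},
        "node-c": {"schedulable": True, "pods": ["api-3"]},
    }
    victim = random.choice(sorted(cluster))  # pick a random node to drain
    moved = drain(cluster, victim)
    print(f"drained {victim}, moved {moved}")
    print(cluster)
```

After the call, the drained node holds no Pods and the total number of Pods in the cluster is unchanged, which is the outcome we would like to see in a real cluster as well.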
Inspecting the definition of node-drain.yaml
Let’s take a look at yet another definition of an experiment.