Draining Worker Nodes

Explore the process of draining worker nodes in a Kubernetes cluster to understand rolling updates and cluster upgrades. This lesson guides you through simulating a node drain, validating application health, and identifying how Pod disruption budgets can block a drain and affect high availability.

We'll cover the following...

- The reasoning behind this experiment
- Inspecting the definition of node-drain.yaml
- Describing the labels of nodes of the cluster
- Exporting the NODE_LABEL variable
- Running chaos experiment and inspecting the output
- Why couldn’t we drain the node?

The reasoning behind this experiment

We’re going to try to drain everything from a random worker node.

Why might we want to do something like this? One possible reason is upgrades. Draining a node is the same process we are likely to use when upgrading our Kubernetes cluster.

Upgrading a Kubernetes cluster usually involves a few steps. Typically, we’d drain a node, shut it down, and replace it with an upgraded version of the node. Alternatively, we might upgrade a node in place without shutting it down, but that is more appropriate for bare-metal servers that cannot be destroyed and created at will. We’d then repeat those steps, one node after another, until the whole cluster is upgraded. The process is often called a rolling update (or rolling upgrade), and it is employed by most Kubernetes distributions.
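The drain, shut down, and replace loop described above can be sketched as a toy in-memory simulation. This is only an illustration under stated assumptions: it does not talk to a real cluster, and the names `drain`, `rolling_upgrade`, the node names, and the version strings are all hypothetical.

```python
# Toy model of a rolling upgrade. A cluster is a dict that maps a node name
# to its version and the Pods it runs. Nothing here calls Kubernetes.

def drain(cluster, node):
    """Move every Pod off `node` onto the remaining nodes, round-robin."""
    others = [name for name in cluster if name != node]
    if not others:
        raise RuntimeError("no other node can take the Pods")
    pods = cluster[node]["pods"]
    cluster[node]["pods"] = []
    for i, pod in enumerate(pods):
        cluster[others[i % len(others)]]["pods"].append(pod)


def rolling_upgrade(cluster, new_version):
    """Drain, shut down, and replace the nodes one at a time."""
    for node in list(cluster):  # snapshot, because the dict changes below
        drain(cluster, node)
        del cluster[node]  # shut the old node down
        # Create the replacement node (hypothetical naming scheme).
        cluster[node + "-new"] = {"version": new_version, "pods": []}
    return cluster


if __name__ == "__main__":
    cluster = {
        "node-1": {"version": "v1", "pods": ["app-a", "app-b"]},
        "node-2": {"version": "v1", "pods": ["app-c"]},
        "node-3": {"version": "v1", "pods": []},
    }
    rolling_upgrade(cluster, "v2")
    for name, node in cluster.items():
        print(name, node["version"], node["pods"])
```

Every Pod survives the upgrade in this model because there is always another node to move it to. With a single node, `drain` raises an error, which hints at why a drain cannot always succeed.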

We want to make sure nothing goes wrong during or after a cluster upgrade. To do that, we’re going to design an experiment that performs the most critical step of the process. It will drain a random node, and we will validate whether our applications are just as healthy as before.

If you’re not familiar with the expression, draining means removing everything from a node.
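The shape of the experiment (check health, drain a random node, check health again) can be sketched in a few lines. This is a minimal toy model rather than the actual experiment definition: the cluster is a plain dictionary, and `healthy`, `drain`, and `run_experiment` are hypothetical helpers that never call Kubernetes.

```python
import random

# Toy model: "nodes" maps a node name to the Pods it runs, "cordoned" holds
# nodes marked unschedulable, and "pending" holds Pods with nowhere to run.

def healthy(cluster, app, expected_replicas):
    """Steady-state check: the app has the expected number of running Pods."""
    running = [
        pod
        for pods in cluster["nodes"].values()
        for pod in pods
        if pod.startswith(app + "-")
    ]
    return len(running) == expected_replicas


def drain(cluster, node):
    """Mark the node unschedulable and remove everything from it."""
    cluster["cordoned"].add(node)
    evicted, cluster["nodes"][node] = cluster["nodes"][node], []
    targets = [n for n in cluster["nodes"] if n not in cluster["cordoned"]]
    for pod in evicted:
        if targets:
            cluster["nodes"][random.choice(targets)].append(pod)
        else:
            cluster["pending"].append(pod)  # nowhere left to schedule it


def run_experiment(cluster, app, expected_replicas):
    """Validate health, drain a random node, then validate health again."""
    before = healthy(cluster, app, expected_replicas)
    node = random.choice(list(cluster["nodes"]))
    drain(cluster, node)
    after = healthy(cluster, app, expected_replicas)
    return {"node": node, "before": before, "after": after}


if __name__ == "__main__":
    cluster = {
        "nodes": {"node-1": ["web-1"], "node-2": ["web-2"]},
        "cordoned": set(),
        "pending": [],
    }
    print(run_experiment(cluster, "web", expected_replicas=2))
```

In this model the application stays healthy as long as at least one other schedulable node exists. On a single-node cluster the evicted Pods end up pending and the second check fails.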

Inspecting the definition of `node-drain.yaml`

Let’s take a look at yet another definition of an experiment.
