Uncordoning Worker Nodes
In this lesson, we will carry out another chaos experiment, which will also include a rollback block so that we can uncordon the node after the experiment.
The issue we just created
There are a couple of issues that we need to fix to get out of the bad situation we’re in right now. However, before we start solving those, we need to look at an even bigger problem that was created a few moments ago. I will demonstrate the issue by retrieving the nodes.
kubectl get nodes
The output, in my case, is as follows (yours will be different).
NAME STATUS ROLES AGE VERSION
gke-chaos-... Ready,SchedulingDisabled <none> 13m v1.15.9-gke.22
You can see that the status of our single node is Ready,SchedulingDisabled. We ran an experiment that failed to drain a node. Draining is a two-step process. First, the system disables scheduling on that node so that no new Pods are deployed there. Then, it drains the node by evicting the Pods running on it. The experiment managed to do the first step (it ...
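To make the SchedulingDisabled state easier to spot, here is a minimal sketch of a shell helper that filters the output of kubectl get nodes. The function name cordoned_nodes is hypothetical and not part of the lesson. It assumes the default column layout shown above, where NAME is the first column and STATUS is the second.

```shell
# Hypothetical helper (not part of the lesson): reads `kubectl get nodes`
# output on stdin and prints the name of every node whose STATUS column
# contains SchedulingDisabled, i.e. every cordoned node.
cordoned_nodes() {
  # NR > 1 skips the header row; $1 is NAME and $2 is STATUS.
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}

# Usage against a live cluster:
#   kubectl get nodes | cordoned_nodes
```

Any node this prints is still cordoned. One manual way to make such a node schedulable again is kubectl uncordon followed by the node name.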