Deleting Worker Nodes

In this lesson, we will carry out an experiment that will delete a node in the cluster. This experiment can help us understand how our cluster behaves if nodes are destroyed or damaged.

After resolving a few problems, we are now able to drain nodes. We discovered those issues through experiments. As a result, we should be able to upgrade our cluster without doing something terribly wrong and, hopefully, without negatively affecting our applications.

Draining nodes is, most of the time, a voluntary action. We tend to drain our nodes when we choose to upgrade our cluster. The previous experiment was beneficial because we now have the confidence to upgrade the cluster without (much) fear. However, there is still something worse that can happen to our nodes.

More often than not, nodes will fail without our consent. They will not be drained first. They will be destroyed or damaged, they will go down, or they will be powered off. Bad things will happen to nodes, whether we like it or not.

Let’s see whether we can create an experiment that will validate how our cluster behaves when such things happen.

Inspecting the definition of node-delete.yaml and comparing it with node-uncordon.yaml

As always, we'll start by taking a look at the definition of the experiment.

cat chaos/node-delete.yaml

The output is as follows.

version: 1.0.0
title: What happens if we delete a node
description: All the instances are distributed among healthy nodes and the applications are healthy
tags:
- k8s
- deployment
- node
configuration:
  node_label:
      type: env
      key: NODE_LABEL
steady-state-hypothesis:
  title: Nodes are indestructible
  probes:
  - name: all-apps-are-healthy
    type: probe
    tolerance: true
    provider:
      type: python
      func: all_microservices_healthy
      module: chaosk8s.probes
      arguments:
        ns: go-demo-8
method:
- type: action
  name: delete-node
  provider:
    type: python
    func: delete_nodes
    module: chaosk8s.node.actions
    arguments:
      label_selector: ${node_label}
      count: 1
      pod_namespace: go-demo-8
  pauses: 
    after: 10

We can see that we replaced ...
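
The configuration section of the definition declares node_label as a value read from the NODE_LABEL environment variable, and the delete-node action refers to it as ${node_label}. The following is a minimal sketch of that idea in plain Python. It is not Chaos Toolkit's actual code; the helper names resolve_configuration and substitute, and the label value, are hypothetical and used only for illustration.

```python
import os
import re


def resolve_configuration(configuration, environ=os.environ):
    """Resolve entries shaped like {"type": "env", "key": NAME} from the environment."""
    resolved = {}
    for name, spec in configuration.items():
        if isinstance(spec, dict) and spec.get("type") == "env":
            # Look up the environment variable named by "key".
            resolved[name] = environ[spec["key"]]
        else:
            # Plain values are passed through unchanged.
            resolved[name] = spec
    return resolved


def substitute(value, resolved):
    """Replace ${name} placeholders in a string with the resolved values."""
    return re.sub(r"\$\{(\w+)\}", lambda m: str(resolved[m.group(1)]), value)


# Mirrors the "configuration" section of node-delete.yaml.
configuration = {"node_label": {"type": "env", "key": "NODE_LABEL"}}

# Hypothetical label; a real cluster would use one of its own node labels.
fake_env = {"NODE_LABEL": "example-label=example-value"}

resolved = resolve_configuration(configuration, fake_env)
label_selector = substitute("${node_label}", resolved)
print(label_selector)
```

In other words, the experiment file itself does not name the nodes. Whatever label is exported as NODE_LABEL before the experiment runs decides which nodes are candidates for deletion.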
