Pausing After Actions

In this lesson, we will find out why our experiment did not fail and see how pausing before or after actions gives the system enough time to perform its tasks.

We'll cover the following...

- Why didn’t the experiment fail?
- Inspecting the definition of terminate-pod-pause.yaml
- Running chaos experiment and inspecting the output
- Recreating the failed pods

Why didn’t the experiment fail?

In our previous experiment, we validated the state before and after the actions. We checked whether the Pod existed, terminated it, and then verified whether it still existed. The experiment should have failed, but it didn’t, because all those probes and actions were executed immediately, one after another.

When Chaos Toolkit sent the instruction to destroy the Pod to the Kubernetes API, it received an acknowledgment of that action. It then immediately validated whether the Pod was still there, and it was. Kubernetes did not have enough time to remove it entirely. Perhaps the Pod was still running and we were simply too fast, or perhaps it was already terminating. ...
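To make the timing issue concrete, here is a minimal sketch of how a pause can be attached to an action in a Chaos Toolkit experiment. It is not the lesson's terminate-pod-pause.yaml. The label selector, the namespace, and the 10-second value are placeholders chosen for illustration, and the sketch assumes the chaostoolkit-kubernetes extension is installed so that the `chaosk8s.pod.actions` module is available.

```yaml
# Hypothetical fragment of an experiment's method section.
# The label, namespace, and pause length are assumptions, not values from the lesson.
method:
- type: action
  name: terminate-pod
  provider:
    type: python
    module: chaosk8s.pod.actions
    func: terminate_pods
    arguments:
      label_selector: app=my-app   # placeholder label
      rand: true                   # pick one matching Pod at random
      ns: my-namespace             # placeholder namespace
  pauses:
    after: 10                      # wait 10 seconds before moving on
```

With `pauses.after` set, Chaos Toolkit waits the given number of seconds once the action returns, so Kubernetes has time to finish removing the Pod before the steady-state hypothesis is checked again. A suitable value depends on how long termination takes in a given cluster.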
