Pausing After Actions
Explore how adding pauses after termination actions in Kubernetes chaos engineering experiments improves accuracy. This lesson helps you understand why immediate validations may yield misleading results and demonstrates how pausing for a set time allows the system to complete tasks before verification, resulting in more realistic and reliable experiment outcomes.
Why didn’t the experiment fail?
In our previous experiment, we validated the state before and after the action. We checked whether the Pod existed, terminated it, and then checked whether it still existed. The experiment should have failed, but it didn’t, because the probes and the action were executed immediately one after another.
When Chaos Toolkit sent the instruction to destroy the Pod to the Kubernetes API, it received an acknowledgment of that action. It then immediately checked whether the Pod was still there, and it was. Kubernetes had not had enough time to remove it entirely. Perhaps the Pod was still running and we were simply too fast, or perhaps it was already terminating. ...
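The timing problem described above can be reproduced with a small simulation. This is only a sketch: `FakeCluster`, `run_experiment`, and the timing values are hypothetical names and numbers chosen for illustration. They are not part of Chaos Toolkit or the Kubernetes API. The fake cluster acknowledges a delete request at once but removes the Pod only after a short grace period, so a check made immediately afterwards still finds it.

```python
import time


class FakeCluster:
    """Hypothetical stand-in for a cluster; not the real Kubernetes API."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.pods = set()
        self.removal_deadlines = {}  # Pod name -> time at which it disappears

    def create_pod(self, name):
        self.pods.add(name)

    def delete_pod(self, name):
        # Acknowledge right away; the Pod lingers until the deadline passes.
        self.removal_deadlines[name] = time.monotonic() + self.grace_seconds
        return "accepted"

    def pod_exists(self, name):
        deadline = self.removal_deadlines.get(name)
        if deadline is not None and time.monotonic() >= deadline:
            self.pods.discard(name)
        return name in self.pods


def run_experiment(cluster, name, pause_after):
    """Probe, terminate, pause for pause_after seconds, then probe again."""
    before = cluster.pod_exists(name)
    cluster.delete_pod(name)
    time.sleep(pause_after)
    after = cluster.pod_exists(name)
    return before, after


# No pause: the second probe runs before the Pod is gone.
cluster = FakeCluster(grace_seconds=0.2)
cluster.create_pod("demo")
immediate = run_experiment(cluster, "demo", pause_after=0)

# With a pause longer than the grace period, the second probe sees the removal.
cluster = FakeCluster(grace_seconds=0.2)
cluster.create_pod("demo")
paused = run_experiment(cluster, "demo", pause_after=0.5)

print(immediate)  # (True, True)
print(paused)     # (True, False)
```

Without the pause, both probes report that the Pod exists, which mirrors the experiment that should have failed but passed. With the pause, the second probe reflects the state the system actually settles into.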