Monitoring and Alerting with Prometheus
In this lesson, we will discuss why we need a proper monitoring system to carry out chaos experiments.
We'll cover the following...
As I already mentioned, the critical ingredient that Chaos Toolkit does not provide is notifications whether a part of the system failed. Steady-state hypotheses are focused on what we know, and they are usually limited to a single application, network, storage, or node. By their nature, they are limited in their scope.
As you already know, we do need a proper monitoring system. We need to gather the metrics, and we are already doing that ...