Implementing Fault Injection and Chaos Tests

Learn what Chaos Mesh is and how it supports chaos engineering.

Suites such as Gremlin aim to make chaos engineering easier to adopt, because the Chaos Monkey suite popularized by Netflix can have a steep learning curve. For Kubernetes-based applications, a popular framework for chaos testing is Chaos Mesh. While it has not been around as long as suites such as Chaos Monkey, Chaos Mesh is a solid choice for implementing chaos experiments in a uniform way against a modular orchestration engine such as Kubernetes. Chaos Mesh is also an incubating project with the Cloud Native Computing Foundation (CNCF), which exposes it to a larger open-source community.

It’s interesting to note that Azure Chaos Studio, which we’ll be using to set up some baseline experiments, relies on Chaos Mesh behind the scenes to orchestrate and run experiments against Kubernetes-based targets. Chaos Studio also allows for agent-based installations on virtual machines, virtual machine scale sets, and other resources, giving us more control over where and how experiments can be run. We’ll start by diving into Chaos Mesh before moving on to orchestrating experiments using Azure Chaos Studio.

Starting with Chaos Mesh

First, we’ll look at installing and using Chaos Mesh to get a feel for how experiments can be structured to impact services, namespaces, and even hosts within the Kubernetes cluster. It can be installed in any valid Kubernetes environment, from minikube to Docker Desktop to cloud-hosted Kubernetes clusters. Each experiment is created as a custom resource, an instance of one of the Custom Resource Definitions (CRDs) that Chaos Mesh adds to the cluster. This lets us capture every experiment we want to run in its own .yaml file, which can then be applied during a pipeline or test run.
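To make the idea of an experiment captured in a .yaml file more concrete, here is a minimal sketch of a pod-kill experiment. The experiment name, the target namespace, and the app label are hypothetical placeholders chosen for illustration, and the sketch assumes Chaos Mesh is already installed in its default chaos-mesh namespace.

```yaml
# Minimal PodChaos sketch: kill one randomly chosen pod that matches the selector.
# All names and labels below are illustrative assumptions, not values from the course.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-example      # hypothetical experiment name
  namespace: chaos-mesh       # namespace where the experiment resource is stored
spec:
  action: pod-kill            # fault to inject: terminate the selected pod
  mode: one                   # affect a single pod picked from the matches
  selector:
    namespaces:
      - default               # hypothetical namespace of the target workload
    labelSelectors:
      app: demo-app           # hypothetical label on the target pods
```

Saved to a file such as pod-kill.yaml (a placeholder name), a manifest like this could be applied with kubectl apply -f pod-kill.yaml, either by hand or as a step in a pipeline.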

If we happen to be using a Linux distribution, Windows Subsystem for Linux, or macOS, the one-liner provided on the Chaos Mesh home page can get us started with the installation process.
