Validating Application Availability

In this lesson, we will be running some chaos experiments to check if our application is highly available and remains accessible from outside.

Validating whether all the Pods are healthy and running is useful. But that does not necessarily mean that our application is accessible. Maybe the Pods are running, and everything is fantastic and peachy, but our customers cannot access our application.

Let’s see how we can validate whether we can send HTTP requests to our application and whether we can continue doing that after an instance of our app is destroyed. What would that definition look like?

Before we dive into application availability, we have a tiny problem that needs to be addressed. I couldn’t hard-code the address of our application in YAML because your IP is almost certainly different from mine. And neither of us is using a real domain because that would be too complicated to set up. That problem allows me to introduce you to yet another feature of Chaos Toolkit.

We are going to define a variable that can be injected into our definition.
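
Conceptually, injecting a variable means reading a value from the environment and substituting it into a placeholder in the definition. The following is a minimal sketch of that idea using only Python's standard library. It is an illustration, not Chaos Toolkit's actual implementation: the `resolve_configuration` helper is hypothetical, and the address `192.168.0.10` is a made-up placeholder, so yours will differ.

```python
import os
from string import Template


def resolve_configuration(configuration, environ=os.environ):
    """Turn entries shaped like {"type": "env", "key": NAME} into plain values.

    Hypothetical helper that mimics the idea of an env-backed configuration entry.
    """
    resolved = {}
    for name, spec in configuration.items():
        if isinstance(spec, dict) and spec.get("type") == "env":
            # Raises KeyError if the environment variable is not set.
            resolved[name] = environ[spec["key"]]
        else:
            resolved[name] = spec
    return resolved


# Stand-in for the real environment; the address is a placeholder.
fake_environ = {"INGRESS_HOST": "192.168.0.10"}

# Same shape as a configuration entry that reads INGRESS_HOST.
configuration = {"ingress_host": {"type": "env", "key": "INGRESS_HOST"}}

values = resolve_configuration(configuration, fake_environ)

# Substitute the resolved value into a ${...} placeholder.
url = Template("http://${ingress_host}/demo/person").substitute(values)
print(url)
```

The point of the indirection is that the definition itself stays identical for everyone, and only the environment variable changes from one machine to another.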

Inspecting the definition of health-http.yaml

Let’s take a quick look at yet another YAML.

cat chaos/health-http.yaml

The output is as follows.

version: 1.0.0
title: What happens if we terminate an instance of the application?
description: If an instance of the application is terminated, the applications as a whole should still be operational.
tags:
- k8s
- pod
- http
configuration:
  ingress_host:
      type: env
      key: INGRESS_HOST
steady-state-hypothesis:
  title: The app is healthy
  probes:
  - name: app-responds-to-requests
    type: probe
    tolerance: 200
    provider:
      type: http
      timeout: 3
      verify_tls: false
      url: http://${ingress_host}/demo/person
      headers:
        Host: go-demo-8.acme.com
method:
- type: action
  name: terminate-app-pod
  provider:
    type: python
    module: chaosk8s.pod.actions
    func: terminate_pods
    arguments:
      label_selector: app=go-demo-8
      rand: true
      ns: go-demo-8
  pauses: 
    after: 2

Compared to the previous definition, this one adds a configuration section and changes the steady-state-hypothesis. There are a few other changes as well. Some of those are cosmetic, while others are indeed important.
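
To make the `app-responds-to-requests` probe more concrete, here is a rough Python sketch of the check it describes: send a GET request to `/demo/person` with the `Host` header set to `go-demo-8.acme.com`, wait at most 3 seconds, and compare the response status with the tolerance of `200`. The function names are hypothetical and the code uses only the standard library. It approximates the behavior described in the YAML and is not how Chaos Toolkit implements its HTTP provider.

```python
import urllib.error
import urllib.request


def build_request(ingress_host):
    """Build the GET request described by the probe's url and headers."""
    url = f"http://{ingress_host}/demo/person"
    # The Host header tells the Ingress which application we want,
    # since we are addressing it by IP instead of by a real domain.
    return urllib.request.Request(url, headers={"Host": "go-demo-8.acme.com"})


def within_tolerance(status, tolerance=200):
    """The probe passes only if the status code equals the tolerance."""
    return status == tolerance


def probe(ingress_host, timeout=3):
    """Return True if the application answers with the expected status."""
    request = build_request(ingress_host)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses still carry a status code to compare.
        status = error.code
    except OSError:
        # Connection failures and timeouts count as a failed probe.
        return False
    return within_tolerance(status)
```

With the address of your Ingress exported as `INGRESS_HOST`, calling `probe(os.environ["INGRESS_HOST"])` (after `import os`) would perform roughly the same check by hand, which can be handy for confirming that the address is reachable before running the experiment.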

Checking the difference between health-pause.yaml and health-http.yaml

We won’t walk through the whole file because it is hard to see what’s different when compared to what we had before. Instead, we’ll output a diff between the old and the new version of the definition and comment on the differences.

diff chaos/health-pause.yaml chaos/health-http.yaml

The output is as follows.

3c3
< description: If an instance of
...