Canary Deployments with Flagger

Learn how to perform a canary deployment with Flagger.

Performing a canary deployment with Flagger

Canary deployments are a progressive delivery strategy supported by Flagger that allows the gradual rollout of a new application version, referred to as a canary. When a canary deployment is performed, Flagger will route an initial percentage of traffic to the canary and monitor its performance through metrics collected by Prometheus.

If the metrics appear healthy, Flagger will shift more traffic from the primary application to the canary while continuing to monitor it. This cycle of traffic increases and metric checks will continue until the canary reaches the configured maximum percentage of traffic it should receive. At this point, the canary will be promoted: it replaces the current version of the application and becomes the primary application version that receives all of the incoming traffic.
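As a rough sketch of how this progression is configured, the analysis section of a Flagger Canary resource controls the check interval, the size of each traffic step, and the maximum traffic weight before promotion. The field names below follow Flagger's Canary API, but the specific values are illustrative assumptions, not taken from this lesson:

```yaml
# Illustrative Flagger analysis settings (values are assumptions)
analysis:
  interval: 1m        # how often Flagger evaluates the canary's metrics
  threshold: 5        # failed checks allowed before the rollout is aborted
  maxWeight: 50       # maximum percentage of traffic the canary can receive
  stepWeight: 10      # traffic percentage added after each healthy check
  metrics:
  - name: request-success-rate   # built-in metric backed by Prometheus
    thresholdRange:
      min: 99         # minimum success rate (in %) considered healthy
    interval: 1m
```

With these values, traffic to the canary would grow in 10% steps every minute, up to 50%, as long as the success rate stays at or above 99%.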

Creating a canary deployment with Flagger

When using Flagger, progressive delivery strategies such as canary deployments are created as Kubernetes resources on the cluster. To support the canary deployment, the following resources must be created on the cluster:

  • Horizontal pod autoscaler

  • Ingress

  • Canary

To create these resources, we'll describe their specifications declaratively using YAML, just as we did for the other Kubernetes resources we've configured.

Horizontal pod autoscaler

A horizontal pod autoscaler increases the number of replicas of a pod as demand on the resource grows. As the pod receives more traffic and its resource usage rises, the horizontal pod autoscaler will automatically provision new pods to meet the increased demand. Once demand subsides, it will reduce the number of pod replicas, scaling down the unnecessary resources. Here's an example of the declarative specification for a horizontal pod autoscaler:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: python-sample-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: python-sample-deployment
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 99

Let's break down the specification to better understand how it scales pods.

Required Fields

  • Lines 1–4: This section of the specification should seem familiar since it is required for every Kubernetes resource. It defines the apiVersion, kind, and metadata for the resource we'll create on the cluster.

Specification

  • Lines 6–9: The scaleTargetRef object refers to the deployment whose pods run the application version within a container. This is the resource the horizontal pod autoscaler will scale as traffic on it increases.

  • Line 10: The minReplicas field determines the minimum number of pods that should be running on the cluster for the deployment.

  • Line 11: The maxReplicas field determines the maximum number of pods that should be running on the cluster for the deployment.

  • Line 12: The metrics field contains an array of metrics that are monitored and used to determine when the resource should be scaled. In this instance, we've configured the horizontal pod autoscaler to scale the resource up or down based on CPU utilization, targeting an average of 99% of the CPU requested by each pod.
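To show where the horizontal pod autoscaler fits into the canary deployment, a Flagger Canary resource can reference it through autoscalerRef alongside the targetRef deployment. Here's a minimal sketch, assuming the resource names used above; the Canary name and service port are illustrative:

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: python-sample-canary     # hypothetical name for illustration
spec:
  targetRef:                     # the deployment Flagger manages during rollouts
    apiVersion: apps/v1
    kind: Deployment
    name: python-sample-deployment
  autoscalerRef:                 # the horizontal pod autoscaler defined above
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: python-sample-hpa
  service:
    port: 80                     # illustrative service port
```

Referencing the autoscaler this way lets Flagger apply the same scaling behavior to both the canary and the primary workloads it manages.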