Implementing Autoscaling for Kubernetes Services

Learn the approaches available to configure autoscaling for Kubernetes services.

Kubernetes is a powerful orchestration platform that gives us fine-grained control over how few or how many system resources a given workload can access. Because Kubernetes can run on-premises, in the cloud on virtual machines, or as a managed service, there are several options for configuring autoscaling. These range from patterns built into Kubernetes itself, to features of cloud-managed services, to third-party plugins purpose-built for specific scenarios.

Native Kubernetes options

As an orchestrator, Kubernetes offers a rich ecosystem that allows us to use as little or as much of the cluster’s compute power as needed, in a variety of ways. Some features allow us to control how applications can scale out, depending on some of the primary indicators we covered in the previous section. In this section, we’ll start with a couple of native options that can be used in our cluster, wherever it resides.

Horizontal Pod Autoscalers

Horizontal Pod Autoscalers (HPAs) are constructs within Kubernetes that let us specify a target condition or threshold that a workload must reach before the cluster scales out and creates a new pod or set of pods. The common parameters of minimum and maximum replica counts apply here, giving us guardrails around how much scaling can occur. The other key input is the observed condition of the deployment itself, such as whether the scaling trigger is CPU or memory utilization.
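As a concrete illustration, a minimal HPA manifest might look like the following sketch. The Deployment name `web`, the replica bounds, and the 70% CPU target are assumptions chosen for this example, not values from the lesson:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                 # hypothetical name for this example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                   # assumes a Deployment named "web" exists
  minReplicas: 2                # guardrail: never scale below 2 pods
  maxReplicas: 10               # guardrail: never scale above 10 pods
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU across pods exceeds 70%
```

The same result can be created imperatively with `kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=10`, though a declarative manifest is easier to keep under version control.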

While CPU and memory are the two metrics supported out of the box, there are other ways to add metrics that control autoscaling. The table below lists other common targets that can be used, alone or in tandem, to determine how and when autoscaling should occur.

Kubernetes Metric Types Related to HPAs

Resource: Metrics directly related to available pod container resources (CPU, memory). For example:

    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70

Pods: Metrics related to the pods within a deployment, generally measured as an average across pods (e.g., packets per second). For example:

    - type: Pods
      pods:
        metric:
          name: packets-per-second
        target:
          type: AverageValue
          averageValue: 2k

Object: Metrics related to other objects within the cluster that can impact performance (e.g., requests per second on an Ingress). For example:

    - type: Object
      object:
        metric:
          name: requests-per-second
        describedObject:
          apiVersion: networking.k8s.io/v1
          kind: Ingress
          name: main-route
        target:
          type: Value
          value: 8k

External: Metrics related to objects outside of the cluster, such as a hosted queueing service. For example:

    - type: External
      external:
        metric:
          name: queue_messages_ready
          selector:
            matchLabels:
              queue: "worker_tasks"
        target:
          type: AverageValue
          averageValue: 20
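Tying these together, each entry above is an item in the `metrics` list of a single HPA spec. The following sketch combines a resource metric with an external one; the Deployment name `worker` and the specific thresholds are assumptions for illustration:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa              # hypothetical name for this example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker                # assumes a Deployment named "worker" exists
  minReplicas: 1
  maxReplicas: 15
  metrics:
    # Scale on average CPU utilization across the pods...
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    # ...and on queue depth reported by an external metrics provider.
    - type: External
      external:
        metric:
          name: queue_messages_ready
          selector:
            matchLabels:
              queue: "worker_tasks"
        target:
          type: AverageValue
          averageValue: 20
```

When multiple metrics are specified, the HPA computes a desired replica count for each metric and scales to the largest of them, so whichever pressure is highest wins.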

The Kubernetes documentation ...