Scale up the Cluster
This lesson focuses on how to scale up the cluster and the rules which govern it.
Scale up the nodes
The objective is to scale the nodes of our cluster to meet the demand of our Pods. We want not only to increase the number of worker nodes when we need additional capacity, but also to remove them when they are underused. For now, we’ll focus on the former, and explore the latter afterward.
Let’s start by taking a look at how many nodes we have in the cluster.
kubectl get nodes
The output, from GKE, is as follows.
NAME             STATUS ROLES  AGE   VERSION
gke-devops25-... Ready  <none> 5m27s v1.9.7-gke.6
gke-devops25-... Ready  <none> 5m28s v1.9.7-gke.6
gke-devops25-... Ready  <none> 5m24s v1.9.7-gke.6
In your case, the number of nodes might differ. That’s not important. What matters is to remember how many you have right now since that number will change soon.
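Since we will compare node counts later, it can help to capture the current number instead of memorizing it. The following is a minimal sketch, not part of the original exercise; the helper name `count_ready_nodes` is hypothetical, and it assumes the default column layout of `kubectl get nodes`, where the second column is the status.

```shell
# Count the nodes whose STATUS column is exactly "Ready".
# Reads the output of `kubectl get nodes --no-headers` from stdin.
# count_ready_nodes is a hypothetical helper name used for illustration.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (requires access to the cluster):
#   kubectl get nodes --no-headers | count_ready_nodes
```

Running the same pipeline again after the cluster scales gives a number we can compare against the first one.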
Let’s take a look at the definition of the go-demo-5 application before we roll it out.
cat scaling/go-demo-5-many.yml
The output, limited to the relevant parts, is as follows.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: go-demo-5
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: api
        ...
        resources:
          limits:
            memory: 1Gi
            cpu: 0.1
          requests:
            memory: 500Mi
            cpu: 0.01
...
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: go-demo-5
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 15
  maxReplicas: 30
  ...
In this context, the only important part of the definition we are about to apply is the HPA connected to the api Deployment. Its minimum number of replicas is 15. Given that each api container requests 500Mi of memory, fifteen replicas request 7,500Mi in total (roughly 7.3Gi), which should be more than our cluster can sustain, assuming that it was created using one of the Gists. Otherwise, you might need to increase the minimum number of replicas.
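The arithmetic behind that claim can be checked with a short sketch. The values 15 and 500 come from the definition above; the helper names `total_requested_mi` and `sum_allocatable_mi` are hypothetical, and the second one assumes that the cluster reports allocatable memory in Ki, which may not hold everywhere.

```shell
# Total memory requested by a number of replicas, in Mi.
# $1 = number of replicas, $2 = memory request per Pod in Mi.
total_requested_mi() {
  echo $(( $1 * $2 ))
}

# Sum allocatable memory values expressed in Ki (e.g. "3786940Ki") and
# print the total in Mi, rounded down. Values in other units are ignored
# in this sketch.
sum_allocatable_mi() {
  tr ' ' '\n' | awk '/Ki$/ { sub(/Ki$/, ""); total += $1 } END { print int(total / 1024) }'
}

# Fifteen replicas requesting 500Mi each.
total_requested_mi 15 500   # prints 7500

# Usage (requires access to the cluster):
#   kubectl get nodes \
#     -o jsonpath='{.items[*].status.allocatable.memory}' \
#     | sum_allocatable_mi
```

If the second number is smaller than the first, the cluster cannot host all the replicas. Even when it is larger, the Pods might not fit, since system Pods and the db StatefulSet consume part of the allocatable memory and requests cannot be split across nodes.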
Let’s apply the definition and take a look at the HPAs.
kubectl apply \
-f scaling/go-demo-5-many.yml \
--record
kubectl -n go-demo-5 get hpa
The output of the latter command is as follows.
NAME REFERENCE      TARGETS                      MINPODS MAXPODS REPLICAS AGE
api  Deployment/api <unknown>/80%, <unknown>/80% 15      30      1        38s
db   StatefulSet/db <unknown>/80%, <unknown>/80% 3       5       1        40s
Not enough resources to host all Pods
It doesn’t matter if the targets are still unknown. They will be calculated soon, but we do not care about them right now. What matters is that the api HPA will scale the Deployment to at least 15 replicas.
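To see which HPAs have not yet reached their minimum, we could compare the REPLICAS and MINPODS columns. This is only a sketch; the helper name `hpa_below_min` is hypothetical, and it assumes the default column order of `kubectl get hpa`. It counts columns from the end of each line because the TARGETS column can contain spaces.

```shell
# Print the names of HPAs whose current replica count is below MINPODS.
# Reads the output of `kubectl get hpa` (with the header line) from stdin.
# Columns are addressed from the right: AGE is the last field, REPLICAS
# is the one before it, and MINPODS is three fields before the last.
hpa_below_min() {
  awk 'NR > 1 && $(NF-1) + 0 < $(NF-3) + 0 { print $1 }'
}

# Usage (requires access to the cluster):
#   kubectl -n go-demo-5 get hpa | hpa_below_min
```

An empty result means that every HPA in the Namespace already has at least its minimum number of replicas.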
Next, we need to wait for a few seconds before we take a look at the Pods in the go-demo-5 Namespace.
kubectl -n go-demo-5 get pods
The output is as follows.
NAME    READY STATUS            RESTARTS AGE
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   Pending           0        2s
api-... 0/1   Pending           0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 1        32s
api-... 0/1   Pending           0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   Pending           0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
db-0    2/2   Running           0        34s
db-1    0/2   ContainerCreating 0        34s
We can see that some of the api Pods are being created, while others are pending. There can be quite a few reasons why a Pod would enter the Pending state. In our case, there are not enough available resources to host all the Pods.
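With this many Pods, a summary by status is easier to read than the full list. The sketch below is one way to get it; the helper name `count_by_status` is hypothetical, and it assumes the default column layout of `kubectl get pods`, where the third column is the status.

```shell
# Count Pods per value of the STATUS column.
# Reads the output of `kubectl get pods --no-headers` from stdin and
# prints one "STATUS COUNT" pair per line, sorted by status name.
count_by_status() {
  awk '{ c[$3]++ } END { for (s in c) print s, c[s] }' | sort
}

# Usage (requires access to the cluster):
#   kubectl -n go-demo-5 get pods --no-headers | count_by_status
#
# To confirm why a Pod is pending, we could inspect the scheduling
# events. A shortage of resources typically shows up as FailedScheduling
# events with a message such as "Insufficient memory":
#   kubectl -n go-demo-5 get events \
#     --field-selector reason=FailedScheduling
```

The events are the more reliable source of the two, since the status column alone does not say why a Pod is pending.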