Now, let's autoscale this Deployment. We want at least 1 pod replica and at most 5 pod replicas, with the autoscaler targeting an average CPU utilization of 25%. Use the following command to create a HorizontalPodAutoscaler resource:
$ kubectl autoscale deployment nginx --cpu-percent=25 --min=1 --max=5
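If you prefer the declarative approach, the same HorizontalPodAutoscaler resource can be expressed as a manifest. Here is a minimal sketch using the autoscaling/v1 API, which you would apply with kubectl apply -f hpa.yaml:
# hpa.yaml -- declarative equivalent of the kubectl autoscale command above
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 25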
Now that we have created the HorizontalPodAutoscaler resource, we can load test the application using the hey load-testing utility, which comes preinstalled in Google Cloud Shell. Before you fire the load test, open a duplicate shell session and watch the Deployment resource using the following command:
$ kubectl get deployment nginx -w
Open another duplicate shell session and watch the HorizontalPodAutoscaler resource using the following command:
$ kubectl get hpa nginx -w
Now, in the original window, run the following command to fire a load test:
$ hey -z 120s -c 100 http://34.123.234.57
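Note that 34.123.234.57 is the external IP of the Service from the previous section; yours will differ. Assuming your Service is named nginx, you can look it up with the following command:
$ kubectl get svc nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}'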
This starts a load test that runs for 2 minutes, with 100 concurrent users continuously hammering the Service. In the window where you're watching the HorizontalPodAutoscaler resource, you will see output similar to the following. As soon as the load test starts, the average utilization jumps to 46%. The HorizontalPodAutoscaler resource waits for a short while, then increases the replicas, first to 2, then to 4, and finally to 5. When the test completes, the utilization drops quickly to 27%, 23%, and finally 0%. Once the utilization reaches 0%, the HorizontalPodAutoscaler resource gradually spins the replicas back down from 5 to 1:
$ kubectl get hpa nginx -w
NAME    REFERENCE          TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
nginx   deployment/nginx   <unknown>/25%   1         5         1          32s
nginx   deployment/nginx   46%/25%         1         5         1          71s
nginx   deployment/nginx   46%/25%         1         5         2          92s
nginx   deployment/nginx   92%/25%         1         5         4          2m2s
nginx   deployment/nginx   66%/25%         1         5         5          2m32s
nginx   deployment/nginx   57%/25%         1         5         5          2m41s
nginx   deployment/nginx   27%/25%         1         5         5          3m11s
nginx   deployment/nginx   23%/25%         1         5         5          3m41s
nginx   deployment/nginx   0%/25%          1         5         4          4m23s
nginx   deployment/nginx   0%/25%          1         5         2          5m53s
nginx   deployment/nginx   0%/25%          1         5         1          6m30s
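The gradual scale-down at the end of this output is not accidental: by default, the HorizontalPodAutoscaler applies a five-minute downscale stabilization window to avoid replica thrashing. On clusters that support the autoscaling/v2 API, you can tune this through the behavior field. The following manifest is a sketch of that option, not something this exercise requires:
# Sketch: the same HPA on autoscaling/v2, with a shorter downscale window
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 25
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60   # default is 300 seconds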
Likewise, in the other window, we can see the replica count of the Deployment change as the HorizontalPodAutoscaler resource applies its scaling decisions:
$ kubectl get deployment nginx -w
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   1/1     1            1           18s
nginx   1/2     1            1           77s
nginx   2/2     2            2           79s
nginx   2/4     2            2           107s
nginx   3/4     4            3           108s
nginx   4/4     4            4           109s
nginx   4/5     4            4           2m17s
nginx   5/5     5            5           2m19s
nginx   4/4     4            4           4m23s
nginx   2/2     2            2           5m53s
nginx   1/1     1            1           6m30s
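To understand why the HorizontalPodAutoscaler made each of these scaling decisions, you can inspect its events with the following command; the output will vary from run to run, so it is omitted here:
$ kubectl describe hpa nginx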
Besides CPU and memory, you can scale your workloads on other parameters, such as network traffic. You can also use external metrics, such as latency, to decide when to scale your workloads.
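For example, with the autoscaling/v2 API and a custom metrics adapter installed in the cluster (such as the Prometheus Adapter or the Custom Metrics Stackdriver Adapter), an HPA can target a per-pod request rate. The metric name below, http_requests_per_second, is hypothetical and depends entirely on what your adapter exposes:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical; exposed by your metrics adapter
      target:
        type: AverageValue
        averageValue: "100"              # scale so each pod serves ~100 requests/second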
Tip
While scaling on CPU and memory with the HorizontalPodAutoscaler resource is a good start, you should also consider scaling on external metrics such as response time and network latency. These metrics directly impact customer experience and are crucial to your business, so scaling on them leads to better reliability.
So far, we have been dealing with stateless workloads. In practice, however, some applications need to persist state. Let's look at some considerations for managing stateful applications.