What Is Kubernetes HPA and How Can It Help You Save on the Cloud?

Autoscaling is a core capability of Kubernetes. The more tightly you configure its scaling mechanisms (HPA, VPA, and Cluster Autoscaler), the less waste you carry and the lower the cost of running your application.

Kubernetes comes with three types of autoscaling mechanisms: Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler. Each of these contributes something different to the overarching goal of autoscaling for cloud cost optimization.

Autoscaling an Amazon Elastic Kubernetes Service cluster

In this article, we consider the two most common methods for autoscaling an EKS cluster:

  • Horizontal Pod Autoscaler (HPA)
  • Cluster Autoscaler (CA)

The Horizontal Pod Autoscaler, or HPA, is a Kubernetes component that automatically scales your service based on metrics such as CPU utilization, as reported through the Kubernetes Metrics Server. The HPA scales the number of pods in either a Deployment or a ReplicaSet, and is implemented as a Kubernetes API resource and a controller. The controller manager queries resource utilization and compares it against the metrics specified in each HorizontalPodAutoscaler definition. It obtains the metrics from either the resource metrics API (for per-pod resource metrics) or the custom metrics API (for any other metrics).
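To make the comparison step concrete, here is a minimal Python sketch of the scaling rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function name desired_replicas and its parameters are hypothetical and chosen only for illustration. The 0.1 tolerance mirrors the documented default, and the minimum and maximum bounds stand in for the limits set in an HPA definition. The real controller also handles missing metrics, pods that are not yet ready, and stabilization windows, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Illustrative approximation of the HPA scaling rule (not the real controller).

    All names and defaults here are assumptions made for this sketch.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")

    ratio = current_value / target_value

    # If the observed metric is close enough to the target, do not scale.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        # ceil(currentReplicas * currentMetricValue / desiredMetricValue)
        desired = math.ceil(current_replicas * current_value / target_value)

    # Keep the result inside the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 50% target: scale out.
    print(desired_replicas(4, 90, 50))
    # 4 pods averaging 20% CPU against a 50% target: scale in.
    print(desired_replicas(4, 20, 50))
```

In this sketch, a workload running well above its target gains pods and one running well below it loses pods. That is how a well-chosen target helps reduce waste: idle replicas are removed once utilization drops.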