Horizontal Pod Autoscaler: Metrics-Based Scaling

by Alexis Kinsella·May 6, 2026·16 min read

You sized your Deployment for Friday's traffic spike, and now you're paying for those extra replicas the other six days. The HorizontalPodAutoscaler fixes that: it watches metrics, computes the replica count your workload actually needs, and adjusts the Deployment's scale automatically, re-evaluating every 15 seconds by default. No cron jobs, no manual intervention, and far less idle capacity.
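To make "computes the replica count" concrete, here is a minimal Python sketch of the core ratio formula described in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The function name and arguments are illustrative, not part of any Kubernetes API. The 0.1 tolerance mirrors the controller's documented default. The real controller also clamps the result to minReplicas and maxReplicas and accounts for missing metrics and not-yet-ready Pods, which this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Illustrative sketch of the HPA ratio formula (not the controller's code).

    current_metric and target_metric must use the same unit, for example
    average CPU utilization as a percentage of the Pods' requests.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band the controller skips scaling to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up.
    return math.ceil(current_replicas * ratio)


# 4 replicas averaging 90% CPU against a 60% target: scale out.
scale_out = desired_replicas(4, 90, 60)
# 4 replicas averaging 63% CPU against a 60% target: inside tolerance, no change.
steady = desired_replicas(4, 63, 60)
# 10 replicas averaging 20% CPU against a 60% target: scale in.
scale_in = desired_replicas(10, 20, 60)
```

Because the result is rounded up, the formula errs toward keeping slightly more capacity than the raw ratio suggests.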

HPA is a core competency in the CKA Workloads & Scheduling domain (15% of the exam). You need both the imperative shortcut (kubectl autoscale) and the declarative autoscaling/v2 manifest. Beyond the exam, understanding the scaling algorithm, behavior configuration, and custom metrics pipeline is what separates "it autoscales" from "it autoscales predictably under load."
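As a rough illustration of the declarative form, the Python sketch below builds the fields of an autoscaling/v2 HorizontalPodAutoscaler and prints them as JSON, which kubectl apply -f accepts alongside YAML. The Deployment name "web", the replica bounds, and the 60% CPU target are assumptions chosen for the example, and hpa_manifest is a hypothetical helper, not part of any library.

```python
import json


def hpa_manifest(name, min_replicas, max_replicas, cpu_target_percent):
    """Hypothetical helper: return an autoscaling/v2 HPA object as a dict."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count the HPA manages.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        # Utilization is measured against the Pods' CPU requests.
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_target_percent,
                        },
                    },
                }
            ],
        },
    }


# Assumed example values: a Deployment named "web", 2 to 10 replicas, 60% CPU.
print(json.dumps(hpa_manifest("web", 2, 10, 60), indent=2))
```

The imperative shortcut for roughly the same object has long been kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=60. Flag names can change between kubectl releases, so check kubectl autoscale --help on the version you are using.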
