Highlight from What You Need to Know About Kubernetes Autoscaling

Because the Horizontal Pod Autoscaler (HPA) relies on observed metrics such as CPU and memory usage to decide when to scale pods, there can be a delay between the moment demand increases and the moment additional pods are running and ready to serve it. During that delay, response times can slow, temporarily degrading performance for end users.
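
To make the source of that delay concrete, here is a minimal Python sketch, not an actual Kubernetes API. The `desired_replicas` function follows the replica formula described in the Kubernetes HPA documentation (desired = ceil(current replicas × current metric / target metric), with a default tolerance of 0.1). The `worst_case_lag_seconds` helper and every number passed to it are hypothetical, chosen only to illustrate how the stages of the reaction add up; real intervals depend on the metrics pipeline and on how long the pods take to start.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Replica count HPA would aim for, per the documented formula.

    Assumes a single metric; the real controller also handles missing
    metrics, not-yet-ready pods, and min/max replica bounds.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band, HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def worst_case_lag_seconds(metrics_interval_s, hpa_sync_period_s, pod_startup_s):
    """Hypothetical upper bound on reaction time.

    The load spike must first show up in the collected metrics, then be
    seen by the next HPA evaluation, and only then do new pods start.
    """
    return metrics_interval_s + hpa_sync_period_s + pod_startup_s


# Illustrative values (assumptions): average CPU utilization is 90%
# against a 60% target, across 4 replicas.
replicas = desired_replicas(current_replicas=4, current_metric=90, target_metric=60)
lag = worst_case_lag_seconds(metrics_interval_s=15, hpa_sync_period_s=15, pod_startup_s=30)
print(f"scale to {replicas} replicas, up to ~{lag}s after the spike")
```

With these assumed inputs, HPA would ask for 6 replicas, but under this simple model users could wait up to about a minute before the extra capacity is serving traffic. HPA only reacts after the metric has already moved, so the lag is built into the approach rather than being a misconfiguration.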