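K8s HPA Autoscaling

In one sentence: the HPA (Horizontal Pod Autoscaler) adds or removes Pods automatically based on CPU/memory utilization or custom metrics, so an application scales out at peak load and scales in when traffic drops, which protects performance while saving cost.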

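Core Concepts

Concept	Plain-language explanation
HPA	Horizontal Pod Autoscaler = adjusts the number of Pods automatically based on metrics
VPA	Vertical Pod Autoscaler = adjusts a Pod's CPU/memory requests automatically
Metrics Server	Collects CPU/memory usage for nodes and Pods and serves it through the Metrics API
Target Utilization	The resource usage ratio you want to maintain, measured against the Pod's requests
Scale-up	Increase the number of Pods
Scale-down	Decrease the number of Pods
Stabilization Window	A look-back period that keeps the replica count from flapping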

Installation and Setup

# Install Metrics Server (the data source for resource-metric HPAs)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# On minikube
minikube addons enable metrics-server

# Verify Metrics Server
kubectl top nodes  # show node resource usage
kubectl top pods   # show Pod resource usage

Basic Usage

1. CPU-based HPA

# Start with a Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 2  # start with 2 replicas
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:latest
          resources:
            requests:
              cpu: "200m"  # requests are required; the HPA computes utilization against them
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
---
# HPA configuration
apiVersion: autoscaling/v2  # use the v2 API
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app  # target Deployment
  minReplicas: 2   # at least 2 Pods
  maxReplicas: 10  # at most 10 Pods
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # target 70% average CPU utilization (relative to requests); scale out when above it
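
To see why `resources.requests` matters, here is a minimal Python sketch of how a utilization percentage can be derived from raw usage. It assumes utilization is total usage divided by total requests across the Pods. The function name and the sample usage numbers are illustrative and are not part of Kubernetes.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Return CPU utilization as an integer percentage of the requested CPU.

    usage_millicores: per-Pod CPU usage in millicores (hypothetical samples).
    request_millicores: per-Pod CPU request in millicores (200m in the Deployment above).
    """
    if not usage_millicores:
        raise ValueError("at least one Pod sample is required")
    if request_millicores <= 0:
        # Without a request there is nothing to divide by, which is why
        # utilization cannot be computed for Pods that omit resources.requests.
        raise ValueError("CPU request must be set and positive")
    total_usage = sum(usage_millicores)
    total_requests = request_millicores * len(usage_millicores)
    return total_usage * 100 // total_requests


# Three Pods, each requesting 200m, using 180m, 200m and 160m.
print(cpu_utilization_percent([180, 200, 160], 200))  # 90
```

With a 70% target, the 90% result above is what would push the HPA to add Pods.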

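# Create an HPA quickly from the command line
# (the HPA is named after the Deployment, i.e. web-app, not web-app-hpa)
kubectl autoscale deployment web-app --cpu-percent=70 --min=2 --max=10

# Check HPA status
kubectl get hpa  # list HPAs
kubectl describe hpa web-app-hpa  # show details of the HPA created from the YAML above

2. Multi-metric HPA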

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: multi-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # target CPU utilization: 70%

    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
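          averageUtilization: 80  # target memory utilization: 80%

  behavior:  # controls how fast scaling happens
    scaleUp:
      stabilizationWindowSeconds: 60  # consider the last 60 seconds of recommendations before scaling up
      policies:
        - type: Percent
          value: 100  # at most double the replica count per period
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300  # 5-minute window before scaling down (the default)
      policies:
        - type: Percent
          value: 10  # remove at most 10% of replicas per period
          periodSeconds: 60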
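
The effect of the two `Percent` policies above can be sketched in Python. This is a simplified model and not the controller's code: it ignores the stabilization window and assumes every step starts a fresh period. The rounding (a fractional number of Pods to remove is rounded up) is an assumption based on the worked example in the Kubernetes HPA documentation, and the function names are illustrative.

```python
def scale_down_floor(replicas, percent):
    """Lowest replica count reachable in one period under a Percent scale-down policy."""
    removable = -(-replicas * percent // 100)  # integer ceil of replicas * percent / 100
    return replicas - removable


def scale_up_ceiling(replicas, percent):
    """Highest replica count reachable in one period under a Percent scale-up policy."""
    addable = -(-replicas * percent // 100)  # integer ceil of replicas * percent / 100
    return replicas + addable


print(scale_down_floor(20, 10))  # 18
print(scale_down_floor(18, 10))  # 16
print(scale_up_ceiling(4, 100))  # 8
```

Under these assumptions, shrinking from 20 Pods takes several one-minute periods, while growing from 4 Pods can reach 8 within a single period.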

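3. Verify the HPA with a Load Test

# Generate load against the application (in another terminal)
# Assumes a Service named web-app-service exposes the Deployment
kubectl run load-generator --rm -it --image=busybox -- /bin/sh
# Once inside the shell, run:
while true; do wget -q -O- http://web-app-service; done

# Watch the HPA scale out
kubectl get hpa -w  # stream HPA status changes
# The TARGETS column shows current utilization / target utilization
# The REPLICAS column shows the Pod count increasing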

Advanced Usage

Custom-metric HPA

metrics:
  - type: Pods  # per-Pod custom metric
    pods:
      metric:
        name: http_requests_per_second  # custom metric name
      target:
        type: AverageValue
        averageValue: "100"  # average of 100 requests per second per Pod

  - type: Object  # metric describing another Kubernetes object in the same namespace
    object:
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: main-ingress
      metric:
        name: requests_per_second
      target:
        type: Value
        value: "2k"  # target 2000 requests per second in total on the Ingress

StatefulSet HPA

spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet  # HPA also supports StatefulSets
    name: mysql
  minReplicas: 1
  maxReplicas: 5
  # Note: a StatefulSet scales in order (with the default OrderedReady Pod management policy)

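Common Errors

Error message	Cause	Fix
`unable to fetch metrics`	Metrics Server is not installed or not ready	Install Metrics Server and confirm that `kubectl top pods` works
`missing request for cpu`	The Pod does not set resources.requests	Set CPU requests in the Deployment
`<unknown>/70%`	No metric data has been collected yet	Wait a few minutes for Metrics Server to collect data
Frequent scaling up and down (flapping)	stabilizationWindow is too short	Increase behavior.scaleDown.stabilizationWindowSeconds
No scale-down	Still inside the scale-down stabilization window	Scale-down waits 5 minutes by default; be patient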

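Cheat Sheet

# === HPA operations ===
kubectl autoscale deploy <name> --cpu-percent=70 --min=2 --max=10  # quick create
kubectl get hpa                 # list HPAs
kubectl describe hpa <name>     # show details
kubectl delete hpa <name>       # delete an HPA
kubectl get hpa -w              # watch in real time

# === HPA formula ===
# desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue)
# Example: 3 Pods, CPU utilization 90%, target 70%
# desired = ceil(3 × 90/70) = ceil(3.86) = 4

# === Default scaling behavior ===
# Scale-up: evaluated every 15 seconds, no stabilization window
# Scale-down: evaluated every 15 seconds, 5-minute stabilization window
# Multiple metrics: the largest desired replica count wins

# === Prerequisites ===
# 1. Metrics Server is installed
# 2. Pods must set resources.requests
# 3. Custom metrics need an adapter such as Prometheus Adapter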
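
The formula and the multi-metric rule from the cheat sheet can be checked with a short Python sketch. It is a simplification under stated assumptions: the 10% tolerance is the Kubernetes default for skipping small deviations and is not mentioned in the cheat sheet, the clamp to min/max replicas mirrors `minReplicas`/`maxReplicas`, and both function names are made up for illustration.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """ceil(currentReplicas * currentMetricValue / targetMetricValue), skipped within tolerance."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        return current_replicas
    return math.ceil(current_replicas * ratio)


def recommend(current_replicas, metrics, min_replicas, max_replicas):
    """Take the largest per-metric proposal, then clamp to [min_replicas, max_replicas].

    metrics: list of (current_value, target_value) pairs, one per metric.
    """
    proposals = [desired_replicas(current_replicas, cur, tgt) for cur, tgt in metrics]
    return max(min_replicas, min(max_replicas, max(proposals)))


print(desired_replicas(3, 90, 70))                # 4
print(recommend(3, [(90, 70), (85, 80)], 2, 10))  # 4
print(recommend(10, [(140, 70)], 2, 10))          # 10
```

In the second call, memory at 85% against an 80% target falls inside the tolerance and proposes no change, so the CPU proposal of 4 wins. In the third call the raw result of 20 is capped at the maximum of 10.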

Reference: K8s HPA documentation | Updated 2026