Let's say I have 3 instances of my service. I get a spike of traffic and the Kubernetes HPA decides to scale up to 4. For about 90 seconds after it starts, the new pod consumes about 400% CPU. That skews the average, so the HPA scales from 4 to 5, and the cycle repeats until the average is low enough to absorb a new instance's startup spike, at which point it scales back down. How do I tell the Kubernetes HPA to ignore CPU metrics for 90 seconds after it decides to scale up?
Current HPA Configuration:
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
  namespace: my-namespace
  labels:
    app: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 15
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 95
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 95
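
To put numbers on the cycle I described, here is a rough sketch of the core ratio the HPA documentation gives, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), using the 95% CPU target from the config above. The function name and the per-pod CPU figures are made up for illustration, and the sketch ignores the controller's tolerance band and any special handling of pods that are not yet ready, so treat it as an approximation rather than what the controller actually does.

```python
import math
from typing import Sequence


def desired_replicas(pod_cpu_percent: Sequence[float], target_percent: float) -> int:
    """Simplified HPA ratio: ceil(currentReplicas * averageUtilization / target).

    Assumption: every pod in the list is counted, including one still starting up.
    """
    current_replicas = len(pod_cpu_percent)
    average = sum(pod_cpu_percent) / current_replicas
    return math.ceil(current_replicas * average / target_percent)


# Hypothetical traffic spike: three pods at 110% of their CPU request.
print(desired_replicas([110, 110, 110], 95))        # 4

# Load spreads to 82.5% per warm pod, but the new pod burns 400% while starting.
print(desired_replicas([82.5, 82.5, 82.5, 400], 95))  # 7

# The same load once the new pod has warmed up: 4 replicas would have been enough.
print(desired_replicas([82.5, 82.5, 82.5, 82.5], 95))  # 4
```

With these made-up numbers the startup spike alone pushes the recommendation past 4, which is the overshoot I am trying to avoid.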
I have also done some research, and it appears I can use something like the below, but I cannot find much good information on stabilizationWindowSeconds and policies for the scaleUp behavior.
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
    selectPolicy: Max
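
As far as I understand it, the policies above only cap how many replicas can be added within each periodSeconds window, and selectPolicy: Max picks whichever policy allows the larger change. Here is a minimal sketch of that reading. The function name is hypothetical, and it assumes no scaling has already happened inside the current 15-second period and that Percent rounds up; the real controller tracks the replica count at the start of the period.

```python
import math
from typing import Dict, List, Union

Policy = Dict[str, Union[str, int]]


def scale_up_limit(current_replicas: int, policies: List[Policy],
                   select_policy: str = "Max") -> int:
    """Highest replica count the scaleUp policies would allow in one period.

    Assumption: no scale-up has already happened inside the current period.
    """
    allowed = []
    for policy in policies:
        if policy["type"] == "Percent":
            # Percent: may add up to value% of the current replicas (rounded up).
            allowed.append(current_replicas
                           + math.ceil(current_replicas * policy["value"] / 100))
        elif policy["type"] == "Pods":
            # Pods: may add up to a fixed number of pods.
            allowed.append(current_replicas + policy["value"])
    # Max takes the most permissive policy, Min the most restrictive.
    return max(allowed) if select_policy == "Max" else min(allowed)


policies = [
    {"type": "Percent", "value": 100, "periodSeconds": 15},
    {"type": "Pods", "value": 4, "periodSeconds": 15},
]

print(scale_up_limit(3, policies))   # 7  (Pods wins: 3 + 4 beats 3 + 3)
print(scale_up_limit(10, policies))  # 20 (Percent wins, before the maxReplicas cap)
```

If that reading is right, these policies limit the size of each step but do nothing to make the HPA wait out the 90-second startup spike.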