Zero-downtime solution for a single-replica pod during node autoscaling

We are seeing 502 responses during node autoscaling with Cluster Autoscaler in our EKS cluster: pods with a single replica are out of service for a couple of minutes while they are moved from one node to another.

This works fine during a deployment rollout. During node scaling, however, when the workload runs a single replica, the old pod moves to “Terminating” while the new pod is still in “ContainerCreating” and not yet “Ready”.

We are aware that this can be fixed by running multiple replicas, but we want a solution that works with a single replica.

We are looking for a way to ensure the old pod is not terminated until the new pod is ready during node scaling.

There is 1 best solution below

abinet On

Setting maxUnavailable: 0 helps you avoid downtime during a rolling update. With this setting, Kubernetes waits until the new pod is Ready before terminating the old one:

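spec:
  revisionHistoryLimit: 1
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take the old pod down before its replacement is Ready
      maxSurge: 100%      # allow one extra pod to be created during the update
  template:
    spec:
      terminationGracePeriodSeconds: 400   # time the old pod gets to shut down cleanly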

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-unavailable
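
For context, here is a minimal sketch of how that strategy might sit in a complete Deployment. The Deployment name, labels, image, port, and probe path are placeholders I made up for illustration, not values from the question. The readinessProbe is an addition of mine: without one, a pod counts as Ready as soon as its containers have started, so the rollout could switch over before the application can actually serve traffic.

```yaml
# Illustrative only: name, labels, image, port, and probe path are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  revisionHistoryLimit: 1
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # keep the old pod until the new one is Ready
      maxSurge: 100%      # with replicas: 1, allows one additional pod
  template:
    metadata:
      labels:
        app: example-app
    spec:
      terminationGracePeriodSeconds: 400
      containers:
        - name: app
          image: example.com/example-app:1.0.0   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:                         # assumed HTTP health endpoint
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
            failureThreshold: 3
```

With replicas: 1, maxSurge: 100% and maxUnavailable: 0, a rollout may run at most two pods at once and must keep at least one available, so the old pod is removed only after the new one passes its readiness probe.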