Suppose I have 3 nodes and a Deployment with 4 replicas in Kubernetes. I want to ensure that, whenever possible, no two replicas of the same Deployment are scheduled on the same node.
For example, when dealing with 2 replicas and 2 nodes, I can apply pod anti-affinity to prevent the assignment of both pods to a single node:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - frontend
          topologyKey: kubernetes.io/hostname
However, with 4 replicas, 3 pods are scheduled on 3 distinct nodes while the fourth stays in the Pending state: every node already runs a matching pod, so the required anti-affinity rule cannot be satisfied anywhere. Is there a way to place each replica on a separate node when one is available, and fall back to a node that already runs a replica otherwise?
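One direction that might fit is the soft form of the same rule, `preferredDuringSchedulingIgnoredDuringExecution`, which the scheduler treats as a weighted preference rather than a hard constraint. The following is only a minimal sketch, assuming the pods carry the label `app: frontend` as in the rule above; the weight of 100 is an arbitrary illustrative choice, not a recommended value.

```yaml
affinity:
  podAntiAffinity:
    # Soft rule: the scheduler prefers nodes without a matching pod,
    # but can still place a pod on such a node if no other node fits.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100            # assumed value; valid range is 1-100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - frontend
        topologyKey: kubernetes.io/hostname
```

Because this is a preference and is weighed together with the scheduler's other scoring criteria, it does not strictly guarantee an even spread; with 4 replicas on 3 nodes, the expectation is that the fourth pod lands on a node that already runs one replica instead of staying Pending.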