Redis-Cluster in Kubernetes (GKE): Readiness probe fails after pod restart, pod remains in state "Running"


At our company we use redis-cluster in GKE and have run into the problem that the Redis pods are no longer reachable after a restart. Our GKE cluster implements a mechanism that randomly restarts pods in order to check the resilience of all software running in it.

The Redis pods remain in the state Running, and the restart appears to have succeeded, as shown in the pod log below. GKE only reports an event saying: Readiness probe failed: cluster_state:fail.
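To make the event message concrete: the text cluster_state:fail matches the cluster_state field of the Redis CLUSTER INFO reply, so the readiness probe presumably inspects that field. The following is a minimal sketch of such a check, not the chart's actual probe script. The function names and the two sample replies are hypothetical and only illustrate the key:value format.

```python
def parse_cluster_info(raw: str) -> dict[str, str]:
    """Parse the key:value lines of a Redis CLUSTER INFO reply into a dict."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def is_ready(raw: str) -> bool:
    """Return True only when the node reports cluster_state:ok."""
    return parse_cluster_info(raw).get("cluster_state") == "ok"


# Hypothetical sample replies, shaped like CLUSTER INFO output.
SAMPLE_FAIL = "cluster_state:fail\r\ncluster_slots_assigned:0\r\ncluster_known_nodes:1\r\n"
SAMPLE_OK = "cluster_state:ok\r\ncluster_slots_assigned:16384\r\ncluster_known_nodes:6\r\n"

print(is_ready(SAMPLE_FAIL))  # False
print(is_ready(SAMPLE_OK))    # True
```

Under this reading, a pod can stay Running (the process is up and answers connections) while never becoming Ready, because the node itself still reports a failed cluster state.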

We are using:

  • the helm chart redis-cluster 8.6.11
  • image bitnami/redis-cluster:7.2.4-debian-11-r0
  • standard settings for the liveness and readiness probes

Does anyone have an idea what is going wrong here? Should we adapt the probes somehow? Thanks in advance for any hints!

POD Log after restart:

redis-cluster 09:17:25.32 INFO  ==>
redis-cluster 09:17:25.33 INFO  ==> Welcome to the Bitnami redis-cluster container
redis-cluster 09:17:25.33 INFO  ==>
redis-cluster 09:17:25.33 INFO  ==> ** Starting Redis setup **
redis-cluster 09:17:25.51 INFO  ==> Initializing Redis
redis-cluster 09:17:25.53 INFO  ==> Setting Redis config file
Storing map with hostnames and IPs
redis-cluster 09:17:31.25 INFO  ==> ** Redis setup finished! **

WARNING: Changing databases number from 16 to 1 since we are in cluster mode
oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 15 Feb 2024 09:17:31.423 * Redis version=7.2.4, bits=64, commit=00000000, modified=0, pid=1, just started
Configuration loaded
1:M 15 Feb 2024 09:17:31.423 * monotonic clock: POSIX clock_gettime
[Redis ASCII-art startup banner]
Redis 7.2.4 (00000000/0) 64 bit
Running in cluster mode
Port: 6379
PID: 1
https://redis.io

1:M 15 Feb 2024 09:17:31.424 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 15 Feb 2024 09:17:31.424 * No cluster configuration found, I'm f9b36a66aac595e0d54f6d075bc1fc7dc3b9b99c
1:M 15 Feb 2024 09:17:31.428 * Server initialized
1:M 15 Feb 2024 09:17:31.430 * Creating AOF base file appendonly.aof.1.base.rdb on server start
1:M 15 Feb 2024 09:17:31.433 * Creating AOF incr file appendonly.aof.1.incr.aof on server start
**Ready to accept connections tcp**
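
One detail in the log that may be worth checking: Redis prints "No cluster configuration found, I'm <id>" when it starts in cluster mode without a saved cluster configuration file and generates a new node ID. If that is what happened here (an assumption, not something the log proves about the cause), the rest of the cluster would not know the restarted node under its new ID. The sketch below pulls the ID from the pod log and compares it with the IDs listed in CLUSTER NODES output from another node. The helper names and the CLUSTER NODES sample are hypothetical.

```python
import re

# Matches the startup line Redis logs when it generates a new node ID.
NEW_ID_PATTERN = re.compile(r"No cluster configuration found, I'm ([0-9a-f]{40})")


def fresh_node_id(log_text: str):
    """Return the newly generated node ID found in the log, or None."""
    match = NEW_ID_PATTERN.search(log_text)
    return match.group(1) if match else None


def known_node_ids(cluster_nodes_output: str) -> set:
    """Collect node IDs (first field of each line) from CLUSTER NODES output."""
    return {
        line.split()[0]
        for line in cluster_nodes_output.splitlines()
        if line.strip()
    }


# Log line taken from the pod log above.
POD_LOG = (
    "1:M 15 Feb 2024 09:17:31.424 * No cluster configuration found, "
    "I'm f9b36a66aac595e0d54f6d075bc1fc7dc3b9b99c\n"
)

# Hypothetical CLUSTER NODES output from a surviving node (IDs are made up).
OTHER_NODE_VIEW = (
    "1111111111111111111111111111111111111111 10.0.0.1:6379@16379 "
    "myself,master - 0 0 1 connected 0-5460\n"
    "2222222222222222222222222222222222222222 10.0.0.2:6379@16379 "
    "master,fail - 0 0 2 disconnected 5461-10922\n"
)

node_id = fresh_node_id(POD_LOG)
print(node_id)
print(node_id in known_node_ids(OTHER_NODE_VIEW))
```

If the restarted pod's ID does not appear in the other nodes' view, that would point at the cluster configuration file not surviving the restart rather than at the probe settings.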

Since pods in Kubernetes may be restarted for a number of reasons, we would expect a restarted pod in the Redis cluster to become reachable again.
