Problems deploying Flink using the Flink Kubernetes Operator with high availability

I have a Flink operator running on Kubernetes and I'm trying to deploy a Flink Deployment with high availability enabled. I configured the operator to enable leader election, and enabled Kubernetes high availability in the deployment chart. I run my Flink deployment with one task manager and three job managers, and I see inconsistent behaviour. Sometimes I can connect to the UI. When I restart the leader job manager, one of three things happens: sometimes a new leader is elected properly and I can access the UI; sometimes a new job manager takes over but the UI reports that a leader election is in progress (even though, according to the logs, the task manager connects and works); and sometimes the task manager can't connect to a job manager and fails with:

Could not resolve ResourceManager address... Could not connect to rpc endpoint...

I don't yet understand why any of these situations occur. If anyone has experienced any of them, I would appreciate some help and guidance.

I tried enabling leader election and setting the lease name, and changing the resources for the task manager and job managers.
