We have deployed the Backstage application on an AKS cluster, fronted by the NGINX Ingress Controller, but the application is not stable: it works for about 2 minutes, then goes down for a long time (it works intermittently). We cannot find the root cause, and there are no errors in the pod logs. The same application works in another environment without any issues.
We see HTTP 502 errors intermittently.
We tried redeploying both NGINX and the Backstage application, but it did not help.
Intermittent issues like the one you're experiencing with your Backstage application on an AKS cluster can be challenging to diagnose, especially when there are no clear error messages in the pod logs. This is more of a troubleshooting question. However, the HTTP 502 error suggests a problem at the gateway/proxy layer (in this case, NGINX), which is failing to get a valid response from your Backstage application.
Below is an example of deploying Backstage on AKS using the Helm chart. (If you want to take the manual deployment route with plain manifests, you can do that as well.) For production deployments, the image reference will usually be a full URL to a repository on your container registry, for example `arkocr.azurecr.io/backstage`. For this example's sake I will be using Helm.
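A minimal sketch of the Helm install, assuming the community Backstage chart repo at `backstage.github.io/charts` and reusing the example registry name from above (replace the registry, repository, and tag with your own; check the chart's values file for the exact keys in your chart version):

```shell
# Add the Backstage chart repository and refresh the index
helm repo add backstage https://backstage.github.io/charts
helm repo update

# Install into a dedicated namespace, pointing the chart at your image
# (arkocr.azurecr.io is the example registry from this answer)
helm install backstage backstage/backstage \
  --namespace backstage --create-namespace \
  --set backstage.image.registry=arkocr.azurecr.io \
  --set backstage.image.repository=backstage \
  --set backstage.image.tag=latest
```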
Then create an Ingress resource that routes traffic to your Backstage service, save it to a file, and apply it with `kubectl apply -f <yourfilename>`. After the NGINX Ingress Controller is successfully installed it is assigned an external IP; once your Backstage pods are up, you can access the app in your browser.
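A minimal Ingress sketch for the NGINX ingress class (the host name is a placeholder, and the service name/port assume the Helm chart's defaults — verify them with `kubectl get svc -n backstage`):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: backstage
  namespace: backstage
spec:
  ingressClassName: nginx
  rules:
    - host: backstage.example.com   # placeholder; use your DNS name
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backstage     # service created by the Helm chart
                port:
                  number: 7007      # default Backstage backend port
```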

You can confirm the service and its external IP with `kubectl get svc -n backstage`. Now, coming to your intermittent issue: if you are running this on AKS, ensure that the necessary firewall rules (for example, NSG inbound rules allowing ports 80/443 to the load balancer's public IP) are in place to allow traffic to the Ingress Controller, and work through the checks below.
By following these steps, you can ensure that the necessary firewall rules are in place for your AKS cluster, allowing external traffic to reach your NGINX Ingress Controller. If the issue persists, check the logs of both the ingress controller and the Backstage pods.
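A few commands for checking those logs, assuming the controller was installed as the `ingress-nginx-controller` deployment in the `ingress-nginx` namespace (adjust names to your install — the controller's access log will show the upstream errors behind the 502s):

```shell
# Ingress controller logs: look for "upstream" / connect errors
# around the times the 502s occur
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller --since=1h | grep -i upstream

# Backstage pod restarts, failed probes, OOMKills, etc.
kubectl describe pod -n backstage -l app.kubernetes.io/name=backstage
kubectl get events -n backstage --sort-by=.lastTimestamp
```

Intermittent 502s with clean application logs often point at the pod being restarted or failing its readiness probe, which the events output above will reveal.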
Also inspect the resource usage (CPU, memory) of the Backstage pods to see if there is resource exhaustion, which could cause the pods to become unresponsive.
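To inspect usage against the configured limits (this relies on metrics-server, which AKS enables by default; the `backstage` namespace is from the example above):

```shell
# Current CPU/memory usage per pod
kubectl top pods -n backstage

# Compare against the requests/limits set on the pods;
# a pod hitting its memory limit is OOMKilled and restarted,
# which would produce exactly this kind of intermittent 502
kubectl describe pod -n backstage -l app.kubernetes.io/name=backstage | grep -A 6 "Limits:"
```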
Reference Documents:
- Official Backstage guide
- Backstage on K8s example
- Similar thread