Kubernetes container keeps crashing


I am having an issue where a custom-built RabbitMQ image works in Docker but continuously restarts within Kubernetes.

Dockerfile:

# syntax=docker/dockerfile:1
FROM rabbitmq:management-alpine
ADD rabbitmq.conf /etc/rabbitmq/
ADD definitions.json /etc/rabbitmq/

ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 4369 5671 5672 15691 15692 25672
CMD ["rabbitmq-server"]

When I run it with a simple docker run <IMAGE>, the logs indicate a successful startup and the service is clearly running in the container:

...
2022-11-25 16:37:41.392367+00:00 [info] <0.229.0> Importing concurrently 7 exchanges...
2022-11-25 16:37:41.394591+00:00 [info] <0.229.0> Importing sequentially 1 global runtime parameters...
2022-11-25 16:37:41.395691+00:00 [info] <0.229.0> Importing concurrently 7 queues...
2022-11-25 16:37:41.400586+00:00 [info] <0.229.0> Importing concurrently 7 bindings...
2022-11-25 16:37:41.403519+00:00 [info] <0.787.0> Resetting node maintenance status
2022-11-25 16:37:41.414900+00:00 [info] <0.846.0> Management plugin: HTTP (non-TLS) listener started on port 15672
2022-11-25 16:37:41.414963+00:00 [info] <0.874.0> Statistics database started.
2022-11-25 16:37:41.415003+00:00 [info] <0.873.0> Starting worker pool 'management_worker_pool' with 3 processes in it
2022-11-25 16:37:41.423652+00:00 [info] <0.888.0> Prometheus metrics: HTTP (non-TLS) listener started on port 15692
2022-11-25 16:37:41.423704+00:00 [info] <0.787.0> Ready to start client connection listeners
2022-11-25 16:37:41.424455+00:00 [info] <0.932.0> started TCP listener on [::]:5672
 completed with 4 plugins.
2022-11-25 16:37:41.448054+00:00 [info] <0.787.0> Server startup complete; 4 plugins started.
2022-11-25 16:37:41.448054+00:00 [info] <0.787.0>  * rabbitmq_prometheus
2022-11-25 16:37:41.448054+00:00 [info] <0.787.0>  * rabbitmq_management
2022-11-25 16:37:41.448054+00:00 [info] <0.787.0>  * rabbitmq_web_dispatch
2022-11-25 16:37:41.448054+00:00 [info] <0.787.0>  * rabbitmq_management_agent

However, if I take this image and deploy it within my Kubernetes cluster (minikube), the pod seems to start and then goes into a "CrashLoopBackOff" state.

kubectl logs <POD> returns:

Segmentation fault (core dumped)
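
For a restarting container, kubectl logs shows only the current instance by default, so a single line like the one above may not be the whole story. A minimal sketch of how both instances could be captured, assuming kubectl is already pointed at the cluster; the helper name pod_crash_logs is hypothetical, and the namespace is passed in because it is masked above:

```shell
# Hypothetical helper: print logs from the current container instance and
# from the previously terminated one. Assumes a configured kubectl.
pod_crash_logs() {
  pod="$1"
  ns="${2:-default}"   # namespace defaults to "default" if not given

  echo "--- current instance ---"
  kubectl logs "$pod" -n "$ns" || true

  echo "--- previous instance ---"
  # --previous asks for the logs of the last terminated container in the pod
  kubectl logs "$pod" -n "$ns" --previous || true
}
```

It would be called as pod_crash_logs rabbitmq-0 <NAMESPACE>. Whether the previous instance logged anything beyond the segmentation fault is not known from the output shown here.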

and kubectl describe pod <POD> returns:

Name:             rabbitmq-0
Namespace:        *****
Priority:         0
Service Account:  *****
Node:             minikube/*****
Start Time:       Thu, 24 Nov 2022 00:35:28 -0500
Labels:           app=rabbitmq
                  controller-revision-hash=rabbitmq-75d6d74c5d
                  statefulset.kubernetes.io/pod-name=rabbitmq-0
Annotations:      <none>
Status:           Running
IP:               *****
IPs:
  IP:           *****
Controlled By:  StatefulSet/rabbitmq
Containers:
  rabbitmq-deployment:
    Container ID:   docker://32930809a10ced998083d8adacec209da7081b7c7bfda605f7ac87f78cf23fda
    Image:          *****/<POD>:latest
    Image ID:       *****
    Ports:          5672/TCP, 15672/TCP, 15692/TCP, 4369/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Nov 2022 00:41:26 -0500
      Finished:     Thu, 24 Nov 2022 00:41:27 -0500
    Ready:          False
    Restart Count:  6
    Liveness:       exec [rabbitmq-diagnostics status] delay=60s timeout=15s period=60s #success=1 #failure=3
    Readiness:      exec [rabbitmq-diagnostics ping] delay=20s timeout=10s period=60s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sst9x (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-sst9x:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  35h                 default-scheduler  Successfully assigned mpa/rabbitmq-0 to minikube
  Normal   Pulled     35h                 kubelet            Successfully pulled image "*****" in 622.632929ms
  Normal   Pulled     35h                 kubelet            Successfully pulled image "*****" in 233.765678ms
  Normal   Pulled     35h                 kubelet            Successfully pulled image "*****" in 203.932962ms
  Normal   Pulling    35h (x4 over 35h)   kubelet            Pulling image "*****"
  Normal   Created    35h (x4 over 35h)   kubelet            Created container rabbitmq-deployment
  Normal   Started    35h (x4 over 35h)   kubelet            Started container rabbitmq-deployment
  Normal   Pulled     35h                 kubelet            Successfully pulled image "*****" in 212.459802ms
  Warning  BackOff    35h (x52 over 35h)  kubelet            Back-off restarting failed container

The section of that describe command that states:

    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed

makes me wonder whether the process is not being kept running. It's almost as if RabbitMQ starts and then exits once it has initialized.
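
The recorded termination state can also be read directly from the pod status rather than from the describe output. The sketch below is only an illustration: the function names last_termination and explain_exit_code are hypothetical, and the exit-code reading relies on the usual shell convention that a process killed by signal N is reported as 128 + N (so a segmentation fault, SIGSEGV, would normally appear as 139).

```shell
# Hypothetical helper: print name, reason, exit code and signal of the last
# terminated instance of each container in a pod. Assumes a configured kubectl.
last_termination() {
  pod="$1"
  ns="${2:-default}"
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\t"}{.lastState.terminated.signal}{"\n"}{end}'
}

# Hypothetical helper: interpret an exit code using the 128 + N convention.
explain_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    echo "killed by signal $((code - 128))"
  elif [ "$code" -eq 0 ]; then
    echo "exited normally"
  else
    echo "exited with error $code"
  fi
}
```

For example, last_termination rabbitmq-0 <NAMESPACE> followed by explain_exit_code on the reported code. In the describe output above the recorded code is 0 with reason Completed, which is what makes the "Segmentation fault" log line surprising.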

Is there something I am missing here? Thank you.

EDIT: kubectl get all gives:

NAME                                    READY   STATUS             RESTARTS        AGE
pod/auth-deployment-9cfd4c64f-c5v99     1/1     Running            0               19m
pod/config-deployment-d4f4c959c-dnspd   1/1     Running            0               20m
pod/rabbitmq-0                          0/1     CrashLoopBackOff   8 (4m45s ago)   20m

NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/auth-service     ClusterIP   10.101.181.223   <none>        8080/TCP   19m
service/config-service   ClusterIP   10.98.208.163    <none>        8080/TCP   20m

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/auth-deployment     1/1     1            1           19m
deployment.apps/config-deployment   1/1     1            1           20m

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/auth-deployment-9cfd4c64f     1         1         1       19m
replicaset.apps/config-deployment-d4f4c959c   1         1         1       20m

NAME                        READY   AGE
statefulset.apps/rabbitmq   0/1     20m