I'm currently using GPU passthrough in a Docker container, which I start with the following command:
sudo docker run -d \
--name=jellyfin \
--runtime=nvidia \
--gpus all \
-e NVIDIA_DRIVER_CAPABILITIES=all \
-e NVIDIA_VISIBLE_DEVICES=all \
jellyfin/jellyfin:latest
Initially the setup works fine: the container can access the GPU without issues, as shown in this screenshot.
However, after an unpredictable amount of time (anywhere from a day to a week), the GPU passthrough fails and the container can no longer use the GPU, as shown in this screenshot. Restarting the Docker container fixes it until the next failure.
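Since a restart brings the GPU back, the workaround could be automated with a small watchdog until the root cause is found. The following is only a minimal sketch, not a fix: the function names gpu_visible and restart_if_gpu_lost are made up for illustration, and it assumes nvidia-smi is available inside the container (the "utility" driver capability, which NVIDIA_DRIVER_CAPABILITIES=all includes, normally provides it).

```shell
#!/bin/sh
# Hypothetical watchdog sketch; these function names are illustrative
# and are not part of Docker, Jellyfin, or the NVIDIA tooling.

# Succeeds if nvidia-smi inside the named container lists at least one GPU.
# Assumption: nvidia-smi exists in the container.
gpu_visible() {
    docker exec "$1" nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

# Restarts the named container only when the GPU check fails.
restart_if_gpu_lost() {
    if gpu_visible "$1"; then
        echo "GPU still visible in $1"
    else
        echo "GPU not visible in $1, restarting"
        docker restart "$1" >/dev/null
    fi
}

# Example call (e.g. from a cron job or systemd timer):
#   restart_if_gpu_lost jellyfin
```

Running the check on a schedule would also record roughly when the GPU disappears, which may help narrow down what happens on the host at that moment.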
If anyone has any ideas or can help me out with this, I'd appreciate it.
Environment Details:
OS - Manjaro Linux
Docker Version - 25.0.4, build 1a576c5
I also looked at the Docker logs. Other than errors saying that it can't access the GPU, there was nothing of note:
[AVHWDeviceContext @ 0x55b4273e0440] cu->cuInit(0) failed -> CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Device creation failed: -542398533.
Failed to set value 'cuda=cu:0' for option 'init_hw_device': Generic error in an external library
Error parsing global options: Generic error in an external library
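To pin down when the failure first appears, the timestamped container logs could be searched for the first CUDA_ERROR_NO_DEVICE line and that time compared against host events (package updates, service reloads, suspend/resume). This is a rough sketch; first_cuda_failure is a hypothetical helper name, and it assumes the ffmpeg errors really do end up in the output of docker logs, as they did above.

```shell
#!/bin/sh
# Hypothetical helper: prints the first line on stdin that contains the
# CUDA "no device" error, or nothing if no such line is present.
first_cuda_failure() {
    grep 'CUDA_ERROR_NO_DEVICE' | head -n 1
}

# Example call; --timestamps prefixes each log line with its time,
# and 2>&1 includes the container's stderr stream in the search:
#   docker logs --timestamps jellyfin 2>&1 | first_cuda_failure
```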

