Adding GPU to Docker on Rocky Linux platform

I want to make an NVIDIA GeForce GTX 1050 Ti graphics card available to Docker containers. Following the guides below, I installed the graphics driver and CUDA, as well as the NVIDIA Container Toolkit for Docker, on Rocky Linux.

https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

  • The relevant drivers are installed, the nouveau module is not loaded, and the nvidia module is loaded.
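
The module state described above can be checked with something like the following sketch. It assumes the standard /proc/modules listing on the host; the module_loaded helper and its optional second argument (an alternative listing file) are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Succeeds (exit 0) if the named kernel module appears in the modules listing.
# $1: module name
# $2: optional path to a listing in /proc/modules format (hypothetical, for testing)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}" 2>/dev/null
}

# Report the state of the two modules mentioned in the question.
for mod in nouveau nvidia; do
    if module_loaded "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: not loaded"
    fi
done
```

On a host set up as described, this would be expected to report nouveau as not loaded and nvidia as loaded.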

However, when a Docker container is running and I run nvidia-smi on the Rocky host, it reports that no running processes were found:

Sun Feb  4 15:50:21 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050 Ti     On  | 00000000:04:00.0 Off |                  N/A |
|  0%   42C    P8              N/A /  75W |      1MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
  1. I started the container several times, trying each of the following parameters separately, but it made no difference:

    --pid=host --privileged --gpus 'all,capabilities=utility' --runtime=nvidia

  2. The output of the strace nvidia-smi command is as follows: https://forums.developer.nvidia.com/uploads/short-url/2MTA1aRNltMyhrtvc7ERDjUs3Wc.txt

My questions:

  1. Why doesn’t nvidia-smi show any processes, and how can I tell whether the graphics card is actually available to the Docker container?

  2. Should nvidia-smi be run on the Rocky host, or inside the container, as follows?

    docker exec -it af868d81d6f4 nvidia-smi
