I have developed a Python application that uses the GPU for computation, and I've created a Docker image to package it. Locally, when I run the Docker image with NVIDIA GPU support using the command:
docker run --rm --runtime=nvidia --gpus all image-name:tag
the application correctly uses the GPU for computation. Likewise, when I run the script directly (python test.py) without Docker, it also runs on the GPU:
Ultralytics YOLOv8.1.28
Python-3.10.14 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4060 Laptop GPU, 7908MiB)
Model summary (fused): 168 layers, 3005843 parameters, 0 gradients, 8.1 GFLOPs
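For context, the device selection my test.py effectively performs boils down to a sketch like the one below (simplified; detect_device is just an illustrative name, the real script lets Ultralytics pick the device):

```python
import importlib.util

def detect_device() -> str:
    """Return 'cuda:0' if torch is installed and sees a CUDA device, else 'cpu'.

    This mirrors the device line Ultralytics prints at startup.
    """
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch
    return "cuda:0" if torch.cuda.is_available() else "cpu"

print(detect_device())
```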
To achieve the above result within Docker, I had to install nvidia-container-toolkit on my Ubuntu machine.
But the GPU is not used in two cases. 1: when I run the Docker image without the GPU flags above, using the simple command below:
docker run image-name:tag
2: when I deploy the Docker image to an AWS EC2 instance (specifically a g4dn.xlarge with an NVIDIA T4 GPU), where I actually need it.
The output it shows:
Ultralytics YOLOv8.1.28
Python-3.10.14 torch-2.2.1+cu121 CPU (13th Gen Intel Core(TM) i7-13700HX)
Model summary (fused): 168 layers, 3005843 parameters, 0 gradients, 8.1 GFLOPs
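To see what Docker actually hands the container in each case, I can run a quick check inside it (a sketch; with plain docker run and no --gpus flag, the list is expected to come back empty):

```python
import glob

def visible_gpu_devices() -> list:
    """NVIDIA device nodes (/dev/nvidia0, /dev/nvidiactl, ...) that Docker
    mapped into this container; empty when no GPU was passed through."""
    return sorted(glob.glob("/dev/nvidia*"))

print(visible_gpu_devices())
```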
Here's my Dockerfile:
FROM python:3.10-slim
# Install system dependencies required for ultralytics and OpenCV
RUN apt-get update && apt-get install -y \
libgl1-mesa-glx \
libglib2.0-0
WORKDIR /app
COPY requirements.txt test.py other_files ./
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "test.py"]
On the AWS EC2 instance, I've verified that the GPU is accessible. However, even after deploying the Docker image to the instance and running it, the application continues to run on the CPU.
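The host-side checks I did boil down to something like this sketch (gpu_host_status is just an illustrative name; it assumes docker and nvidia-smi are on PATH when they are installed):

```python
import shutil
import subprocess

def gpu_host_status() -> dict:
    """Host-side sanity checks: the NVIDIA driver and the nvidia Docker
    runtime both need to be present before --gpus all can work."""
    status = {
        "driver_installed": shutil.which("nvidia-smi") is not None,
        "nvidia_runtime": False,
    }
    docker = shutil.which("docker")
    if docker is not None:
        info = subprocess.run([docker, "info"], capture_output=True, text=True)
        status["nvidia_runtime"] = "nvidia" in info.stdout.lower()
    return status

print(gpu_host_status())
```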
The AWS EC2 instance I'm using is g4dn.xlarge, with this configuration:
4 vCPUs, 16 GB RAM, NVIDIA T4 Tensor Core GPU (16 GB)
Can someone please help me understand why the Docker image is not utilizing the GPU on the AWS EC2 instance and how I can resolve this issue? Thank you.
I have tried changing the Dockerfile to the following:
FROM python:3.10-slim
RUN apt-get update && apt-get install -y \
curl \
gnupg \
lsb-release \
wget \
libgl1-mesa-glx \
libglib2.0-0 \
software-properties-common
RUN curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
RUN echo "deb http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu focal main" > /etc/apt/sources.list.d/graphics-drivers.list \
&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys FCAE110B1118213C
RUN apt-get update
RUN apt-get install -y nvidia-container-toolkit
RUN nvidia-ctk runtime configure --runtime=docker
WORKDIR /app
COPY requirements.txt test.py other_files ./
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "test.py"]
But no luck so far.