PyOpenCL code hanging on a simple get() - how can I troubleshoot?

I'm running some code from the literature (https://github.com/michaelmell/HiPFSTA). The basic idea is that you load images of a wobbly sphere and, for each frame, the code iteratively finds the contours of the sphere. I've had to fix up some deprecated code, but it mostly seems to be working.

When running the code, the following process occurs:

  • I load the image using PIL
  • The image is passed to the GPU, which, starting with frame 1, runs a number of contour-finding iterations (the contour finding is done by a complicated kernel that does not appear to be the issue here)
  • It repeats this for each frame
  • It passes the info back to the CPU and saves it

The issue I'm running into is that it never completes this task. Each frame takes ~8-10 s to complete, but the program hangs after a few iterations: sometimes while analysing the first frame, sometimes only after it has made it as far as the tenth.

So far, I have added some print statements to figure out where it's stalling. It never stops while executing the kernel; it always stops at one particular line:

doubleVector = dev_doubleVector.get()

The program doesn't end, no error message pops up, it just prints the line beforehand and then... nothing.

This line is in a helpers.py file, which is, in brief, the following:

import pyopencl as cl
import pyopencl.array as cl_array
import numpy as np
import matplotlib.pyplot as plt

def ToSingleVectorsOnHost(doubleVector):
    # Split the interleaved (x, y) host array into two separate arrays.
    singleVectorX = np.copy(doubleVector['x'])
    singleVectorY = np.copy(doubleVector['y'])
    return singleVectorX, singleVectorY

def ToSingleVectorsOnDevice(oclQueue, dev_doubleVector):
    # Copy the device array to the host; this is the call that hangs.
    doubleVector = dev_doubleVector.get()
    singleVectorX, singleVectorY = ToSingleVectorsOnHost(doubleVector)
    # Upload the two components to the device as separate arrays.
    dev_singleVectorX = cl_array.to_device(oclQueue, singleVectorX)
    dev_singleVectorY = cl_array.to_device(oclQueue, singleVectorY)
    return dev_singleVectorX, dev_singleVectorY

My understanding is that this should be one of the simplest tasks the code performs: transferring a double vector from the GPU to the CPU. Leaving my computer for hours does not lead to it continuing, so it goes from ~10 s per frame to never finishing.
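For reference, the indexing doubleVector['x'] suggests that the host array is a NumPy structured array with one field per component. The sketch below shows what the host-side split does under that assumption. The dtype name double2 and the choice of two float64 fields are assumptions made for illustration; the actual dtype used by HiPFSTA may differ.

```python
import numpy as np

# Assumed layout: one record per point, with float64 fields 'x' and 'y'.
# The name double2 is hypothetical and only used in this sketch.
double2 = np.dtype([("x", np.float64), ("y", np.float64)])

doubleVector = np.zeros(4, dtype=double2)
doubleVector["x"] = [0.0, 1.0, 2.0, 3.0]
doubleVector["y"] = [10.0, 11.0, 12.0, 13.0]

# Selecting a field gives a strided view into the interleaved records.
field_view = doubleVector["x"]

# np.copy turns each view into an independent, densely packed array,
# which is what ToSingleVectorsOnHost returns.
singleVectorX = np.copy(doubleVector["x"])
singleVectorY = np.copy(doubleVector["y"])

print("view:", field_view.strides, field_view.flags["C_CONTIGUOUS"])
print("copy:", singleVectorX.strides, singleVectorX.flags["C_CONTIGUOUS"])
```

This part runs entirely on the CPU and involves no OpenCL calls, so under these assumptions it is unlikely to be where the program blocks.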

It's possible that this is simply a case of my GPU reaching its limit. I've added a barrier beforehand, barrierEvent = cl.enqueue_barrier(self.queue), and even a wait, so the queue should be clear.
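A barrier only orders commands inside the queue and does not by itself block the host, whereas queue.finish() blocks until every previously enqueued command has completed. A minimal sketch of one way to tell the two cases apart is to drain the queue explicitly and time that step separately from the copy. The helper name timed_finish_then_get is hypothetical; it only assumes that the queue has a finish() method and the device array has a get(queue=...) method, as pyopencl.CommandQueue and pyopencl.array.Array do.

```python
import time


def timed_finish_then_get(queue, dev_array):
    """Drain the queue, then copy to the host, timing each step separately.

    Hypothetical helper: assumes queue.finish() and dev_array.get(queue=...)
    behave like their PyOpenCL counterparts.
    """
    t0 = time.perf_counter()
    queue.finish()  # blocks until all previously enqueued commands are done
    t1 = time.perf_counter()
    # Pass the queue explicitly so the copy uses the queue that was drained.
    host_array = dev_array.get(queue=queue)
    t2 = time.perf_counter()
    print(f"finish(): {t1 - t0:.3f} s, get(): {t2 - t1:.3f} s", flush=True)
    return host_array, t1 - t0, t2 - t1
```

It could be called as doubleVector, _, _ = timed_finish_then_get(oclQueue, dev_doubleVector). If the program then stalls inside finish() rather than get(), that would point at an earlier command in the queue, for example a kernel, rather than at the transfer itself.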

From starting the code all the way to the hang, my GPU usage is fairly consistently at 100%, with occasional dips, which is what makes me think that might be the issue. However, I use an Nvidia card, so I can't see any good way of monitoring OpenCL with it. Additionally, the hang always seems to be at this particular get() line, which seems odd: it's not the only get(), nor is it the only time data is transferred between the CPU and the GPU. And while it's always that get(), there's no telling whether it will happen on the first frame or the tenth, or on the first iteration or the tenth.
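As a complement to print statements, Python's standard faulthandler module can dump the stack of every thread when a call has not returned within a timeout, and it does so from a watchdog thread, so it still fires while the main thread is blocked inside native code. The sketch below is one possible wrapper; run_with_watchdog is a hypothetical helper name (not part of PyOpenCL or HiPFSTA), and the 60-second default is an arbitrary choice.

```python
import faulthandler
import sys
import time


def run_with_watchdog(label, func, *args, timeout_s=60.0, log=sys.stderr):
    """Call func(*args) and return its result.

    If the call has not returned after timeout_s seconds, the Python stack
    of every thread is written to `log`; the process itself keeps running.
    `log` must be a real file object (it needs a file descriptor).
    """
    print(f"[{label}] start", file=log, flush=True)
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=log)
    started = time.perf_counter()
    try:
        return func(*args)
    finally:
        # Disarm the watchdog whether the call returned or raised.
        faulthandler.cancel_dump_traceback_later()
        elapsed = time.perf_counter() - started
        print(f"[{label}] done in {elapsed:.3f} s", file=log, flush=True)
```

For example, doubleVector = run_with_watchdog("get", dev_doubleVector.get) would leave a traceback in the log if the call stalls, which confirms the exact Python call site and the frame and iteration it was reached from.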

Any advice you might have would be much appreciated!

EDIT: For more info as requested:

  • GPU: NVIDIA Quadro P620
  • Driver: 31.0.15.3770, date 18/10/2023
  • OS: Windows 11 Pro for Workstations, version 23H2, build 22631.3296
  • CPU: Intel(R) Xeon(R) E-2124 CPU @ 3.30 GHz 3.31 GHz
  • Installed RAM: 16.0 GB
