CUDAfy.Net / OpenCL, struct containing byte array results in non-blittable exception


Ok, so I'm using CUDAfy.Net, and I have the following 3 structs:

[Cudafy]
public struct Collider
{
    public int Index;
    public int Type;
    public Sphere Sphere;
    public Plane Plane;
    public Material Material;
}

[Cudafy]
public struct Material
{
    public Color Color;
    public Texture Texture;
    public float Shininess;
}

[Cudafy]
public struct Texture
{
    public int Width, Height;
    public byte[ ] Data;
}

Now, as soon as I send over an array of Collider objects to the GPU, using

CopyToDevice<GPU.Collider>( ColliderArray );

I get the following error:

An unhandled exception of type 'System.ArgumentException' occurred in mscorlib.dll
Additional information: Object contains non-primitive or non-blittable data.

Does anyone with experience with either CUDAfy.Net or OpenCL (since it basically compiles down to OpenCL) have any idea how I could accomplish this? The whole problem lies in the byte array of Texture: everything worked just fine before I added the Texture struct, and as far as I know the array is the non-blittable part. I found several questions about the same problem, and they were fixed by using fixed-size arrays. However, I can't do that, because these are textures, which can vary greatly in size.
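
As a side note, the exception message can be reproduced without a GPU. The sketch below assumes (this is not confirmed by the CUDAfy documentation) that the copy pins the managed array before transferring it; pinning an array whose element type holds a managed reference such as byte[] fails with an ArgumentException. FlatTexture, ArrayTexture and CanPin are hypothetical names used only for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct with only primitive fields: blittable.
public struct FlatTexture
{
    public int Width, Height;
}

// Mirrors the Texture struct above: the byte[] field is a managed reference.
public struct ArrayTexture
{
    public int Width, Height;
    public byte[] Data;
}

public static class BlittableDemo
{
    // Returns true if the array can be pinned, i.e. its element type is blittable.
    public static bool CanPin<T>(T[] array)
    {
        try
        {
            GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
            handle.Free();
            return true;
        }
        catch (ArgumentException)
        {
            // Thrown for element types that contain non-blittable data.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanPin(new FlatTexture[4]));
        Console.WriteLine(CanPin(new ArrayTexture[4]));
    }
}
```

The first call should succeed and the second should fail, which matches the observation that the byte array inside Texture is the part that breaks the copy.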

EDIT: Right now, I'm doing the following on the CPU:

    public unsafe static GPU.Texture CreateGPUTexture( Cudafy.Host.GPGPU _GPU, System.Drawing.Bitmap Image )
    {
        GPU.Texture T = new GPU.Texture( );
        T.Width = Image.Width;
        T.Height = Image.Height;

        // Tightly packed RGB data: 3 bytes per pixel, row by row.
        byte[ ] Data = new byte[ Image.Width * Image.Height * 3 ];

        for ( int X = 0; X < Image.Width; X++ )
            for ( int Y = 0; Y < Image.Height; Y++ )
            {
                System.Drawing.Color C = Image.GetPixel( X, Y );
                int ID = ( X + Y * Image.Width ) * 3;
                Data[ ID ] = C.R;
                Data[ ID + 1 ] = C.G;
                Data[ ID + 2 ] = C.B;
            }

        // Copy the pixel data to the device and store its device pointer in the struct.
        byte[ ] _Data = _GPU.CopyToDevice<byte>( Data );
        IntPtr Pointer = _GPU.GetDeviceMemory( _Data ).Pointer;
        T.Data = ( byte* )Pointer.ToPointer( );

        return T;
    }

I then attach this Texture struct to the colliders, and send them to the GPU. This all goes without any errors. However, as soon as I try to USE a texture on the GPU, like this:

    [Cudafy]
    public static Color GetTextureColor( int X, int Y, Texture Tex )
    {
        int ID = ( X + Y * Tex.Width ) * 3;
        unsafe
        {
            byte R = Tex.Data[ ID ];
            byte G = Tex.Data[ ID + 1 ];
            byte B = Tex.Data[ ID + 2 ];

            return CreateColor( ( float )R / 255f, ( float )G / 255f, ( float )B / 255f );
        }
    }

I get the following error:

An unhandled exception of type 'Cloo.InvalidCommandQueueComputeException' occurred in Cudafy.NET.dll
Additional information: OpenCL error code detected: InvalidCommandQueue.

The Texture struct looks like this, by the way:

    [Cudafy]
    public unsafe struct Texture
    {
        public int Width, Height;
        public byte* Data;
    }

I'm completely at a loss again..


There is 1 answer below.


Cudafy does not support arrays as struct fields yet, so you can't put "public byte[] Data" inside a structure that is used by a kernel. You could make it less object-oriented: take the data array out of the structure and copy the two separately, e.g. copyToDevice("texture properties") and then copyToDevice("texture data") for the matching data array.

EDIT: OK, I found a solution, but the code is not pretty.

Once you have the pointer to your data in GPU memory, convert it to an integer with pointer.ToInt64() and store that value in your structure as a plain long (not as a long pointer). Then you can use the GThread.InsertCode() method to insert code directly into your kernel as-is, bypassing the C# translation. You cannot use pointers directly in your kernel code, which is why the address is passed as a long. Enough talk; here is an example of my working code:

using System;
using Cudafy;
using Cudafy.Host;
using Cudafy.Translator;

class Program
{
    [Cudafy]
    public struct TestStruct
    {
        public double value;
        public long dataPointer; // device address of your data, stored as a plain long
    }

    [Cudafy]
    public static void kernelTest(GThread thread, TestStruct[] structure, int[] intArray)
    {
        // Do something
        // Cast the stored address back to a pointer inside the generated kernel code.
        GThread.InsertCode("int* pointer = (int*)structure[0].dataPointer;");
        // Here you can access your data through the pointer: pointer[0], pointer[1] and so on.
        GThread.InsertCode("structure[0].value = pointer[1];");
    }

    private unsafe static void Main(string[] args)
    {
        GPGPU gpuCuda = CudafyHost.GetDevice(eGPUType.Cuda, 0);
        CudafyModule km = CudafyTranslator.Cudafy();
        gpuCuda.LoadModule(km);

        TestStruct[] host_array = new TestStruct[1];
        host_array[0] = new TestStruct();

        // Copy the data array to the device and get its device pointer.
        int[] host_intArray = new[] {1, 8, 3};
        int[] dev_intArray = gpuCuda.CopyToDevice(host_intArray);

        DevicePtrEx p = gpuCuda.GetDeviceMemory(dev_intArray);
        IntPtr pointer = p.Pointer;

        // Store the device address in the struct as a long.
        host_array[0].dataPointer = pointer.ToInt64();

        TestStruct[] dev_array = gpuCuda.Allocate(host_array);
        gpuCuda.CopyToDevice(host_array, dev_array);

        gpuCuda.Launch().kernelTest(dev_array, dev_intArray);

        gpuCuda.CopyFromDevice(dev_array, host_array);

        Console.WriteLine(host_array[0].value);

        Console.ReadKey();
    }
}

The "magic" is in InsertCode(), where you cast your long dataPointer value to an int pointer. The disadvantage of this approach is that you must write those parts of the code as strings.

Or you can separate your data from your structures, e.g.:

[Cudafy]
public struct Texture
{
    public int Width, Height;
}

[Cudafy]
public static void kernelTest(GThread thread, Texture[] TexStructure, byte[] Data)
{....}

And simply copy them separately:

dev_Data = gpu.CopyToDevice(host_Data);
dev_Texture = gpu.CopyToDevice(host_Texture);
gpu.Launch().kernelTest(dev_Texture, dev_Data);
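
Since the textures vary in size, the separate data array needs some way to tell where each texture starts. One possible way, sketched below in plain C# with no CUDAfy calls, is to concatenate all textures into a single byte array and keep an offset in each struct. TextureInfo, TexturePacker, Pack, ReadChannel and the Offset field are hypothetical names, not part of CUDAfy or of the code above, and the layout assumes 3 bytes (RGB) per pixel as in the question.

```csharp
using System;

// Hypothetical blittable descriptor: only primitive fields, no array reference.
public struct TextureInfo
{
    public int Width, Height;
    public int Offset;   // index of this texture's first byte in the shared buffer
}

public static class TexturePacker
{
    // Concatenates per-texture RGB data into one buffer
    // and records where each texture starts.
    public static byte[] Pack(byte[][] textures, int[] widths, int[] heights, out TextureInfo[] infos)
    {
        int total = 0;
        foreach (byte[] t in textures) total += t.Length;

        byte[] data = new byte[total];
        infos = new TextureInfo[textures.Length];

        int offset = 0;
        for (int i = 0; i < textures.Length; i++)
        {
            infos[i].Width = widths[i];
            infos[i].Height = heights[i];
            infos[i].Offset = offset;
            Array.Copy(textures[i], 0, data, offset, textures[i].Length);
            offset += textures[i].Length;
        }
        return data;
    }

    // Same indexing as GetTextureColor in the question, shifted by the texture's offset.
    public static byte ReadChannel(byte[] data, TextureInfo tex, int x, int y, int channel)
    {
        return data[tex.Offset + (x + y * tex.Width) * 3 + channel];
    }

    public static void Main()
    {
        // Two textures of different sizes: 1x1 and 2x1 pixels.
        byte[][] textures = { new byte[1 * 1 * 3], new byte[2 * 1 * 3] };
        textures[1][3] = 200;   // red channel of pixel (1, 0) in the second texture

        TextureInfo[] infos;
        byte[] data = Pack(textures, new[] { 1, 2 }, new[] { 1, 1 }, out infos);

        Console.WriteLine(ReadChannel(data, infos[1], 1, 0, 0));
    }
}
```

With this layout, the infos array and the data array would be the two things copied to the device with the CopyToDevice calls above, and a kernel would apply the same Offset-based indexing as ReadChannel.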

EDIT TWO: Forget about my code :D

Check this discussion: https://cudafy.codeplex.com/discussions/538310 and this one, which is the solution to your problem: https://cudafy.codeplex.com/discussions/283527