OpenCL sub buffer alignment

I have a nested loop structure in which I want to work with a sub buffer of an OpenCL buffer object. However, the offset of the sub buffer is not aligned to the CL_DEVICE_MEM_BASE_ADDR_ALIGN of my device (which is 128 bytes). I used a bitwise operation to align the offset as required by the hardware:

cl_mem i_fft_buff;
for (int u = 0; u < u_max; u++) {
    for (int v = 0; v < v_max; v++) {
        // byte offset of slice (u, v), assuming size_of_slice counts floats
        size_t offset = (size_t)(u * v_max + v) * size_of_slice * sizeof(float);
        size_t base_align = 128;  // bytes
        // round up to the next multiple of base_align
        size_t aligned_offset = (offset + base_align - 1) & ~(base_align - 1);
        cl_buffer_region subBufferRegion = {aligned_offset,
                                            size_of_slice * sizeof(float)};
        i_fft_buff = clCreateSubBuffer(data_buff_d, CL_MEM_READ_WRITE,
                                       CL_BUFFER_CREATE_TYPE_REGION,
                                       &subBufferRegion, &error);
        // ... continue to work with sub buffer
    }
}

But doing so, I'm missing out on elements, because aligned_offset is rounded up to the next multiple of 128 bytes and the sub buffer therefore starts past the beginning of the slice.
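To make the loss concrete, here is a small sketch of the rounding arithmetic on its own, without any OpenCL calls. The helper names align_up and skipped_bytes are made up for illustration, and the numbers assume a slice of 100 floats (400 bytes) and the 128-byte alignment mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align.
   align must be a non-zero power of two for the mask trick to work. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Bytes at the start of the slice that a rounded-up origin skips over. */
static size_t skipped_bytes(size_t offset, size_t align)
{
    return align_up(offset, align) - offset;
}
```

With these assumed numbers, the second slice starts at byte 400, align_up(400, 128) gives 512, and skipped_bytes(400, 128) is 112, i.e. the first 28 floats of that slice are not part of the sub buffer. Only slices whose byte offset is already a multiple of 128 are unaffected.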

I was wondering if there is a way of aligning the sub buffer to the required 128 bytes while still getting all size_of_slice elements. I would like to avoid creating a new cl_mem buffer in each iteration.
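One direction that might work, sketched here as an assumption rather than a confirmed solution: round the origin down instead of up, enlarge the region by the residual, and tell the kernel how many leading elements to skip. The type aligned_region and the function make_aligned_region are hypothetical names; the code only does the offset arithmetic and would feed cl_buffer_region.origin and cl_buffer_region.size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: values for a sub buffer region plus the
   number of leading elements the kernel has to skip. */
typedef struct {
    size_t origin;  /* aligned byte offset, for cl_buffer_region.origin */
    size_t size;    /* region size in bytes, for cl_buffer_region.size  */
    size_t skip;    /* elements between origin and the real slice start */
} aligned_region;

/* Round the slice's byte offset DOWN to a multiple of align and grow the
   region so it still covers the whole slice. align must be a power of two,
   and byte_offset must be a multiple of elem_size. */
static aligned_region make_aligned_region(size_t byte_offset, size_t byte_size,
                                          size_t align, size_t elem_size)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(elem_size != 0 && byte_offset % elem_size == 0);

    aligned_region r;
    r.origin = byte_offset & ~(align - 1);        /* round down */
    size_t residual = byte_offset - r.origin;     /* bytes before the slice */
    r.size = byte_size + residual;                /* still covers the slice */
    r.skip = residual / elem_size;                /* pass to kernel as an offset */
    return r;
}
```

For a 400-byte slice at byte offset 400 with 128-byte alignment, this yields origin 384, size 416 and skip 4, so the kernel would index from element 4 of the sub buffer. Two caveats: the enlarged regions of neighbouring slices overlap, and the OpenCL specification leaves concurrent writes to overlapping sub buffers undefined, so this only seems safe if the slices are processed one after another. If the layout of data_buff_d can be changed, padding each slice to a multiple of 128 bytes would make every slice start aligned and avoid the skip entirely. Also note that CL_DEVICE_MEM_BASE_ADDR_ALIGN is reported in bits, so it is worth double-checking that the device value really corresponds to 128 bytes.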