What is the difference between dynamic shared memory as a kernel attribute and as a kernel launch argument in CUDA?

58 Views Asked by msedi At 19 January 2024 at 15:18

We are using dynamic shared memory in our CUDA kernels. We set the shared memory size for each kernel through the driver API, using cuFuncSetAttribute with CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES.

The kernel is then launched with cuLaunchKernel, whose documentation lists a parameter unsigned int sharedMemBytes, described as:

Dynamic shared-memory size per thread block in bytes

This means I can set the dynamic shared memory size through the kernel attribute and, additionally, set a shared memory size on each kernel launch.

Does this mean I can override the kernel attribute? Which one wins?


einpoklum On 19 January 2024 at 15:23

Kernel attribute -> maximum value
Launch configuration field -> actual value

Says so right in the name: MAX_DYNAMIC_SHARED_SIZE_BYTES vs sharedMemBytes. Note the MAX prefix :-)
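To make the distinction concrete, here is a minimal sketch using the driver API. It is an illustration under assumptions, not code from the question: the kernel fill_shared, the 64 KiB ceiling, and the use of NVRTC to compile the kernel at run time are all made up for the example (the question's kernels presumably come from a prebuilt module). The attribute raises the per-kernel ceiling once; sharedMemBytes then picks the actual size for each launch. As far as I know, a launch that asks for more than the ceiling is rejected, so the sketch simply prints whatever result the driver returns for that case.

```cuda
// Assumed build line: g++ demo.cpp -lcuda -lnvrtc (or the nvcc equivalent)
#include <cuda.h>
#include <nvrtc.h>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical kernel, compiled at run time so the sketch is self-contained.
static const char* kSource = R"(
extern "C" __global__ void fill_shared()
{
    extern __shared__ float buf[];  // size comes from sharedMemBytes at launch
    buf[threadIdx.x] = 1.0f;
}
)";

static void check(CUresult r, const char* what)
{
    if (r != CUDA_SUCCESS) {
        const char* msg = nullptr;
        cuGetErrorString(r, &msg);
        std::fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown error");
        std::exit(1);
    }
}

int main()
{
    check(cuInit(0), "cuInit");
    CUdevice dev;
    check(cuDeviceGet(&dev, 0), "cuDeviceGet");
    CUcontext ctx;
    check(cuDevicePrimaryCtxRetain(&ctx, dev), "cuDevicePrimaryCtxRetain");
    check(cuCtxSetCurrent(ctx), "cuCtxSetCurrent");

    // Compile the kernel source to PTX with NVRTC.
    nvrtcProgram prog;
    if (nvrtcCreateProgram(&prog, kSource, "fill_shared.cu", 0, nullptr, nullptr) != NVRTC_SUCCESS ||
        nvrtcCompileProgram(prog, 0, nullptr) != NVRTC_SUCCESS) {
        std::fprintf(stderr, "NVRTC compilation failed\n");
        return 1;
    }
    size_t ptxSize = 0;
    nvrtcGetPTXSize(prog, &ptxSize);
    std::string ptx(ptxSize, '\0');
    nvrtcGetPTX(prog, &ptx[0]);
    nvrtcDestroyProgram(&prog);

    CUmodule mod;
    check(cuModuleLoadData(&mod, ptx.c_str()), "cuModuleLoadData");
    CUfunction fn;
    check(cuModuleGetFunction(&fn, mod, "fill_shared"), "cuModuleGetFunction");

    // The device's opt-in limit bounds what the attribute may be set to.
    int optinMax = 0;
    check(cuDeviceGetAttribute(&optinMax,
              CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK_OPTIN, dev),
          "cuDeviceGetAttribute");

    // Kernel attribute: the MAXIMUM dynamic shared memory this kernel may use.
    // 64 KiB is an arbitrary choice for the example, clamped to the device limit.
    const int ceiling = (64 * 1024 < optinMax) ? 64 * 1024 : optinMax;
    check(cuFuncSetAttribute(fn, CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES, ceiling),
          "cuFuncSetAttribute");

    // Launch argument: the ACTUAL size for this launch, anywhere up to the ceiling.
    const unsigned int actual = static_cast<unsigned int>(ceiling) / 2;
    check(cuLaunchKernel(fn, 1, 1, 1, 128, 1, 1, actual, nullptr, nullptr, nullptr),
          "cuLaunchKernel (within ceiling)");
    check(cuCtxSynchronize(), "cuCtxSynchronize");

    // Asking for more than the ceiling: expected to be rejected by the driver.
    CUresult r = cuLaunchKernel(fn, 1, 1, 1, 128, 1, 1,
                                static_cast<unsigned int>(ceiling) + 1024,
                                nullptr, nullptr, nullptr);
    std::printf("launch above ceiling returned CUresult %d\n", static_cast<int>(r));

    cuModuleUnload(mod);
    cuDevicePrimaryCtxRelease(dev);
    return 0;
}
```

So the attribute is something you would typically set once after loading the function, while sharedMemBytes can differ from launch to launch as long as it stays within that maximum.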

Setting a different maximum value may affect the GPU's behavior when running the kernel, e.g. how much regular L1 cache is available to the kernel. In some/most NVIDIA GPU micro-architectures, shared memory is repurposed L1 cache: their total amount is fixed, but the proportions aren't (see also §16.6.4 of the CUDA C++ Programming Guide).

Now, it's true that passing a specific amount of shared memory at launch could have implicitly done whatever setting the maximum does. Either that would carry some overhead, or it's just how NVIDIA has chosen to do things.
