Are uint2 operations faster than ulong in OpenCL on AMD GCN cards?

375 Views Asked by user1200759 At 21 August 2018 at 20:50

Which of these "+" operations is faster? 1) uint2 a, b, c; c = a + b; 2) ulong a, b, c; c = a + b;

There is 1 best solution below

Hugo Maxwell On 05 October 2018 at 13:24

AMD GCN has no native 64-bit integer support in its vector unit, so the second statement is translated into two 32-bit adds: a V_ADD_U32 followed by a V_ADDC_U32, which takes the carry flag from the first V_ADD_U32 into account.
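To make the add / add-with-carry pair concrete, here is a minimal sketch in plain C (not OpenCL C, and not actual compiler output). The helper name `add64_via_32` is hypothetical; it only mirrors the two-step sequence described above by computing the low word first and feeding its carry into the high word.

```c
#include <stdint.h>

/* Hypothetical helper: a 64-bit add built from two 32-bit adds,
   mirroring the add followed by add-with-carry described above. */
uint64_t add64_via_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint32_t lo    = a_lo + b_lo;         /* first add, wraps modulo 2^32 */
    uint32_t carry = lo < a_lo;           /* 1 if the low add overflowed  */
    uint32_t hi    = a_hi + b_hi + carry; /* second add consumes the carry */

    return ((uint64_t)hi << 32) | lo;
}
```

Note that the second add cannot start until the carry from the first is known, which is the dependency the rest of the answer refers to.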

So, to answer your question: both take the same number of instructions. However, the two adds in the first statement are independent of each other and can be computed in parallel (instruction-level parallelism), so it could be faster IF your kernel is occupancy bound (i.e. using lots of registers).

If your statements can be executed by the scalar unit (i.e. they do not depend on the thread index), then the game changes: the second one will be just one instruction (vs. two), since the scalar unit has native 64-bit integer support.

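However, keep in mind that your first statement is not equivalent to the second: a uint2 add works on each 32-bit component independently, so you would lose the carry from the low component into the high one.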
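The lost carry is easy to see in a small plain-C emulation. This is only a sketch: `uint2_t`, `uint2_add` and `uint2_as_u64` are hypothetical stand-ins for OpenCL's `uint2` type and its component-wise `+`, with `x` assumed to hold the low word.

```c
#include <stdint.h>

/* Hypothetical stand-in for OpenCL's uint2: two independent 32-bit lanes. */
typedef struct { uint32_t x, y; } uint2_t;

/* Component-wise add, as uint2 + uint2 behaves: no carry from x into y. */
uint2_t uint2_add(uint2_t a, uint2_t b)
{
    uint2_t c = { a.x + b.x, a.y + b.y };
    return c;
}

/* View the pair as one 64-bit value, with x as the low word (an assumption). */
uint64_t uint2_as_u64(uint2_t v)
{
    return ((uint64_t)v.y << 32) | v.x;
}
```

Adding `{0xFFFFFFFF, 0}` and `{1, 0}` this way gives `{0, 0}`, whereas the ulong add of the same bit patterns gives `0x100000000`. The two statements agree only when the low components do not overflow.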
