#include <cstddef>
float cast(size_t sz){return sz;}
Compiling the above code with Clang 13.0.1 and -mavx2 -O3 produces the following assembly:
cast(unsigned long): # @cast(unsigned long)
test rdi, rdi
js .LBB0_1
vcvtsi2ss xmm0, xmm0, rdi
ret
.LBB0_1:
mov rax, rdi
shr rax
and edi, 1
or rdi, rax
vcvtsi2ss xmm0, xmm0, rdi
vaddss xmm0, xmm0, xmm0
ret
GCC, MSVC, and the Intel compiler produce similar code, as do older versions of these compilers.
I understand that the general goal of the algorithm is to work around the fact that, before AVX-512, there is no instruction for converting a 64-bit unsigned integer to float or double.
So if the number is large enough to be interpreted as negative, the code halves it, converts it, and then doubles the result. However, what is the purpose of OR-ing in the least significant bit if it was set in the original integer?
It seems like a waste of time, since a float only has 24 significand bits (23 stored explicitly) and this bit is guaranteed not to end up among them. Perhaps if it were an increment instruction instead, it could affect the significant bits in some cases, but a plain or doesn't seem to do anything.
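For reference, here is a scalar C++ sketch of the sequence the compiler emits (the helper name cast_scalar is just for illustration):

#include <cstdint>

float cast_scalar(uint64_t sz) {
    if ((int64_t)sz >= 0)                     // test rdi, rdi / js
        return (float)(int64_t)sz;            // vcvtsi2ss: a signed convert suffices
    uint64_t halved = (sz >> 1) | (sz & 1);   // shr / and / or: halve, keep the low bit
    float f = (float)(int64_t)halved;         // vcvtsi2ss on a now-nonnegative value
    return f + f;                             // vaddss: double back up
}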
You are assuming that the bottom bit can never matter; this isn't true. There are a few corner-case values where the bottom bit does change the result of the conversion. Consider:
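(The exact program is in the Godbolt link at the end; below is a minimal sketch of the same idea, with the start value and stride chosen purely for illustration.)

#include <cstdint>
#include <cstdio>

int main() {
    // Walk over a range of large, odd, top-bit-set values and compare the
    // halve/convert/double result with and without the low bit OR'd back in.
    for (uint64_t n = 0; n < 100; ++n) {
        uint64_t v = 0x8000000000000001ull + (n << 37);
        float dropped = 2.0f * (float)(v >> 1);              // low bit discarded
        float kept    = 2.0f * (float)((v >> 1) | (v & 1));  // low bit kept
        if (dropped != kept)
            printf("%#llx: %a != %a\n", (unsigned long long)v, dropped, kept);
    }
}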
This iterates through a number of large values and prints out (using the handy %a hex-float format) the cases where the bottom bit makes a difference. It shows that there are a few values which need this adjustment.
Link: https://godbolt.org/z/jTeo1Ko4s
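The reason is rounding: the low bit acts as a sticky bit. When the halved value lands exactly halfway between two representable floats, round-to-nearest-even may round it down, and doubling then gives a result one ulp below the correctly rounded conversion of the original 64-bit value; OR-ing the discarded bit back in pushes such a value just past the tie so it rounds up. One concrete case (my own example, not taken from the linked program):

#include <cstdint>
#include <cstdio>

int main() {
    // 0x8000008000000001 = 2^63 + 2^39 + 1.  Halving gives 2^62 + 2^38, exactly
    // halfway between two representable floats; ties-to-even rounds it down, so
    // the doubled result is 2^63, one ulp below the correctly rounded 2^63 + 2^40.
    // Keeping the dropped bit as a sticky bit breaks the tie upward.
    uint64_t x = 0x8000008000000001ull;
    printf("%a\n", 2.0f * (float)(x >> 1));              // 2^63 (too low)
    printf("%a\n", 2.0f * (float)((x >> 1) | (x & 1)));  // 2^63 + 2^40 (correct)
}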