I have an array of two bytes whose bits are ordered like this:
position: 0  1  2  3  4  5  6  7    0  1  2  3  4  5  6  7
content:  a0 b0 a1 b1 a2 b2 a3 b3   a4 b4 a5 b5 a6 b6 a7 b7
I want to transform them into two new bytes like this: a0a1a2a3a4a5a6a7 b0b1b2b3b4b5b6b7
The intuitive way would be a for loop that handles one bit at a time. Is there a faster way to do this? Or could I use more specialized hardware, such as a GPU with OpenCL?
I receive 150 KB of this data every 20 ms (about 7.5 MB/s), and I would like to finish processing each batch within those 20 ms.
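For reference, here is a minimal sketch of the per-bit loop I have in mind as the baseline. It assumes that position 0 in the diagram is the most significant bit of each byte (if it is the least significant bit instead, the shifts would need to be mirrored). The function name `deinterleave_pair` is only illustrative.

```c
#include <stdint.h>

/* Baseline: split two interleaved bytes (a0 b0 a1 b1 ... a7 b7) into
 * one byte holding all a bits and one byte holding all b bits.
 * ASSUMPTION: position 0 in the diagram is the most significant bit. */
static void deinterleave_pair(const uint8_t in[2], uint8_t out[2])
{
    /* Join both bytes so that a0 sits at bit 15 and b7 at bit 0. */
    uint16_t word = (uint16_t)(((unsigned)in[0] << 8) | in[1]);
    uint8_t a = 0;
    uint8_t b = 0;

    for (int i = 0; i < 8; i++) {
        /* a_i is at bit (15 - 2*i), b_i is at bit (14 - 2*i). */
        a = (uint8_t)((a << 1) | ((word >> (15 - 2 * i)) & 1u));
        b = (uint8_t)((b << 1) | ((word >> (14 - 2 * i)) & 1u));
    }

    out[0] = a; /* a0 a1 a2 a3 a4 a5 a6 a7 */
    out[1] = b; /* b0 b1 b2 b3 b4 b5 b6 b7 */
}
```

This would be called once for every two-byte pair in the buffer, so 150 KB means roughly 75,000 calls every 20 ms.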