I am trying to write a library for AVX2 in Ada 2012 using the GNAT GCC compiler. I have currently defined a data type Vector_256_Integer_32 like so:
type Vector_256_Integer_32 is array (0 .. 7) of Integer_32;
for Vector_256_Integer_32'Alignment use 32;
pragma Pack(Vector_256_Integer_32);
Note that I have aligned the array on the 32-byte boundary required by Intel's documentation of the _mm256_load_si256 intrinsic function from immintrin.h.
I would like to implement an operation that adds two of these arrays together using AVX2. The function prototype is as follows.
function Vector_256_Integer_32_Add (Left, Right : Vector_256_Integer_32) return Vector_256_Integer_32;
My idea is to implement this function in three steps.
- Load Left and Right using _mm256_load_si256 into local variables.
- Perform the addition using _mm256_add_epi32.
- Convert the result back into the Vector_256_Integer_32 type using _mm256_store_si256.
What confuses me is how to create the __m256i data type in Ada to hold the intermediate results. Can someone please shed some light on this? I would also appreciate feedback on any issues you see with my approach.
I have found the definition of __m256i in GCC (located at gcc/gcc/config/i386/avxintrin.h).
typedef long long __m256i __attribute__ ((__vector_size__ (32), __may_alias__));
However, this is where I am stuck: I am not sure how to translate this into Ada code.
I have found that the __vector_size__ attribute is documented in the GCC documentation.
I figured out the answer to my question after doing more research. Thank you for your input. I am posting this in the hope that it helps someone else.
Edit: I have adjusted my answer according to feedback from the commenter Peter Cordes.
For example, if you want to define a data type of eight 32-bit signed integers, you would write
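A minimal sketch of such a declaration follows. It assumes GNAT's Machine_Attribute pragma with the "vector_type" and "may_alias" attributes, which is the mechanism GNAT's own GNAT.SSE.Vector_Types package uses for its 128-bit types. The package name AVX2 is only borrowed from the avx2.ads file name, and the exact pragmas in the original answer may have differed.

```ada
with Interfaces; use Interfaces;

package AVX2 is

   --  Eight 32-bit signed integers, 256 bits in total.
   type Vector_256_Integer_32 is array (0 .. 7) of Integer_32;

   --  32-byte alignment, mirroring __vector_size__ (32) on __m256i.
   for Vector_256_Integer_32'Alignment use 32;

   --  Assumption: ask the GCC back end to treat the array as a native
   --  vector type, as GNAT.SSE.Vector_Types does for m128.
   pragma Machine_Attribute (Vector_256_Integer_32, "vector_type");

   --  Mirror the __may_alias__ attribute of the C typedef.
   pragma Machine_Attribute (Vector_256_Integer_32, "may_alias");

end AVX2;
```

The array remains an ordinary Ada array for indexing purposes, so individual lanes can still be read and written element by element.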
The function to add the two vectors together would be defined as
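As a hedged sketch, the function can be bound directly to the GCC builtin with an Intrinsic import, following the pattern shown in the comments of GNAT's g-sse.ads for __builtin_ia32_addps. The type declaration is repeated so the unit compiles on its own; the package name AVX2 is an assumption, as is the need to compile with AVX2 enabled (for example with -mavx2).

```ada
with Interfaces; use Interfaces;

package AVX2 is

   --  Vector type as assumed for this sketch (see the lead-in).
   type Vector_256_Integer_32 is array (0 .. 7) of Integer_32;
   for Vector_256_Integer_32'Alignment use 32;
   pragma Machine_Attribute (Vector_256_Integer_32, "vector_type");
   pragma Machine_Attribute (Vector_256_Integer_32, "may_alias");

   --  Lane-wise addition of two vectors of eight 32-bit integers.
   function Vector_256_Integer_32_Add
     (Left, Right : Vector_256_Integer_32) return Vector_256_Integer_32;

   --  No Ada body is written: calls are bound to the GCC builtin.
   pragma Import
     (Intrinsic, Vector_256_Integer_32_Add, "__builtin_ia32_paddd256");

end AVX2;
```

Because the import convention is Intrinsic, no explicit load or store step is written in Ada; the compiler is left to move the operands into and out of vector registers.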
Note that I am using the GCC builtin directly, rather than the intrinsics from immintrin.h, because I do not know how to import an intrinsic from that header file.
The documentation of _mm256_add_epi32 states that the vpaddd instruction is used. The GCC __builtin_ia32_paddd256 appears to translate to this instruction.
Below is an example Ada program and ads file.
avx2.ads
main.adb
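The two files named above might look like the following sketch, shown as one listing so that it is self-contained. The test values and the printed format are illustrative assumptions, not the original program, and the Machine_Attribute pragmas are assumed as in GNAT.SSE.Vector_Types.

```ada
--  avx2.ads
with Interfaces; use Interfaces;

package AVX2 is

   type Vector_256_Integer_32 is array (0 .. 7) of Integer_32;
   for Vector_256_Integer_32'Alignment use 32;
   pragma Machine_Attribute (Vector_256_Integer_32, "vector_type");
   pragma Machine_Attribute (Vector_256_Integer_32, "may_alias");

   function Vector_256_Integer_32_Add
     (Left, Right : Vector_256_Integer_32) return Vector_256_Integer_32;
   pragma Import
     (Intrinsic, Vector_256_Integer_32_Add, "__builtin_ia32_paddd256");

end AVX2;

--  main.adb
with Ada.Text_IO;
with Interfaces; use Interfaces;
with AVX2;       use AVX2;

procedure Main is
   A, B, Sum : Vector_256_Integer_32;
begin
   --  Fill the inputs lane by lane with illustrative values.
   for I in Vector_256_Integer_32'Range loop
      A (I) := Integer_32 (I);
      B (I) := 10 * Integer_32 (I);
   end loop;

   --  One call adds all eight lanes.
   Sum := Vector_256_Integer_32_Add (A, B);

   --  Print each lane of the result.
   for I in Sum'Range loop
      Ada.Text_IO.Put_Line
        ("Sum (" & Integer'Image (I) & ") =" & Integer_32'Image (Sum (I)));
   end loop;
end Main;
```

If the listing is saved as a single file, gnatchop can split it into avx2.ads and main.adb. Building with something like gnatmake main.adb -cargs -mavx2 is assumed to be necessary so that the builtin is available to the compiler.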
Here is an equivalent program in C. Note that this code has only been tested in GCC and is not necessarily the most efficient.