As the following code shows, using uint32_t for the index prevents the compiler (GCC 12.1 with -O3) from auto-vectorizing the loop. Why? See godbolt.
#include <cstdint>

// no auto vectorization
void test32(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}

// auto vectorization
void test64(uint32_t *array, uint64_t &nread, uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}

// no auto vectorization
void test_another_32(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    uint32_t index = nread;
    for (uint32_t i = from; i < to; i++) {
        array[index++] = i;
    }
    nread = index;
}

// auto vectorization
void test_another_64(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    uint64_t index = nread;
    for (uint32_t i = from; i < to; i++) {
        array[index++] = i;
    }
    nread = index;
}
After running g++ -O3 -fopt-info-vec-missed -c test.cc -o /dev/null, I got the following result. How should I interpret it?
bash> g++ -O3 -fopt-info-vec-missed -c test.cc -o /dev/null
test.cc:5:31: missed: couldn't vectorize loop
test.cc:6:24: missed: not vectorized: not suitable for scatter store *_5 = i_18;
test.cc:21:31: missed: couldn't vectorize loop
test.cc:22:24: missed: not vectorized: not suitable for scatter store *_4 = i_22;
Look at the function test32 and consider how it has to behave if you call it with nread referring to an element of array itself.
This is called aliasing. The nread parameter might alias elements of array because they have the same type. But in test64, where nread is a uint64_t &, no aliasing can occur, because the compiler may assume that a uint32_t and a uint64_t never share the same address.
Note: passing a reference to a function internally passes the address, so for the purposes of aliasing it is equivalent to a pointer.
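To make the aliasing case concrete, here is a minimal sketch of a legal call in which nread lives inside the array. test32 is copied from the question; aliased_call and the values 10 and 13 are made up for illustration only. The range is chosen so that the loop stops before it would store into the element that nread refers to.

```cpp
#include <array>
#include <cstdint>

// test32 exactly as in the question.
void test32(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}

// Hypothetical helper (not from the question): the counter passed as nread
// is the last element of the very array that test32 writes to.
std::array<uint32_t, 4> aliased_call() {
    std::array<uint32_t, 4> arr{};        // starts as {0, 0, 0, 0}
    test32(arr.data(), arr[3], 10, 13);   // nread refers to arr[3]
    return arr;                           // arr[0..2] hold 10, 11, 12; arr[3] holds the count 3
}
```

Because a call like this is allowed, the compiler has to assume that every store through array might change nread, and that every update of nread might change an element of array.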
There are some types with special rules, called aliasing types. The C++ standard says that you can cast a uint32_t* to char* and then access the raw memory underlying the uint32_t. That means a uint32_t* and a char* can legally point at the same address. char* is an aliasing type because it aliases with any other type of (data) pointer. So is unsigned char* or any other variation of char, including std::byte.
But you can tell the compiler that two pointers are not allowed to alias, even if the type would permit it, by using restrict. Note that restrict is a C keyword; in C++ it is only available as a compiler extension, spelled __restrict in GCC and Clang.
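A small sketch of both points, assuming GCC or Clang. byte_sum and test32_restrict are names made up for this illustration. The first function reads a uint32_t through unsigned char*, which the aliasing rules permit. The second shows where the __restrict extension goes on a pointer and on a reference. It only illustrates the syntax: I have not verified that it is enough to make GCC vectorize this particular loop.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: inspects the raw bytes of a uint32_t through
// unsigned char*, which is allowed to alias any object type.
uint32_t byte_sum(const uint32_t &value) {
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(&value);
    uint32_t sum = 0;
    for (std::size_t k = 0; k < sizeof(value); k++) {
        sum += bytes[k];
    }
    return sum;
}

// Hypothetical variant of test32: __restrict (a GCC/Clang extension) promises
// that array and nread never refer to the same memory. Calling it with an
// aliased nread would break that promise and is undefined behavior.
void test32_restrict(uint32_t *__restrict array, uint32_t &__restrict nread,
                     uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}
```

With the promise in place, the compiler no longer has to reload nread after each store to array. Whether it then vectorizes is a separate question.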
PS: test_another_32 looks like a missed compiler optimization.