I'm running Atmel Studio with its bundled gcc 6.3.1 to build firmware for an Atmel/Microchip SAMV70 (ARM Cortex-M7) chip. I have code that compares a 4-byte input array to a 4-byte local array using memcmp(). When compiled with -O0 to disable optimizations, it works fine. When compiled with -Os to optimize for size or with -O3 for maximum optimization, the compiler replaces the memcmp() call with a direct 4-byte comparison (verified by examining the disassembly). Unfortunately, the optimization also sometimes places the local 4-byte array at an unaligned starting address, so while memcmp() would work fine, the direct comparison triggers a HardFault due to the unaligned access.
In my opinion this is 100% a compiler optimization bug (possibly gcc, possibly something Atmel added), but I'm stuck with the provided compiler so updating isn't an option. So here's my actual question: Is there a way to keep optimizations enabled but disable this particular optimization? Otherwise I'm stuck forcing the local 4-byte arrays to be 4-byte aligned or finding some other workaround.
Compiler version:
gcc version 6.3.1 20170620 (release) [ARM/embedded-6-branch revision 249437] (Atmel build: 508)
Here's an example function that could trigger the fault:
    bool example(uint8_t *input_data)
    {
        uint8_t local_data[4] = { 0x00, 0x01, 0x02, 0x03 };
        return (memcmp(input_data, local_data, 4) == 0);
    }
My code is always passing in a 4-byte-aligned input_data so that's not an issue, but once again it's bad form for the compiler optimizations to take that for granted.
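For reference, the "force the local array to be 4-byte aligned" workaround I mentioned would look roughly like the sketch below. It assumes GCC's `aligned` attribute, and the name `example_aligned` is just a placeholder for illustration. It only fixes the alignment of the local array; the caller still has to pass an aligned `input_data`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of example(): same comparison, but the local
 * array is explicitly 4-byte aligned, so a word-sized load of it is
 * always an aligned access. */
bool example_aligned(const uint8_t *input_data)
{
    /* GCC-specific attribute; an assumption that the toolchain accepts it. */
    uint8_t local_data[4] __attribute__((aligned(4))) = { 0x00, 0x01, 0x02, 0x03 };

    return (memcmp(input_data, local_data, 4) == 0);
}
```

This has to be repeated for every such local array, which is why I'd prefer a compiler-level switch.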
Answering my own question since Eugene didn't post an official answer:
From the gcc ARM options, the relevant switch is -munaligned-access (and its negation, -mno-unaligned-access). Unaligned access is allowed by default for ARMv7-M. It turns out this makes sense, because according to the ARMv7-M Architecture Reference Manual, ARMv7-M supports a limited set of unaligned accesses. However, it does not support all unaligned accesses. In particular, memory that is declared Strongly Ordered does not support unaligned access.
So here's the failure condition that prompted me to ask the initial question:
1. The compiler optimization replaced the memcmp() with a direct comparison, which is allowed by default, because unaligned accesses are allowed by default. So that is not a compiler bug.
2. The data being passed to memcmp() was located in an MPU segment that was declared Strongly Ordered, which does not support unaligned access. Thus, when the memcmp() was replaced with a direct compare and the data fell on an unaligned address, the compare triggered a HardFault.

The fix, which Eugene got correct in his initial comment, is to add -mno-unaligned-access to the compiler options. In my case this still allowed the compiler to replace the memcmp() with a direct 4-byte comparison, but it also forced the data to be 4-byte aligned, allowing the compare to succeed without triggering a fault condition.
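If you want a quick sanity check that the flag actually reached the compiler, something like the sketch below may help. It relies on the ACLE macro `__ARM_FEATURE_UNALIGNED`, which, as far as I know, GCC's ARM back end defines only when unaligned access is enabled; I have not verified that on the Atmel build of 6.3.1, so treat it as an assumption. The function name and the file name in the build line are placeholders.

```c
#include <stdbool.h>

/*
 * Hypothetical build line (example.c is a placeholder file name):
 *   arm-none-eabi-gcc -mcpu=cortex-m7 -mthumb -Os -mno-unaligned-access -c example.c
 */

/* Reports whether the compiler was built/invoked assuming that
 * unaligned accesses are permitted. Expected to return false when
 * -mno-unaligned-access is in effect (assumption, see above). */
bool compiler_assumes_unaligned_ok(void)
{
#if defined(__ARM_FEATURE_UNALIGNED)
    return true;
#else
    return false;
#endif
}
```

Checking the disassembly, as I did originally, remains the more direct way to confirm what the compiler generated.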