I am attempting to debug a very low level data fault on a TI AM3358 MCU. It is coming from floating point math.
The system uses TI RTOS, the GNU 7.3.1 compiler, and VFPv3 (is VFP a compiler setting? An FP math library? I'm not clear on how the floating point code is generated). So although I have disassembly listing fragments, the fix needs to be at the C code level.
This is a two part question:
First, do I understand the mnemonics correctly? And why are some not listed?
I noticed the disassembly has opcodes for which no mnemonics are shown. Here is a listing fragment; no need to get into details yet. Just notice that mnemonics are missing, and I don't think those words are immediate data (comments added by me as I reverse engineered the compiled code):
8003ced0: EEF1FA10 vmrs apsr_nzcv, fpscr ; Pull STAT reg to ARM MCU
8003ced4: DA000041 ble #0x8003cfe0 ; branch less-equal to 0x...3cfe0
8003ced8: EEFD7BE0 .word 0xeefd7be0 ; ??? What is this
8003cedc: EDC47A0A vstr s15, [r4, #0x28] ; Store S15 <- r4+28 = st->f2.z
8003cee0: E584702C str r7, [r4, #0x2c] ; Store r7 <- r4+2c = st->f2.a
8003cee4: E3A03000 mov r3, #0
8003cee8: E5843030 str r3, [r4, #0x30]
8003ceec: EE07CA90 vmov s15, r12 ; ( I decode this below)
8003cef0: EEF80BE7 .word 0xeef80be7 ; ???
8003cef4: EE702BA2 .word 0xee702ba2 ; ???
8003cef8: EEFD7BE2 .word 0xeefd7be2 ; ???
8003cefc: EDC47A0D vstr s15, [r4, #0x34]
8003cf00: E5845018 str r5, [r4, #0x18]
8003cf04: EE701BA1 .word 0xee701ba1
8003cf08: EEFD7BE1 .word 0xeefd7be1
To be sure I could understand VFPv3 mnemonics, I decoded address 8003ceec as follows:
8003ceec: EE07CA90 vmov s15, r12
VMOV (between ARM core register and single-precision register)
1110 unconditional
1110
0000 op = 0: so this is TO the VFP
0111 Vn = 7 (but still need one more bit from nibble 1)
1100 Rt = 12
1010
1001 N = 1 (so n = 01111 =S15)
0000
The encoding came from https://developer.arm.com/documentation/ddi0406/c/Application-Level-Architecture/Instruction-Details/Alphabetical-list-of-instructions/VMOV--between-ARM-core-register-and-single-precision-register-?lang=en (I'm pretty sure I got this correct; if not, any correction is welcome).
So, what are opcodes 0xeef80be7, 0xee702ba2, etc.? I am unable to decipher them from the ARM books or sites. Following the VFP/NEON pattern, this is some kind of 'unconditional move', but beyond that I can't match the bit pattern to anything (and the web site is extremely unfriendly to this kind of search; I resorted to downloading a PDF and doing a bit search).
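The hand decoding above can be cross-checked with a few lines of C. This is only a sketch: the field positions follow the A1 encoding on the ARM page linked above, and the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of VMOV (between ARM core register and single-precision register). */
struct vmov_fields {
    unsigned cond; /* bits 31:28, 0xE = unconditional */
    unsigned op;   /* bit 20, 0 = ARM register to VFP, 1 = VFP to ARM register */
    unsigned rt;   /* bits 15:12, ARM core register number */
    unsigned sn;   /* Vn:N, single-precision register number */
};

/* Returns 1 and fills *out if the word has the fixed bits of this
 * encoding (27:21 = 1110000, 11:8 = 1010, bit 4 = 1), otherwise 0. */
static int decode_vmov_core_single(uint32_t word, struct vmov_fields *out)
{
    if ((word & 0x0FE00F10u) != 0x0E000A10u)
        return 0;

    unsigned vn = (word >> 16) & 0xFu; /* upper four bits of the S register */
    unsigned n  = (word >> 7) & 0x1u;  /* lowest bit of the S register */

    out->cond = (word >> 28) & 0xFu;
    out->op   = (word >> 20) & 0x1u;
    out->rt   = (word >> 12) & 0xFu;
    out->sn   = (vn << 1) | n;
    return 1;
}
```

Feeding it 0xEE07CA90 gives cond = 0xE, op = 0, rt = 12, sn = 15, which agrees with `vmov s15, r12`. A word such as 0xEEFD7BE0 is rejected, so it belongs to some other encoding.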
As for the second question, if there is an easy obvious answer, I'd appreciate being steered in the right direction.
The code is a compiled C function that takes a pointer to a structure, pulls members out of it, and does some floating point math. I determined the structure address is stored in R4.
An example prototype would be
int Function(int x, int y, struct *a);
And is called as (fictional example)
Function (5,5,&st[0]);
later on
Function (5,7,&st[1]);
There is a Data Abort crash which only occurs when accessing the second structure. Never when accessing the first. And only when the VFP/Neon is accessing it, not the regular ARM registers.
Getting into the mud of the code, R4 is the address of the structure passed in:
8003cfe0: EEFD7BE0 .word 0xeefd7be0 ; branch lands here
8003cfe4: EDC47A06 vstr s15, [r4, #0x18] ; CRASH Store S15 <- r4+24 = st->f1.x
8003cfe8: E584C01C str r12, [r4, #0x1c] ; r12 = st->f1.y
8003cfec: E3A03000 mov r3, #0
8003cff0: E5843020 str r3, [r4, #0x20]
I verified all the offsets of the members from the pointers, and everything is correct.
Repeating, the crash occurs at address 8003cfe4, but only when the R4 pointer is pointing to the st[1], never when pointing to st[0].
I know a "Data Abort" comes from attempting to access memory that the MMU has not been configured to permit. And yet, everything else can access all the members of st[1]. The abort happens only when the VFP code tries to access it.
In fact, the instructions at addresses 8003cedc, 8003cee0, and 8003cee8, which all execute before address 8003cfe4, can happily access members of that structure. This makes me believe it is not an MMU access issue.
Could it be the result of a cache miss? Or is there some other VFP issue trying to move between the VFP system and memory? Or is there an issue where the coprocessor isn't ready yet?
I was able to get around this crash by removing all the floating point math. But that really harms the functionality of the application. I'd much prefer that the floating point math work correctly.
Any ideas would be welcomed.
-Scotty
I don't have an answer for the unknown opcodes, but as for the second part: the VFP coprocessor must have data transferred into and out of it on proper boundaries, in this case 4 bytes.
While the offsets into the structure were correctly aligned, the base of the structure itself was not. Due to packing, it started at address 0x...2931, so the member 40 bytes in (+0x28) landed on an odd address.
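The arithmetic can be checked with a small C sketch. Only the low bits of the base address quoted above are used, since the upper digits are not given and do not affect 4-byte alignment; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Returns 1 if addr is a multiple of align (align must be a power of two). */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    return (addr & (align - 1u)) == 0;
}

/* Address of a member, given the structure base and the member offset. */
static uintptr_t member_address(uintptr_t base, uintptr_t offset)
{
    return base + offset;
}
```

With a base ending in 0x2931, the member at +0x28 ends up at 0x2959 and the one at +0x18 at 0x2949, and `is_aligned(..., 4)` returns 0 for both. With a base ending in 0x2930, the same offsets give 0x2958 and 0x2948, which are both 4-byte aligned.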
Simply adding
at the end of the structure declaration solved the problem.
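As a sketch of what this kind of fix can look like with GCC, assuming the addition is the `aligned` attribute (an assumption on my part for this illustration), here are two hypothetical structures. Neither the names nor the members come from the real code.

```c
#include <stddef.h>

/* Hypothetical packed structure: members sit at multiples of 4 bytes,
 * but the type's alignment is 1, so an instance may start at an odd address. */
struct sample_packed {
    int   id;
    float x;
    float y;
} __attribute__((packed));

/* Same members, still packed, but the structure as a whole is forced
 * onto a 4-byte boundary (assumed form of the fix). */
struct sample_packed_aligned {
    int   id;
    float x;
    float y;
} __attribute__((packed, aligned(4)));
```

With GCC, `_Alignof(struct sample_packed)` is 1 and `_Alignof(struct sample_packed_aligned)` is 4, while the size and member offsets are the same for both, so only the placement of the structure changes.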
*** Update ***
I attempted many ways to replicate this issue in a code fragment. In all cases, the compiler generated code that moved the 32-bit value from memory to a register before moving it into the NEON unit.
I was able to forcibly cause the data fault with an inline assembler statement that moved unaligned data directly from an odd-numbered address to the NEON unit.
(R0 contained the base address ending in x1)
Therefore this is likely an optimization bug in the GNU compiler.
I post this in the event someone else gets bitten by this issue.