I am sort of a newbie to assembly language, and I need help understanding how mnemonics are converted directly to bytes.
For example, I have a line saying
b 0x00002B78
which is located at the memory address 0x00002A44. How does this translate to EA00004B (the byte representation of the above assembly)? I am under the impression that the "EA00" signifies the "b" branching part of the assembly, but what about the "004B"? If anyone can give a general understanding of this and resources to find conversions and such, that would be appreciated. I tried googling this, but I am really not too sure what to google exactly. The stuff I have been googling has not been helpful.
All the information you're looking for is in the ARM Architecture Reference Manual. If you look up the "b" instruction, you'll see its encoding and how it works. Here's the specific instruction you care about: bits 31:28 hold the condition field, bits 27:25 are 101, bit 24 is the L (link) bit, and bits 23:0 hold a signed 24-bit immediate offset.
The "E" is the condition field, which you can look up in the manual's condition code table. For you, it's 1110, "execute always" (AL). Then the "A", which in binary is 1010, matches bits 27:24 (you have a branch instruction, not a branch & link instruction). Lastly, the rest of the instruction is the immediate offset field. It's a PC-relative offset, which is why it's encoded as 0x00004b rather than as the absolute target address.
Let's look at your specific example now. You have the instruction:
b 0x00002B78
located at address 0x00002a44. OK, great. So first off, we can stick in the opcode bits:
xxxx 101x xxxx xxxx xxxx xxxx xxxx xxxx
Now, the L bit is zero for our case:
xxxx 1010 xxxx xxxx xxxx xxxx xxxx xxxx
We want to execute this instruction unconditionally, so we add the AL condition code bits:
1110 1010 xxxx xxxx xxxx xxxx xxxx xxxx
And now all we have to do is calculate the offset. The PC will be 0x2a4c when this instruction is executed (the PC is always "current instruction + 8" in ARM), so our relative jump needs to be:
0x2b78 - 0x2a4c = 0x12c
Great. Now we apply the reverse of the transformation described in the manual (the processor shifts the encoded offset left by two bits before adding it to the PC), right-shifting 0x12c by two:
0x12c >> 2 = 0x4b
And that's the last field:
1110 1010 0000 0000 0000 0000 0100 1011
Turning that binary instruction back into hex gives you the instruction encoding you were looking for:
0xea00004b
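If it helps to see the whole calculation in one place, here is a minimal Python sketch of the same steps. The function names encode_arm_branch and decode_arm_branch_target are made up for illustration and are not part of any toolchain. The sketch assumes ARM (32-bit, non-Thumb) state and a target that is a multiple of 4 bytes away from the PC.

```python
# Hypothetical helpers for illustration only; not from any assembler or library.

def encode_arm_branch(instr_addr, target_addr, link=False, cond=0xE):
    """Build the 32-bit word for an ARM-state B (or BL) instruction."""
    pc = instr_addr + 8            # in ARM state the PC reads as current instruction + 8
    offset = target_addr - pc      # byte distance from that PC value to the target
    if offset % 4 != 0:
        raise ValueError("offset must be a multiple of 4 bytes")
    imm24 = offset >> 2            # drop the two low bits, which are always zero
    if not -(1 << 23) <= imm24 < (1 << 23):
        raise ValueError("target is out of range for a 24-bit offset")
    imm24 &= 0xFFFFFF              # keep 24 bits (two's complement for backward jumps)
    # cond in bits 31:28, 101 in bits 27:25, L in bit 24, offset in bits 23:0
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | imm24


def decode_arm_branch_target(instr_addr, word):
    """Recover the branch target from an encoded B/BL word."""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:           # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return (instr_addr + 8 + (imm24 << 2)) & 0xFFFFFFFF


word = encode_arm_branch(0x2A44, 0x2B78)
print(f"{word:08X}")                                 # EA00004B
print(hex(decode_arm_branch_target(0x2A44, word)))   # 0x2b78
```

Passing link=True sets bit 24 and gives the BL form (EB00004B for the same addresses), and a different cond value changes only the top hex digit.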