I'm looking at the following disassembled AArch64 instruction:
65 6E 20 2B adds w5, w19, w0, uxtx #3
According to the ARM manual, uxtx zero-extends w0 to an unsigned 64-bit value before adding it to the value in w19. But w19 is a 32-bit "slice" of x19, and the result is stored in a 32-bit slice of x5. That is, the sizes of the operation's values differ.
The question is not restricted to adds; other AArch64 instructions like add or sub exhibit the same encoding. The question also applies to the 64-bit sxtx signed extension, which due to sign extension issues might very well be expected to not behave the same as the 32-bit sxtw.
Are uxtx and sxtx acting exactly like uxtw and sxtw respectively when used with 32-bit register slices? If so, what value is ARM providing by supporting both [us]xtw and [us]xtx extension encodings for these apparently identical operations? If not, is there a difference that would be visible to the user program?
They all do the same thing, i.e. nothing.
As you say, logically, sign- or zero-extending a value to a width larger than the operand size should not actually affect the value used, and that's correct. You can confirm it with a careful reading of the pseudocode in the Architecture Reference Manual. In the code for ExtendReg, note the line len = Min(len, N - shift). Here N is 32, so it makes no difference whether len is 32 or 64.

Similarly, uxtx and sxtx are both no-ops for either 32-bit or 64-bit instructions.

So the following instructions all have exactly the same architectural effect, performing the operation w0 = w1 + (w2 << 3). I actually tested them with a selection of chosen and random inputs, verifying that the results and flags are identical for all five.

    adds w0, w1, w2, uxtw #3    // 0x2b224c20
    adds w0, w1, w2, uxtx #3    // 0x2b226c20
    adds w0, w1, w2, sxtw #3    // 0x2b22cc20
    adds w0, w1, w2, sxtx #3    // 0x2b22ec20
    adds w0, w1, w2, lsl #3     // 0x2b020c20

However, note that their encodings are different.
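The ExtendReg argument can be checked with a small model. The following is only a sketch in Python, not the manual's pseudocode verbatim: the names extend_reg, add_with_flags and adds32 are made up for illustration, and the flag computation is a simplified rendering of a 32-bit add with carry-in 0.

```python
# Illustrative model of ExtendReg and a 32-bit ADDS.
# All function and table names here are hypothetical, not from the manual.

MASK32 = 0xFFFFFFFF

# extension name -> (len, unsigned)
EXTEND_TYPES = {
    "uxtb": (8, True),   "uxth": (16, True),
    "uxtw": (32, True),  "uxtx": (64, True),
    "sxtb": (8, False),  "sxth": (16, False),
    "sxtw": (32, False), "sxtx": (64, False),
}


def extend_reg(val, exttype, shift, n):
    """Extend the low bits of val per exttype, shift left, return n bits."""
    assert 0 <= shift <= 4
    length, unsigned = EXTEND_TYPES[exttype]
    length = min(length, n - shift)      # the key line: len = Min(len, N - shift)
    width = length + shift               # width of the intermediate value
    tmp = (val & ((1 << length) - 1)) << shift
    if not unsigned and (tmp >> (width - 1)) & 1:
        # Sign-extend from bit width-1 up to bit n-1 (no-op when width == n).
        tmp |= ((1 << n) - 1) & ~((1 << width) - 1)
    return tmp


def add_with_flags(a, b):
    """32-bit add of a and b; returns (result, (N, Z, C, V))."""
    total = a + b
    result = total & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(total > MASK32)
    v = ((a ^ result) & (b ^ result)) >> 31
    return result, (n, z, c, v)


def adds32(rn, rm, exttype, shift):
    """Model of: adds wd, wn, wm, <exttype> #shift."""
    return add_with_flags(rn & MASK32, extend_reg(rm & MASK32, exttype, shift, 32))


# Compare the four extended-register forms with the plain lsl #3 form.
for rn, rm in [(0, 0), (1, 0x80000000), (0xFFFFFFFF, 0x1FFFFFFF), (0x7FFFFFFF, 0xF0000001)]:
    outcomes = {adds32(rn, rm, ext, 3) for ext in ("uxtw", "uxtx", "sxtw", "sxtx")}
    outcomes.add(add_with_flags(rn, (rm << 3) & MASK32))
    print(hex(rn), hex(rm), "distinct outcomes:", len(outcomes))
```

With N = 32 and a shift of 3, len is clamped to 29 for all four of uxtw, uxtx, sxtw and sxtx, so the intermediate value is already 32 bits wide and there is nothing left to extend. The byte and halfword variants are a different story, since their len stays below N - shift.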
And that is also why they use different mnemonics for the extension operation: one of the principles of the ARM64 assembly language is that every legal binary encoding should have its own unambiguous assembly. So if for some obscure reason you care whether you get the encoding 0x2b224c20 or 0x2b226c20 -- say you are trying to write shellcode where certain bytes are forbidden -- you can specify uxtw or uxtx to select the one you want. This also means that if you disassemble and reassemble a section of code, you will get back the identical binary that you put in.

(Contrast the situation in x86 assembly language, where redundant encodings do not get distinct mnemonics. So add edx, ecx may assemble to either 01 ca (the "store form") or 03 d1 (the "load form"), and assemblers often don't give you any way to pick which one. Likewise both encodings will disassemble to add edx, ecx, so if you disassemble and reassemble you may not end up with the same binary you started with. See "How to resolve ambivalence in x64 assembly?" and its duplicate links.)

The mnemonics for the extension operators reflect the encoding structure, which also helps to explain why the redundant encodings exist in the first place. The extension type is encoded in a 3-bit "option" field, bits 13-15 of the instruction. Bits 13-14 specify the width of the value to be extended:
- 00 = 8-bit byte B
- 01 = 16-bit halfword H
- 10 = 32-bit word W
- 11 = 64-bit doubleword X

Note that X is always effectively "no extension". Then bit 15 specifies the signedness: 0 = unsigned U, 1 = signed S. So 010 = uxtw and 011 = uxtx since that is what they logically specify, even though for a 32-bit operation, both have the same actual effect (i.e. none).

This might seem like a waste of the instruction space, but presumably it allows the decoder hardware to be simpler than if the otherwise redundant encodings were to select some different operation.
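To make the field layout concrete, here is a rough Python sketch that packs and unpacks the option field for the 32-bit adds (extended register) form. The helper names are hypothetical, and the base constant 0x2B200000 is simply what remains of the encodings quoted above once the register, option and shift fields are zeroed.

```python
# Hypothetical helpers for the 32-bit ADDS (extended register) encoding.
# Field positions: Rm = bits 16-20, option = bits 13-15, shift = bits 10-12,
# Rn = bits 5-9, Rd = bits 0-4.

ADDS_EXT_W = 0x2B200000  # assumed base: all variable fields zero


def option_name(option):
    """Map the 3-bit option field to its mnemonic, e.g. 0b011 -> 'uxtx'."""
    sign = "us"[(option >> 2) & 1]   # bit 15: 0 = unsigned, 1 = signed
    width = "bhwx"[option & 3]       # bits 13-14: byte, halfword, word, doubleword
    return sign + "xt" + width


def encode_adds_ext_w(rd, rn, rm, ext, shift):
    """Encode: adds w<rd>, w<rn>, w<rm>, <ext> #<shift>."""
    option = ("us".index(ext[0]) << 2) | "bhwx".index(ext[3])
    return ADDS_EXT_W | rm << 16 | option << 13 | shift << 10 | rn << 5 | rd


def decode_adds_ext_w(word):
    """Return (rd, rn, rm, extension name, shift) for an encoded instruction."""
    assert word & 0xFFE00000 == ADDS_EXT_W, "not a 32-bit adds (extended register)"
    rd = word & 31
    rn = (word >> 5) & 31
    shift = (word >> 10) & 7
    option = (word >> 13) & 7
    rm = (word >> 16) & 31
    return rd, rn, rm, option_name(option), shift


# The instruction bytes from the question, stored little-endian.
word = int.from_bytes(bytes.fromhex("656E202B"), "little")
print(hex(word), decode_adds_ext_w(word))
print(hex(encode_adds_ext_w(0, 1, 2, "uxtw", 3)))
print(hex(encode_adds_ext_w(0, 1, 2, "uxtx", 3)))
```

The two encodings differ only in bit 13, which is exactly the difference between the W and X width codes.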
The last option listed above, adds w0, w1, w2, lsl #3, has a different encoding altogether because it selects the "Add (shifted register)" opcode, instead of the "Add (extended register)" opcode as the first four do. So this is another redundancy; an add without extension, with a left shift of 0-4 bits, can be done with either opcode. However, this is not entirely useless, because the "extended register" form can use the stack pointer register sp as an operand, while the "shifted register" form can use the zero register xzr/wzr. Both registers are encoded as "register 31", so each opcode has to specify whether it interprets "register 31" as the stack pointer or as the zero register. So the fact that the two opcodes have overlapping effect lets the instruction set provide addition using either the stack pointer or the zero register, where otherwise only one or the other could be supported.

The sxt/uxt syntax shows up in a couple of other places in the ARM64 assembly language, with slightly different details in each case.

- The sxt*/uxt* instructions, which simply sign- or zero-extend one register into another. They are aliases for special cases of the sbfm/ubfm bitfield move instructions. sxtb, sxth, uxtb, uxth work with either a 32- or 64-bit destination, and sxtw x0, w1 with a 64-bit destination only.

  The GNU assembler at least also supports uxtw w0, w1 and uxtw x0, w1, although the official Architecture Reference Manual does not document them. But they are both just aliases for mov w0, w1, since writes to 32-bit registers always zero the high half of the corresponding 64-bit register. (And a fun fact is that mov w0, w1 is itself an alias for orr w0, wzr, w1, a bitwise OR with the zero register.)

  There are no mnemonics for the trivial uxtx, sxtx which would just be a 64-bit move. I suppose logically uxtx x0, x1 could be an alias of ubfm x0, x1, #0, #63, encoded as 0xd340fc20, but they didn't bother to support it. The uxtx operator to adds is needed because otherwise there would be no way to assemble 0x2b226c20, but since 0xd340fc20 can already be obtained with ubfm it doesn't need another redundant name. (Actually it seems ubfm x0, x1, #0, #63 disassembles as lsr x0, x1, #0, since the immediate shift instructions are also aliases for bitfield move.) Likewise, the useless sxtw w0, w1 is also rejected by the assembler.

- The extended-register addressing modes for the load, store, and prefetch instructions. They normally take 64-bit base and index registers, as in ldr x0, [x1, x2], but the index can also be specified as a 32-bit register with either zero or sign extension: ldr x0, [x1, w2, uxtw] or ldr x0, [x1, w2, sxtw].

  Here there is again a redundant encoding that appears. These instructions contain a 3-bit "option" field with the same position and format as for add and friends, but here the byte and half-word versions are unsupported, so the encodings with bit 14 = 0 are undefined. Of the remaining four combinations, uxtw (010) and sxtw (110) make perfect sense. The other two use a 64-bit index with no extension, and so have the same effect as each other, but they need to be assigned distinct assembly syntax. The 011 encoding, which might logically be uxtx, is designated the "preferred" encoding and is written with no operator as ldr x0, [x1, x2], or ldr x0, [x1, x2, lsl #3] for the shifted-index version. The redundant 111 encoding is then selected with ldr x0, [x1, x2, sxtx] or ldr x0, [x1, x2, sxtx #3].

- The uxtl/sxtl Extend Long SIMD instructions, which zero- or sign-extend the elements of a vector to double their original width. These are actually aliases for the ushll/sshll long shift instructions, with a shift count of 0. But otherwise there is nothing unusual about their encodings.
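Two of the cases in this list can be illustrated with the same kind of bit packing. This is a hedged sketch only: the helper names are invented, and the base constants (0xF8600800 for a 64-bit ldr with register offset, 0xD3400000 and 0x93400000 for 64-bit ubfm and sbfm) are my own reading of the encoding tables rather than anything quoted above, apart from 0xd340fc20, which the sketch reproduces.

```python
# Hypothetical encoders; the base constants are assumptions (see lead-in).

LDR_X_REG = 0xF8600800   # 64-bit LDR (register offset), variable fields zero
UBFM_X = 0xD3400000      # 64-bit UBFM, variable fields zero
SBFM_X = 0x93400000      # 64-bit SBFM, variable fields zero

# Load/store option field -> operator written on the index register.
# "lsl" stands for the preferred form, written with no operator when unshifted.
INDEX_SYNTAX = {0b010: "uxtw", 0b110: "sxtw", 0b011: "lsl", 0b111: "sxtx"}


def encode_ldr_x_reg(rt, rn, rm, option, scaled):
    """Encode: ldr x<rt>, [x<rn>, <index>], with scaled=True meaning #3."""
    if option not in INDEX_SYNTAX:
        raise ValueError("option values with bit 14 = 0 are undefined here")
    return LDR_X_REG | rm << 16 | option << 13 | int(scaled) << 12 | rn << 5 | rt


def encode_bfm_x(signed, rd, rn, immr, imms):
    """Encode a 64-bit sbfm/ubfm x<rd>, x<rn>, #immr, #imms."""
    base = SBFM_X if signed else UBFM_X
    return base | immr << 16 | imms << 10 | rn << 5 | rd


preferred = encode_ldr_x_reg(0, 1, 2, 0b011, False)   # ldr x0, [x1, x2]
redundant = encode_ldr_x_reg(0, 1, 2, 0b111, False)   # ldr x0, [x1, x2, sxtx]
print(hex(preferred), hex(redundant))

print(hex(encode_bfm_x(False, 0, 1, 0, 63)))   # ubfm x0, x1, #0, #63
print(hex(encode_bfm_x(True, 0, 1, 0, 31)))    # sbfm x0, x1, #0, #31 (sxtw x0, w1)
```

The preferred and sxtx forms of the load differ only in bit 15, the signedness bit, which has nothing to act on when the index is already 64 bits wide.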