C inline assembly write syscall prints only 4 characters, but only when using "=a" as output register


As a uni task I wrote a very simple wrapper for the write syscall. It is for i386. The code is compiled with:

gcc -ffreestanding -fno-stack-protector -nostdlib -nostdinc -static -m32 -Wall -g -O2

I am wondering why the commented out code prints out only 4 characters, no matter what the num_bytes is.

int my_write(int fd, void *buf, unsigned num_bytes){
    int ret;
    asm volatile (
        "mov $4, %%eax;"
        "mov %1, %%ebx;"
        "mov %2, %%ecx;"
        "mov %3, %%edx;"
        "int $0x80;"
        : "=r"(ret)
        : "r"(fd), "r"(buf), "r"(num_bytes)
        : "eax", "ebx", "ecx", "edx", "memory"
    );
    /* Does not work
    asm volatile (
        "mov $4, %%eax;"
        "mov %1, %%ebx;"
        "mov %2, %%ecx;"
        "mov %3, %%edx;"
        "int $0x80;"
        "mov %%eax, %0;"
        : "=a"(ret)
        : "r"(fd), "r"(buf), "r"(num_bytes)
        : "ebx", "ecx", "edx", "memory"
    );
    */
    return ret;
}

I am aware I can use:

    asm volatile (
    "int $0x80;"
    : "=a"(ret)
    : "a"(4), "b"(fd), "c"(buf), "d"(num_bytes)
    : "memory"
    );

and that works just fine as well. However I'd like to know what is so different between the two approaches above. It should not matter if I use =a or =r since the calling convention of i386 states that the return value of syscall is in eax.

I tried writing a static buffer with a fixed num_bytes, but the issue remained. I also tried compiling with another optimization flag, -Og. Using any level other than -O2 causes the program to exit with status 1 immediately after launch, even the version that otherwise works. That could be caused by some other function in the code, as objdumps of the my_write section compiled with -O2 and -Og show no differences. However, I did not find a bug in any other function, so I'm hoping somebody has an idea. Thanks for any input. Edit: please excuse the fight with the formatting.

Answered by rodrigo

It looks to me that you take the clobber list of registers as a list of reserved registers, that you want to use for yourself, and think that the compiler will stay away from them.

But that's not the case. What the clobber list does is to inform the compiler that these registers may have changed value during the execution of this asm and that the values after that are to be considered undefined.

But by naming registers directly in your code while also using an r constraint, you risk that the compiler will choose, for that constraint, one of the very registers your code overwrites. That can happen with any register that is not in the clobber list, and it will most likely fail.

For debugging these kinds of issues you should disassemble the generated code and compare it with what you intend. For example, your commented code:

asm volatile (
    "mov $4, %%eax;"
    "mov %1, %%ebx;"
    "mov %2, %%ecx;"
    "mov %3, %%edx;"
    "int $0x80;"
    "mov %%eax, %0;"
    : "=a"(ret)
    : "r"(fd), "r"(buf), "r"(num_bytes)
    : "ebx", "ecx", "edx", "memory"
);

on my machine disassembles to:

mov    0x10(%esp),%eax  // load fd
mov    0x14(%esp),%esi  // load buf
mov    0x18(%esp),%edi  // load num_bytes
mov    $0x4,%eax   // ooops!
mov    %eax,%ebx
mov    %esi,%ecx
mov    %edi,%edx
int    $0x80
mov    %eax,%eax  // store ret

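As you can see, the register chosen for %1 is eax, but you overwrite it with $4 before it can be used. Here that corrupts fd; in your build the compiler presumably chose eax for %3 (num_bytes) instead, which would explain why exactly 4 bytes are written.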
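To make the clash easier to see, here is a toy model in plain C. It is not real inline assembly: it only simulates the load and mov sequence of the template with an array standing in for the registers. All names in it (run_template, struct write_args, the R_* enumerators) are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical register indices for the simulation only. */
enum reg { R_EAX, R_EBX, R_ECX, R_EDX, R_ESI, R_EDI, R_EBP, R_NREGS };

/* What the kernel would see in eax, ebx, ecx, edx at "int $0x80". */
struct write_args { uint32_t nr, fd, buf, count; };

/* op1..op3 are the registers the compiler picked for %1..%3. */
struct write_args run_template(enum reg op1, enum reg op2, enum reg op3,
                               uint32_t fd, uint32_t buf, uint32_t count)
{
    uint32_t r[R_NREGS] = {0};

    /* The compiler loads the inputs before the template runs. */
    r[op1] = fd;
    r[op2] = buf;
    r[op3] = count;

    r[R_EAX] = 4;        /* mov $4, %eax  (destroys any input kept in eax) */
    r[R_EBX] = r[op1];   /* mov %1, %ebx */
    r[R_ECX] = r[op2];   /* mov %2, %ecx */
    r[R_EDX] = r[op3];   /* mov %3, %edx */

    return (struct write_args){ r[R_EAX], r[R_EBX], r[R_ECX], r[R_EDX] };
}
```

In this model, run_template(R_EAX, R_ESI, R_EDI, 1, 0x1000, 13) yields fd == 4, matching the disassembly above, while run_template(R_ESI, R_EDI, R_EAX, 1, 0x1000, 13) yields count == 4, matching the "only 4 characters" symptom. If none of the operands lives in eax, all three values arrive intact.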

The non-commented code avoids this clash only because eax is in its clobber list, so the eax register is not chosen for any operand.

You have two options to solve this. The simplest is to use specific registers for the constraints, as you show at the end of your question. That way there are no clashes.
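As a rough sketch of the first option, the snippet from the end of the question can be wrapped into a complete function like this. The name my_write_fixed is made up here, and the #else branch is purely an assumption of this sketch so that it still builds on targets other than 32-bit x86 Linux; it is not part of the original task and needs a hosted libc, unlike the freestanding build in the question.

```c
#include <unistd.h>   /* write(): used only by the non-i386 fallback */

/* Hypothetical name; mirrors my_write from the question. */
int my_write_fixed(int fd, const void *buf, unsigned num_bytes)
{
#if defined(__i386__) && defined(__linux__)
    int ret;
    /* Every operand is pinned to the register the int $0x80 interface
       expects, so the template never names a register by hand and the
       compiler cannot pick a conflicting one. */
    __asm__ __volatile__ (
        "int $0x80"
        : "=a"(ret)          /* eax: return value of the syscall */
        : "a"(4),            /* eax: syscall number of write on i386 */
          "b"(fd),           /* ebx: file descriptor */
          "c"(buf),          /* ecx: buffer address */
          "d"(num_bytes)     /* edx: byte count */
        : "memory"           /* the kernel reads the buffer */
    );
    return ret;              /* negative errno value on failure */
#else
    /* Fallback (assumption of this sketch): defer to libc,
       which returns -1 on failure. */
    return (int)write(fd, buf, num_bytes);
#endif
}
```

A call such as my_write_fixed(1, "ok\n", 3) should return 3 on success. Note that the two branches report errors differently (a negative errno value versus -1), so callers of this sketch should only test for a negative result.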

The other solution is to be extra careful not to overwrite any register while it may still be needed. Something like this:

asm volatile (
    "push %1;"
    "push %2;"
    "push %3;"
    "pop %%edx;"
    "pop %%ecx;"
    "pop %%ebx;"
    "mov $4, %%eax;"
    "int $0x80;"
    : "=a"(ret)
    : "r"(fd), "r"(buf), "r"(num_bytes)
    : "ebx", "ecx", "edx", "memory"
);

This compiles to:

mov    0x10(%esp),%eax
mov    0x14(%esp),%esi
mov    0x18(%esp),%edi
push   %eax
push   %esi
push   %edi
pop    %edx
pop    %ecx
pop    %ebx
mov    $0x4,%eax
int    $0x80

That looks a bit silly with all this stack dance but it should work.

BTW, if you write "=a"(ret) as an output constraint, then the mov %%eax, %0 is redundant. That is why your code gets an extra mov %eax, %eax at the end.