Why is <puts> called?


I'm trying to see the disassembled binary of a simple C program in gdb.

C program :

#include <stdio.h>

int main(){
        int i = 2;
        if (i == 0){
                printf("YES, it's 0!\n");
        }else{
                printf("NO");
        }
        return 0;
}

The disassembled instructions :

   0x0000000100401080 <+0>:     push   rbp
   0x0000000100401081 <+1>:     mov    rbp,rsp
   0x0000000100401084 <+4>:     sub    rsp,0x30
   0x0000000100401088 <+8>:     call   0x1004010e0 <__main>
   0x000000010040108d <+13>:    mov    DWORD PTR [rbp-0x4],0x2
   0x0000000100401094 <+20>:    cmp    DWORD PTR [rbp-0x4],0x0
   0x0000000100401098 <+24>:    jne    0x1004010ab <main+43>
   0x000000010040109a <+26>:    lea    rax,[rip+0x1f5f]        # 0x100403000
   0x00000001004010a1 <+33>:    mov    rcx,rax
   0x00000001004010a4 <+36>:    call   0x100401100 <puts>
   0x00000001004010a9 <+41>:    jmp    0x1004010ba <main+58>
   0x00000001004010ab <+43>:    lea    rax,[rip+0x1f5b]        # 0x10040300d
   0x00000001004010b2 <+50>:    mov    rcx,rax
   0x00000001004010b5 <+53>:    call   0x1004010f0 <printf>
   0x00000001004010ba <+58>:    mov    eax,0x0
   0x00000001004010bf <+63>:    add    rsp,0x30
   0x00000001004010c3 <+67>:    pop    rbp
   0x00000001004010c4 <+68>:    ret
   0x00000001004010c5 <+69>:    nop

And I suppose the instruction,

0x00000001004010a4 <+36>:    call   0x100401100 <puts>

points to

printf("YES, it's 0!\n");

Assuming that is correct, my question is: why is <puts> called here, while <printf> is called at 0x00000001004010b5 <+53>: call 0x1004010f0 <printf>?


2
dbush On

It's an optimization.

As long as the return value is not used, calling printf with a format string that contains no format specifiers and ends in a newline has the same effect as calling puts with the same string minus the trailing newline.

Since printf has a lot of logic for handling format specifiers but puts just writes the string given, the latter will be faster. So in the case of the first call to printf the compiler sees this equivalence and makes the appropriate substitution.

9
chqrlie On

Using the semantics defined in the C Standard, printf("YES, it's 0!\n") produces the same output as puts("YES, it's 0!"), which may be more efficient as the string does not need to be analysed for replacements.

Since the return value is not used, the compiler can replace the printf call with the equivalent call to puts.

This type of optimisation was likely introduced as a way to reduce the executable size for the classic K&R program hello.c. Replacing the printf with puts avoids linking the printf code, which is substantially larger than that of puts. In your case, this optimisation is counterproductive as both puts and printf are linked, but modern systems use dynamic linking, so it is no longer meaningful to try to reduce executable size this way.

You can play with compiler settings on this Godbolt compiler explorer page to observe compiler behavior:

  • even with -O0, gcc performs the printf / puts substitution, but clang does not. Both compilers generate code for both calls and do not optimise away the test if (i == 0), which is expected with optimisations disabled. I suspect the gcc team could not resist biasing size benchmarks even with optimisations disabled.

  • with -O1 and beyond, both compilers only generate code for the else branch, calling printf.

  • if you change the second string to just "N", printf is converted to a call to putchar, yet another optimisation.