Why does the compiler refer to a global variable using [rip+0x0]?


I am new to compilers, and I am trying to understand why the compiler emits this code. I have this very simple C program:

// main.c
int b = 7;

int main(){
    int a = 5 + b;
    return 0;
}

I compiled this using gcc 13.2 by running

gcc -g -c -o main.o main.c

I inspected the code by running

objdump -M intel -d main.o

And this is the result that I get

main.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   rbp
   5:   48 89 e5                mov    rbp,rsp
   8:   8b 05 00 00 00 00       mov    eax,DWORD PTR [rip+0x0]        # e <main+0xe>
   e:   83 c0 05                add    eax,0x5
  11:   89 45 fc                mov    DWORD PTR [rbp-0x4],eax
  14:   b8 00 00 00 00          mov    eax,0x0
  19:   5d                      pop    rbp
  1a:   c3                      ret

What I don't understand is the instruction at offset 8, mov eax,DWORD PTR [rip+0x0]. As far as I understand, rip is the instruction pointer, and while an instruction executes it holds the address of the next instruction. That would mean the code tries to load the value at the address [rip+0x0], which is the next instruction itself. But the next instruction is not the value we want?

I am aware that this is an unlinked object file and that these addresses can change when linking happens (I did link the file with gcc -o final main.o). The address after linking actually makes sense: it points somewhere else, I presume into the .data section that contains the actual value. However, I don't understand the code in the object file.

I have included the code in this Compiler Explorer. In the default view the code is OK, but if you turn on the option "Link to Binary" in the "Output" options with -c as a compiler option, you'll see the same output as objdump. Without -c, you'll see a sensible address from the actual linked executable, or with "compile to binary object" you'll see the same [RIP+0] but with an extra line showing the symbol name it's supposed to reference.

EDIT: I tried the suggestion by @Peter Cordes and still got the same result. Here is another example that I think demonstrates my point better:

int b = 7;
int c = 3;

int main(){
    int a = 5 + b;
    int d = 6 + c;
    return 0;
}

And this is the disassembly of the object file, produced with gcc -c and objdump -drwC -Mintel:

main.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   rbp
   5:   48 89 e5                mov    rbp,rsp
   8:   8b 05 00 00 00 00       mov    eax,DWORD PTR [rip+0x0]        # e <main+0xe>
                        a: R_X86_64_PC32        b-0x4
   e:   83 c0 05                add    eax,0x5
  11:   89 45 f8                mov    DWORD PTR [rbp-0x8],eax
  14:   8b 05 00 00 00 00       mov    eax,DWORD PTR [rip+0x0]        # 1a <main+0x1a>
                        16: R_X86_64_PC32       c-0x4
  1a:   83 c0 06                add    eax,0x6
  1d:   89 45 fc                mov    DWORD PTR [rbp-0x4],eax
  20:   b8 00 00 00 00          mov    eax,0x0
  25:   5d                      pop    rbp
  26:   c3                      ret

As you can see, both global variables appear to be accessed through the "same" address, [rip+0x0]. It is not clear to me why this is the case or what [rip+0x0] means here.
