When a library is included in a C program, where is it placed in the address space?


I understand how an address space is partitioned into code, data, stack and heap. However, I am having trouble mapping what goes where for a given C program.

I know that global variables are in the data section, static variables are in the data section, local variables are in the stack section, and dynamically allocated space is in the heap section. My question is, when including a library into a program, where is it placed in the address space?

I hope this question makes sense.


There are 2 best solutions below


If you have a Linux machine, you can check this yourself as follows:

  1. Create a simple C program with an infinite while loop inside main().
  2. Compile it:

    $ gcc -o main ./main.c -g

  3. Launch it under gdb:

    $ gdb ./main

  4. Run the program, press Ctrl-C to interrupt the loop and get the gdb prompt back, then show the mapping info:

    (gdb) r

    (gdb) info proc mappings

    Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x401000     0x1000        0x0 /tmp/main
            0x600000           0x601000     0x1000        0x0 /tmp/main
            0x601000           0x602000     0x1000     0x1000 /tmp/main
            0x602000           0x623000    0x21000        0x0 [heap]
      0x7ffff7a0d000     0x7ffff7bcd000   0x1c0000        0x0 /lib/x86_64-linux-gnu/libc-2.23.so
      0x7ffff7bcd000     0x7ffff7dcd000   0x200000   0x1c0000 /lib/x86_64-linux-gnu/libc-2.23.so
      0x7ffff7dcd000     0x7ffff7dd1000     0x4000   0x1c0000 /lib/x86_64-linux-gnu/libc-2.23.so
      0x7ffff7dd1000     0x7ffff7dd3000     0x2000   0x1c4000 /lib/x86_64-linux-gnu/libc-2.23.so
      0x7ffff7dd3000     0x7ffff7dd7000     0x4000        0x0
      0x7ffff7dd7000     0x7ffff7dfd000    0x26000        0x0 /lib/x86_64-linux-gnu/ld-2.23.so
      0x7ffff7fd4000     0x7ffff7fd7000     0x3000        0x0
      0x7ffff7ff6000     0x7ffff7ff8000     0x2000        0x0
      0x7ffff7ff8000     0x7ffff7ffa000     0x2000        0x0 [vvar]
      0x7ffff7ffa000     0x7ffff7ffc000     0x2000        0x0 [vdso]
      0x7ffff7ffc000     0x7ffff7ffd000     0x1000    0x25000 /lib/x86_64-linux-gnu/ld-2.23.so
      0x7ffff7ffd000     0x7ffff7ffe000     0x1000    0x26000 /lib/x86_64-linux-gnu/ld-2.23.so
      0x7ffff7ffe000     0x7ffff7fff000     0x1000        0x0
      0x7ffffffdd000     0x7ffffffff000    0x22000        0x0 [stack]
    

So we see that ld.so has mapped the C library at addresses 0x7ffff7a0d000 - 0x7ffff7dd3000. The Offset column is the offset within the ELF file itself. We can check which sections correspond to which offsets using readelf:

    $ readelf -a /lib/x86_64-linux-gnu/libc-2.23.so | less

For example:

 [13] .text             PROGBITS         000000000001f8b0  0001f8b0
       0000000000153214  0000000000000000  AX       0     0     16

This means that the .text section has file offset 0x1f8b0. That offset falls inside the first libc mapping above (file offset 0x0, size 0x1c0000), so the virtual address of the beginning of .text in the process's address space is 0x7ffff7a0d000 + 0x1f8b0 = 0x7ffff7a2c8b0.


You are starting from a bad premise:

"I understand how an address space is partitioned into code, data, stack and heap. However, I am having trouble mapping what goes where for a given C program."

The address space is not partitioned in this way. The address space contains memory. The memory used for a stack is indistinguishable from memory used for a heap. The only thing that makes a stack a stack is that the application allocates from it using a stack pointer. The only thing that makes a heap a heap is that there is a heap manager in the application. You could allocate memory from a heap and assign it to the hardware stack pointer, and your memory would be both heap and stack.

Here is another misconception:

"I know that global variables are in the data section, static variables are in the data section, local variables are in the stack section, and dynamically allocated space is in the heap section."

How things work differs among assemblers, linkers, and systems. However, rational assemblers allow the user to define their own named sections. In many assemblers, I could create Bobs_Data_Section, Freds_Data_Section, and Sams_Data_Section and put global variables in each of them.

With most (but not all) compilers, the programmer has no control over how the compiler creates sections. There is no guarantee that a compiler is going to put global variables in a section called "data." In fact, global variables and static local variables could be in the same section.

Such "sections" are generally only input to the linker. The linker groups sections defined by the assembler and compiler together into memory regions with common access attributes. What goes into the linker as, say, a "data" section comes out of the linker as instructions in the executable to create pages that are read/write.

"My question is, when including a library into a program, where is it placed in the address space?"

This is where the way you are trying to learn how things work becomes a problem. If you view the process address space as just being memory, you can load the library anywhere. The program loader reads the instructions in the executable file and creates pages with the correct attributes (e.g., readonly/noexecute, read/write, readonly/execute) anywhere available in the address space.

If you view the address space as being partitioned into code, data, etc., loading libraries becomes problematic. That makes me wonder why schools and books persist in teaching these nonsensical concepts.