Link symbols in an ELF executable


My scenario is that I am given only a dynamically-linked, non-stripped ELF executable that contains some functions (symbols in text segment) I would like to call in my own binary.

Let's consider a reduced example. Suppose we have an ELF executable named executable (compiled with gcc -o executable executable.c) and a main C file main.c.

// executable.c
// gcc -o executable executable.c
// We will only have the `executable` and the source is unavailable 

int foobar(void) { return 87; }
int main(void) { return 0; }

Running nm executable shows that it has the following symbols:

0000000000003e00 d _DYNAMIC
0000000000003fc0 d _GLOBAL_OFFSET_TABLE_
0000000000002000 R _IO_stdin_used
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
00000000000020e8 r __FRAME_END__
0000000000002004 r __GNU_EH_FRAME_HDR
0000000000004010 D __TMC_END__
000000000000038c r __abi_tag
0000000000004010 B __bss_start
                 w __cxa_finalize@GLIBC_2.2.5
0000000000004000 D __data_start
00000000000010e0 t __do_global_dtors_aux
0000000000003df8 d __do_global_dtors_aux_fini_array_entry
0000000000004008 D __dso_handle
0000000000003df0 d __frame_dummy_init_array_entry
                 w __gmon_start__
                 U __libc_start_main@GLIBC_2.34
0000000000004010 D _edata
0000000000004018 B _end
0000000000001148 T _fini
0000000000001000 T _init
0000000000001040 T _start
0000000000004010 b completed.0
0000000000004000 W data_start
0000000000001070 t deregister_tm_clones
0000000000001129 T foobar
0000000000001120 t frame_dummy
0000000000001138 T main
00000000000010a0 t register_tm_clones

// main.c

#include <stdio.h>

extern int foobar(void);

int main(void)
{
    printf("%d\n", foobar());
    return 0;
}

Obviously, if I were to link it naively with gcc -o main main.c executable, I would end up with multiple definitions of _init, _start, main, etc. Newer versions of the linker refuse outright and report: cannot use executable file 'executable' as input to a link.

So is it possible to compile the C file into a binary (or library) that links against specific symbols in an executable? Or is there some way to extract these symbols from the executable?

Thanks!


There are 2 best solutions below

Employed Russian

By nm executable, we could verify that it has the following symbols:

You should also run nm -D executable to see which (if any) of the defined symbols are exported for dynamic linking.

If foobar is exported, your task is trivial: create a shared library that calls foobar and use LD_PRELOAD to "inject" it.
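A minimal sketch of that injection approach, assuming foobar really does appear in the nm -D output. The file names inject.c and inject.so and the helper call_exported are made up for illustration. The library looks the symbol up through the dynamic linker when it is loaded and calls it before the target's main() runs.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

typedef int (*int_fn)(void);

/* Look up `name` among the symbols visible to the dynamic linker (the main
   program and the libraries it loaded) and call it as int(void).
   Returns 0 and stores the result, or -1 if the symbol is not exported. */
static int call_exported(const char *name, int *result)
{
    assert(name != NULL && result != NULL);

    void *self = dlopen(NULL, RTLD_NOW);   /* handle for the running program */
    if (self == NULL)
        return -1;

    void *sym = dlsym(self, name);
    if (sym == NULL) {
        dlclose(self);
        return -1;
    }

    *result = ((int_fn)sym)();
    dlclose(self);
    return 0;
}

/* Runs when the library is loaded, i.e. before the target's main(). */
__attribute__((constructor))
static void inject(void)
{
    int value;

    if (call_exported("foobar", &value) == 0)
        fprintf(stderr, "foobar() returned %d\n", value);
    else
        fprintf(stderr, "foobar is not exported\n");
}
```

Something like gcc -shared -fPIC inject.c -o inject.so followed by LD_PRELOAD=./inject.so ./executable should exercise it (glibc older than 2.34 also needs -ldl). If foobar is only in the regular symbol table, the lookup fails and the second message is printed.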

But chances are that foobar is not exported, and in that case calling it is hard: you'll have to effectively generate the call to foobar using just-in-time compilation.

This task is easier if you only need the solution to work for one particular executable (in other words, if you know the address of the function you want to call ahead of time), and if that executable is not a PIE.
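To illustrate why a known address and a non-PIE executable make this easier, here is a rough sketch. It only makes sense for code that already runs inside the target's process (for example from a preloaded library). The names function_at and sample are hypothetical; sample merely stands in for the target function so the snippet can be tried on its own.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*int_fn)(void);

/* Turn an address printed by nm into a callable pointer.
   For a non-PIE executable the nm value is already the runtime address, so
   load_base is 0. For a PIE the runtime base address must be added, and that
   base is only known once the program has been loaded. */
static int_fn function_at(uintptr_t load_base, uintptr_t nm_value)
{
    assert(nm_value != 0);
    return (int_fn)(load_base + nm_value);
}

/* Stand-in for the real target function (assumption for this sketch). */
static int sample(void)
{
    return 87;
}
```

In the non-PIE case the call would look like function_at(0, 0x401129)(), where 0x401129 is a made-up address of the kind nm prints for such a binary. In the PIE case the first argument has to be discovered at run time, which is the part that makes the general problem harder.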

mevets

Here is a really crude way:

#include <fcntl.h>
#include <stdint.h>   /* intptr_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>


/* Abort with a message if the condition is false. */
#define Z(x) do { if (!(x)) { fprintf(stderr, "error: %s evaluated false\n", #x); exit(1); } } while (0)

int main(int ac, char **av) {
    int o;
    intptr_t f = 0;
    while ((o = getopt(ac,av,"f:")) != -1) {
        switch (o) {
        case 'f':
            f = strtol(optarg, 0, 0);
            break;
        default:
            exit(1);
        }
    }

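    while (optind < ac) {
        int fd = open(av[optind++], O_RDONLY);
        Z(fd != -1);
        off_t len = lseek(fd, 0, SEEK_END);   /* file size */
        Z(len > 0);
        /* Map the whole file at offset 0, so file offset f lives at p + f. */
        char *p = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
        Z(p != MAP_FAILED);
        /* Make the private copy executable (and writable). */
        Z(mprotect(p, len, PROT_EXEC|PROT_READ|PROT_WRITE) == 0);
        /* Treat the bytes at file offset f as an int(void) function. */
        int (*fp)(void) = (void *)(p + f);
        printf("%d\n", fp());
        close(fd);
        munmap(p, len);
    }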
    return 0;
}

To run it, you need to find your symbol address:

$ nm p1 | grep foobar
0000000000001129 T foobar
$ # build above program:
$ gcc p2.c -o p2
$ # run it:
$ ./p2 -f 0x1129 ./p1
87
$
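If you would rather not copy the address out of nm by hand, the lookup can be done in C. The following is only a sketch for 64-bit ELF files with a regular symbol table (i.e. not stripped); find_function is a made-up helper, and it trusts the file to be reasonably well formed.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Search .symtab of an ELF64 file for a defined function called `name`.
   Returns 0 and stores its st_value (the address nm prints), else -1. */
static int find_function(const char *path, const char *name, uint64_t *value)
{
    assert(path != NULL && name != NULL && value != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);
    if (end < (off_t)sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)end;
    const unsigned char *img = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (img == MAP_FAILED)
        return -1;

    int rc = -1;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)img;
    int ok = memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
             eh->e_ident[EI_CLASS] == ELFCLASS64 &&
             eh->e_shentsize == sizeof(Elf64_Shdr) &&
             eh->e_shoff <= len &&
             (len - eh->e_shoff) / sizeof(Elf64_Shdr) >= eh->e_shnum;

    if (ok) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(img + eh->e_shoff);
        for (unsigned i = 0; i < eh->e_shnum && rc != 0; i++) {
            if (sh[i].sh_type != SHT_SYMTAB || sh[i].sh_link >= eh->e_shnum)
                continue;
            /* sh_link of a symbol table names its string table section. */
            const Elf64_Shdr *strsec = &sh[sh[i].sh_link];
            if (sh[i].sh_offset > len || sh[i].sh_size > len - sh[i].sh_offset ||
                strsec->sh_offset > len || strsec->sh_size > len - strsec->sh_offset)
                continue;
            const Elf64_Sym *sym = (const Elf64_Sym *)(img + sh[i].sh_offset);
            const char *str = (const char *)(img + strsec->sh_offset);
            size_t count = sh[i].sh_size / sizeof(Elf64_Sym);
            for (size_t k = 0; k < count; k++) {
                if (ELF64_ST_TYPE(sym[k].st_info) != STT_FUNC ||
                    sym[k].st_shndx == SHN_UNDEF ||
                    sym[k].st_name >= strsec->sh_size)
                    continue;
                if (strcmp(str + sym[k].st_name, name) == 0) {
                    *value = sym[k].st_value;
                    rc = 0;
                    break;
                }
            }
        }
    }
    munmap((void *)img, len);
    return rc;
}
```

Calling find_function("./p1", "foobar", &addr) should then yield the same value that nm shows, which can be used in place of the -f argument.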

Why this works:

  1. The ELF executable layout is optimized for demand paging, so text symbols sit at their file offsets, provided you aren't doing anything fancy with interpreters and the like.
  2. Your ELF executable doesn't make any system calls, or even library calls, so it doesn't matter that its GOT and PLT aren't hooked up right.
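The first point can be checked rather than assumed. Below is a small sketch that walks the program headers of an ELF64 image already in memory (for instance the mmap'd file) and translates a virtual address into a file offset. The helper name vaddr_to_file_offset is invented here. When the result equals the input address, the trick of adding the nm value to the mapping base is valid for that symbol.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Translate a virtual address into a file offset using the PT_LOAD program
   headers of an ELF64 image held in memory.
   Returns 0 on success, -1 if the image is not ELF64 or no segment's
   file-backed part covers the address. */
static int vaddr_to_file_offset(const unsigned char *image, size_t len,
                                uint64_t vaddr, uint64_t *offset_out)
{
    assert(image != NULL && offset_out != NULL);

    Elf64_Ehdr eh;
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        uint64_t pos = eh.e_phoff + (uint64_t)i * sizeof(Elf64_Phdr);
        if (pos > len || len - pos < sizeof(Elf64_Phdr))
            return -1;

        Elf64_Phdr ph;
        memcpy(&ph, image + pos, sizeof ph);
        if (ph.p_type != PT_LOAD)
            continue;

        /* Only the first p_filesz bytes of a segment exist in the file. */
        if (vaddr >= ph.p_vaddr && vaddr - ph.p_vaddr < ph.p_filesz) {
            *offset_out = ph.p_offset + (vaddr - ph.p_vaddr);
            return 0;
        }
    }
    return -1;
}
```

For a text symbol such as foobar the offset typically comes back unchanged. For data symbols it usually does not, because the writable segment is commonly placed at a virtual address that differs from its file offset.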

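It is fragile, though: even changing p1.c:foobar to int v = 87; int foobar() { return v++; } makes the call return 0 instead of 87, most likely because a flat mapping of the file does not place the data where the code expects to find it. But it only takes a bit of massaging to handle the ELF sections, fix up the GOTs and PLTs, and have a generic way to load a program as a weird DLL.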