Trace all threads of a non-child process reliably and transparently with ptrace

I am writing a small utility that needs to trace all existing threads of another running, non-child process. In addition, the tracer should do two things:

  1. Avoid racing against the tracee process on thread creation and termination ("reliably").
  2. Avoid being noticed by the real parent of the tracee process ("transparently"). For example, I do not want to trigger the shell job control.

Is there a way to do this? I have tried two approaches, but neither works fully.

Thanks!


Things I tried

1. trace then stop

The first one is:

  • Attach to the thread group leader (thread id == process id) with ptrace(2);
  • Send a SIGSTOP to the thread with tgkill(2);
  • Wait for the signal-delivery stop and inject the SIGSTOP;
  • Wait for the group stop, at which point I assume all threads of the process should stop executing;
  • Walk the '/proc/PID/task' directory and attach to all the other threads;
  • Resume the leader thread with some PTRACE_* request.

However, with this approach the real parent of the tracee is notified of the SIGSTOP. Here is an example program:

/* tracer.c 
   compile with: cc -std=c99 -o tracer tracer.c */

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <errno.h>

#define QUOTE(x) #x
#define STR(x) QUOTE(x)
#define die(s)                                  \
    do {                                        \
        perror("[" STR(__LINE__) "] " s);       \
        exit(1);                                \
    } while (0)

int
main(int argc, char **argv)
{
    pid_t pid = strtol(argv[1], NULL, 10);
    int opts = PTRACE_O_TRACESYSGOOD
        | PTRACE_O_TRACECLONE
        | PTRACE_O_TRACEFORK
        | PTRACE_O_TRACEVFORK;
    int status, r, sig, event;

    if (ptrace(PTRACE_SEIZE, pid, 0, 0) < 0) die("ptrace");
    if (tgkill(pid, pid, SIGSTOP) < 0) die("tgkill");
    while ((r = waitpid(pid, &status, __WALL)) > 0) {
        sig = WSTOPSIG(status);
        event = status >> 16;
        printf("[" STR(__LINE__) "] event: %3d, signal: %3d\n", event, sig);
        if (!event) {
            /* signal delivery stop */
            if (ptrace(PTRACE_CONT, pid, 0, sig) < 0) die("ptrace");
        } else if (event == PTRACE_EVENT_STOP && sig == SIGSTOP) {
            /* desired group-stop */
            if (ptrace(PTRACE_SETOPTIONS, pid, 0, opts) < 0) die("ptrace");
            break;
        } else {
            /* fallback case */
            if (ptrace(PTRACE_CONT, pid, 0, 0) < 0) die("ptrace");
        }
    }
    if (r < 0) die("waitpid");
    /* attach to threads here... */
    if (ptrace(PTRACE_SYSCALL, pid, 0, 0) < 0) die("ptrace");
    while (1) {
        if (waitpid(-1, &status, __WALL) < 0) {
            if (errno == ECHILD) return 0;
            die("waitpid");
        }
        sig = WSTOPSIG(status);
        event = status >> 16;
        printf("[" STR(__LINE__) "] event: %3d, signal: %3d\n", event, sig);
        if (ptrace(PTRACE_SYSCALL, pid, 0, 0) < 0) {
            if (errno == ESRCH) return 0;
            die("ptrace");
        }
    }
}
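
For reference, the step elided above ("attach to threads here...") could look roughly like the sketch below. The helper names list_threads and seize_other_threads and the fixed 256-entry buffer are my own assumptions for illustration, not part of the actual utility. The sketch also assumes the thread group is already stopped, so no new threads appear while the directory is being read.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical helper: store up to max thread IDs of pid in tids[].
   Returns the number of threads found (which may exceed max),
   or -1 if /proc/PID/task cannot be opened. */
int
list_threads(pid_t pid, pid_t *tids, int max)
{
    char path[64];
    struct dirent *de;
    DIR *dir;
    int n = 0;

    snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);
    if ((dir = opendir(path)) == NULL) return -1;
    while ((de = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(de->d_name, &end, 10);
        /* skip "." and ".." and anything else that is not a number */
        if (end == de->d_name || *end != '\0') continue;
        if (n < max) tids[n] = (pid_t)tid;
        n++;
    }
    closedir(dir);
    return n;
}

/* Hypothetical helper: PTRACE_SEIZE every thread except the leader.
   Returns the number of threads seized, or -1 on error. */
int
seize_other_threads(pid_t pid, long opts)
{
    pid_t tids[256];
    int i, n, seized = 0;

    if ((n = list_threads(pid, tids, 256)) < 0) return -1;
    if (n > 256) n = 256; /* sketch only: a real tool would grow the array */
    for (i = 0; i < n; i++) {
        if (tids[i] == pid) continue; /* the leader is already attached */
        if (ptrace(PTRACE_SEIZE, tids[i], NULL, (void *)opts) < 0) {
            if (errno == ESRCH) continue; /* thread exited in the meantime */
            return -1;
        }
        seized++;
    }
    return seized;
}
```

In tracer.c this would be called as seize_other_threads(pid, opts) right after the group stop is observed. Whether a single pass over the directory is enough depends on all threads really being stopped at that point, which is exactly the part I am unsure about.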

On my Debian 11 system with Linux kernel 5.10.0-26-amd64, if I run in a shell (Bash in my case):

sleep 60

and in another shell:

./tracer <PID of sleep>

I can observe the desired signal delivery and group stop in the output:

[34] event:   0, signal:  19  <- signal delivery
[34] event: 128, signal:  19  <- group stop
[57] event:   0, signal: 133
[57] event:   0, signal: 133
...

However, the shell then puts the sleep process into the background, so this approach does not work "transparently".
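
To make the event and signal columns in that output easier to read, here is a small sketch of how I understand the wait status to decode for a tracee attached with PTRACE_SEIZE. The enum and the function name classify_stop are hypothetical names introduced only for this illustration. The syscall-stop value 133 (SIGTRAP | 0x80) assumes PTRACE_O_TRACESYSGOOD is set.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical labels for the kinds of stop seen in the output above. */
enum stop_kind {
    NOT_STOPPED,          /* exited or killed, not a ptrace stop */
    SIGNAL_DELIVERY_STOP, /* event 0, signal = the signal being delivered */
    SYSCALL_STOP,         /* event 0, signal = SIGTRAP | 0x80 (133) */
    GROUP_STOP,           /* event 128, signal = a stopping signal */
    EVENT_STOP_OTHER,     /* event 128, other signal, e.g. PTRACE_INTERRUPT */
    OTHER_EVENT           /* PTRACE_EVENT_CLONE, PTRACE_EVENT_FORK, ... */
};

enum stop_kind
classify_stop(int status)
{
    int sig, event;

    if (!WIFSTOPPED(status)) return NOT_STOPPED;
    sig = WSTOPSIG(status);
    event = (status >> 16) & 0xffff;

    if (event == PTRACE_EVENT_STOP) {
        switch (sig) {
        case SIGSTOP:
        case SIGTSTP:
        case SIGTTIN:
        case SIGTTOU:
            return GROUP_STOP;
        default:
            return EVENT_STOP_OTHER;
        }
    }
    if (event != 0) return OTHER_EVENT;
    /* requires PTRACE_O_TRACESYSGOOD, otherwise syscall stops report SIGTRAP */
    if (sig == (SIGTRAP | 0x80)) return SYSCALL_STOP;
    return SIGNAL_DELIVERY_STOP;
}
```

With this reading, the first output line (event 0, signal 19) is a signal-delivery stop, the second (event 128, signal 19) is the group stop, and the lines with signal 133 are syscall stops.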

2. stop and trace

This is the approach I am currently using. Basically:

  • Send a SIGSTOP to the thread group leader of the tracee;
  • Attach to the leader thread and stop it (with PTRACE_SEIZE and PTRACE_INTERRUPT);
  • Attach to all the other threads;
  • Send a SIGCONT to the leader thread.

Honestly, this is just a few random pieces glued together, but it appears to work well in my experiments. I am including it here because it might be a reasonable starting point for a proper solution, and in the hope of getting some clarification.

I am especially confused by the signals and events seen in the tracer, as shown with the following example.

Here is a simplified program for this approach (the full version I am using in my utility program is available here):

/* tracer2.c 
   compile with: cc -std=c99 -o tracer2 tracer2.c */

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <errno.h>

#define QUOTE(x) #x
#define STR(x) QUOTE(x)
#define die(s)                                  \
    do {                                        \
        perror("[" STR(__LINE__) "] " s);       \
        exit(1);                                \
    } while (0)

int
main(int argc, char **argv)
{
    pid_t pid = strtol(argv[1], NULL, 10);
    int opts = PTRACE_O_TRACESYSGOOD
        | PTRACE_O_TRACECLONE
        | PTRACE_O_TRACEFORK
        | PTRACE_O_TRACEVFORK;
    int status, r, sig, event;

    if (tgkill(pid, pid, SIGSTOP) < 0) die("tgkill");
    if (ptrace(PTRACE_SEIZE, pid, 0, opts) < 0) die("ptrace");
    if (ptrace(PTRACE_INTERRUPT, pid, 0, 0) < 0) die("ptrace");
    /* attach to threads here... */
    if (tgkill(pid, pid, SIGCONT) < 0) die("tgkill");
    while (1) {
        if (waitpid(-1, &status, __WALL) < 0) {
            if (errno == ECHILD) return 0;
            die("waitpid");
        }
        sig = WSTOPSIG(status);
        event = status >> 16;
        printf("[" STR(__LINE__) "] event: %3d, signal: %3d\n", event, sig);
        int req = PTRACE_SYSCALL;
        switch (event) {
        case 0:
            sig = sig == (SIGTRAP | 0x80) ? 0 : sig;
            break;
        case PTRACE_EVENT_STOP:
            switch (sig) {
            case SIGSTOP:
            case SIGTSTP:
            case SIGTTIN:
            case SIGTTOU:
                req = PTRACE_LISTEN;
            }
            sig = 0;
            break;
        default:
            sig = 0;
        }
        if (ptrace(req, pid, 0, 0) < 0) {
            if (errno == ESRCH) return 0;
            die("ptrace");
        }
    }
}

If I run:

./tracer2 <PID of sleep>

This time the shell is not notified of the SIGSTOP (at least not in my experiments). However, I doubt this works reliably, because nothing synchronizes the tracer with the stopped state of the tracee, and I do not know why the tracee does not notify its parent this time.
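
One thing I considered for checking the stopped state is reading the state field from /proc/PID/stat, sketched below. The helper name proc_state is hypothetical. According to proc(5), 'T' means stopped on a signal and 't' means tracing stop. Note that this is only a snapshot and, as far as I can tell, polling it does not give a real synchronization guarantee.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return the state character from /proc/PID/stat
   ('R', 'S', 'T', 't', ...), or -1 if it cannot be read or parsed. */
int
proc_state(pid_t pid)
{
    char path[64], buf[512];
    char *p;
    FILE *f;
    size_t n;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    if ((f = fopen(path, "r")) == NULL) return -1;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* The format is "pid (comm) state ...". The comm field may itself
       contain spaces and parentheses, so look for the LAST ')'. */
    p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0') return -1;
    return (unsigned char)p[2];
}
```

For a single thread, the same field is available in /proc/PID/task/TID/stat.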

Moreover, I get the following output:

[41] event: 128, signal:   5  <- PTRACE_INTERRUPT
[41] event:   0, signal:  18  <- SIGCONT delivery
[41] event:   0, signal: 133
[41] event:   0, signal: 133
...

So it seems that the process has already stopped before the PTRACE_INTERRUPT. I tried a few times and the results were the same, but I am not sure whether that is just a coincidence. (If it is always true, then I guess I can just wait for the PTRACE_INTERRUPT stop to ensure that all threads are stopped.)

What also confuses me is that if I do not send the SIGCONT, I get the following output:

[41] event:   0, signal:  19  <- SIGSTOP delivery
[41] event: 128, signal:   5  <- PTRACE_INTERRUPT
[41] event:   0, signal: 133
[41] event:   0, signal: 133
...

This time we can see the SIGSTOP delivery, which never appeared in my previous experiments with the SIGCONT. I am not sure if this is just how signals work in Linux.

Anyway, this approach appears to work well, but its behavior is mysterious to me, and I am not confident that it really works.
