I would like to use perf to measure L3 cache accesses/misses and all context-switch events, broken down per thread. No sampling, just the raw counter totals.
However, I would prefer not to have to look up thread IDs whilst the application is running.
I get the impression perf has a data-collection stage that is separate from the data-visualization stage.
Is it possible to record the statistics first and then show the results per thread?
Do I first run sudo perf record -e [events] ./my_app?
Afterwards I would run sudo perf report, but how do I list the thread IDs I am interested in, or instruct perf to group the results by thread ID?