I have a Google Benchmark such as the following.
#include "benchmark/benchmark.h"
#include <cstring>

static void bench_memset(benchmark::State& state) {
    char buffer[16];
    for (auto _ : state) {
        memset(buffer, '\0', 16);
        benchmark::ClobberMemory();
    }
}

BENCHMARK(bench_memset);
BENCHMARK_MAIN();
And I run it with the following command.
./my_benchmark --benchmark_perf_counters=BRANCH-MISSES,CACHE-MISSES,CACHE-REFERENCES --benchmark_counters_tabular=true
Which results in the following data.
---------------------------------------------------------------------------------------------------
Benchmark              Time             CPU   Iterations BRANCH-MISSES CACHE-MISSES CACHE-REFERENCES
---------------------------------------------------------------------------------------------------
bench_memset       0.611 ns        0.611 ns   1000000000            7n        1000p              52n
I generally understand the concept of branch and cache misses, but I don't understand the meaning of the 'n' and 'p' that are printed after the perf counter metrics.
I searched both Google Benchmark's documentation and Perf's documentation, but neither seems to mention this.
I also notice that if I bump up the size of my workload,
static void bench_memset(benchmark::State& state) {
    char buffer[4096];
    for (auto _ : state) {
        memset(buffer, '\0', 4096);
        benchmark::ClobberMemory();
    }
}
Then the 'n' will change to 'u', which is also mysterious.
---------------------------------------------------------------------------------------------------
Benchmark              Time             CPU   Iterations BRANCH-MISSES CACHE-MISSES CACHE-REFERENCES
---------------------------------------------------------------------------------------------------
bench_memset        96.0 ns         96.0 ns      6959398      1.14952u            0         81.6163u
I've also noticed with other benchmarks that there may be no letter at all.
What do these letters stand for?
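--benchmark_perf_counters lists additional perf counters to collect, in libpfm format; the events themselves are described in the manual of the perfmon2 project. The letters u, n and p are SI prefixes (micro, nano, pico) that Google Benchmark adds when it prints the value. The counters are hardware event counts, but Benchmark does not print the raw count: it prints the count divided by the number of iterations. In other words, these values are average events per iteration, which for rare events can be read as probabilities. When there is no letter, the value needed no prefix.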
IMHO it would be better if it showed just the raw counters; the fractions could easily be computed, e.g. 7 / 1e9 or 568 / 6959398.
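To go the other way and recover an approximate raw count from the printed value, multiply by the SI scale and by the iteration count. The following is only a sketch: si_scale and total_events are hypothetical helper names, not part of Google Benchmark, and the upper-case suffixes are standard SI prefixes that I assume Benchmark prints the same way.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Scale factor for the SI suffix printed after a counter value.
// 'p', 'n' and 'u' appear in the output above; the others are standard SI
// prefixes and are assumed (not verified) to be printed the same way.
inline double si_scale(char suffix) {
    switch (suffix) {
        case 'p': return 1e-12;  // pico
        case 'n': return 1e-9;   // nano
        case 'u': return 1e-6;   // micro
        case 'm': return 1e-3;   // milli
        case 'k': return 1e3;    // kilo
        case 'M': return 1e6;    // mega
        case 'G': return 1e9;    // giga
        default:  return 1.0;    // no suffix: value is already in base units
    }
}

// Recover the approximate total number of events from a per-iteration value.
// The printed value is rounded, so the result is rounded to the nearest integer.
inline std::int64_t total_events(double value, char suffix, std::int64_t iterations) {
    assert(iterations >= 0);
    return std::llround(value * si_scale(suffix) * static_cast<double>(iterations));
}
```

For the first run, total_events(7, 'n', 1000000000) gives the 7 branch misses from the arithmetic above, and for the second run total_events(81.6163, 'u', 6959398) gives the 568 cache references.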