I know benchmarking is a delicate subject, and that simple, poorly thought-out benchmarks are mostly meaningless for performance comparisons. What I have right now, though, is a small and contrived example that I think should be easy to explain. So even if the question itself seems unhelpful, an answer would at least help me understand benchmarking.
So, here I go.
I was experimenting with simple API design in C, emulating run-time polymorphism via void *. Then I compared it with the same thing implemented in C++ using regular virtual functions. Here is the code:
#include <cstdlib>
#include <cstdio>
#include <cstring>

int dummy_computation()
{
    return 64 / 8;
}

/* animal library, everything is prefixed with al for namespacing */
#define AL_SUCCESS 0
#define AL_UNKNOWN_ANIMAL 1
/* type check: compares the stored type-name string against #type */
#define AL_IS_TYPE_OF(animal, type) \
    (strcmp(((type *)(animal))->animal_type, #type) == 0)

typedef struct {
    const char* animal_type;
    const char* name;
    const char* sound;
} al_dog;

inline int make_dog(al_dog** d) {
    *d = (al_dog*) malloc(sizeof(al_dog));
    (*d)->animal_type = "al_dog";
    (*d)->name = "leslie";
    (*d)->sound = "bark";
    return AL_SUCCESS;
}

inline int free_dog(al_dog* d) {
    free(d);
    return AL_SUCCESS;
}

typedef struct {
    const char* animal_type;
    const char* name;
    const char* sound;
} al_cat;

inline int make_cat(al_cat** c) {
    *c = (al_cat*) malloc(sizeof(al_cat));
    (*c)->animal_type = "al_cat";
    (*c)->name = "garfield";
    (*c)->sound = "meow";
    return AL_SUCCESS;
}

inline int free_cat(al_cat* c) {
    free(c);
    return AL_SUCCESS;
}

int make_sound(void* animal) {
    if(AL_IS_TYPE_OF(animal, al_cat)) {
        al_cat *c = (al_cat*) animal;
        return dummy_computation();
    } else if(AL_IS_TYPE_OF(animal, al_dog)) {
        al_dog *d = (al_dog*) animal;
        return dummy_computation();
    } else {
        printf("unknown animal\n");
        return 0;
    }
}
/* c style library finishes here */
/* cpp library with OOP */
struct animal {
    animal(const char* n, const char* s)
    :name(n)
    ,sound(s)
    {}
    virtual int make_sound() {
        return dummy_computation();
    }
    const char* name;
    const char* sound;
};

struct cat : animal {
    cat()
    :animal("garfield", "meow")
    {}
};

struct dog : animal {
    dog()
    :animal("leslie", "bark")
    {}
};
/* cpp library finishes here */
I have a function called dummy_computation, just to make sure some computation is going on in the benchmark. For an example like this I would normally implement different printf calls for barking, meowing, etc., but printf is not easy to benchmark on quick-bench.com. What I actually want to benchmark is the run-time polymorphism implementation, so I wrote a small function and used it as a filler in both the C and the C++ implementation.
Now, on quick-bench.com, I have the following benchmark:
static void c_style(benchmark::State& state) {
    // Code inside this loop is measured repeatedly
    for (auto _ : state) {
        al_dog* d = NULL;
        al_cat* c = NULL;
        make_dog(&d);
        make_cat(&c);
        int i1 = make_sound(d);
        benchmark::DoNotOptimize(i1);
        int i2 = make_sound(c);
        benchmark::DoNotOptimize(i2);
        free_dog(d);
        free_cat(c);
    }
}
// Register the function as a benchmark
BENCHMARK(c_style);

static void cpp_style(benchmark::State& state) {
    for (auto _ : state) {
        animal* a1 = new dog();
        animal* a2 = new cat();
        int i1 = a1->make_sound();
        benchmark::DoNotOptimize(i1);
        int i2 = a2->make_sound();
        benchmark::DoNotOptimize(i2);
        delete a1;
        delete a2;
    }
}
BENCHMARK(cpp_style);
I added the DoNotOptimize calls so that the virtual calls would not end up being optimized out.
The whole benchmark can be found here, in case recreating it seems painful:
https://quick-bench.com/q/ezul9hDXTjfSWijCfd2LMUUEH1I
Now, to my surprise, the C version comes out 27 times faster in the results. I expected the C++ version to take some performance hit because it is a more refined solution, but definitely not a 27-fold one.
Can someone explain these results? Do virtual function calls really incur this much overhead compared to C? Or is it the way I set up this benchmarking experiment that is completely meaningless? If so, how would one more correctly benchmark such issues?
It's because you're not implementing the same thing. If you do an if-chain or switch-chain in C, then you have (mathematically) a discriminated union, which is std::variant in C++.
If you'd like the C++ version to be ported to C, then you need function pointers. It'll very likely be equally slow. The reason is that virtual means forward compatible: any code, including a library loaded later, can derive from your base and implement the virtual methods. This means that when you compile your base module, you sometimes don't even know which (descendant) classes it might need to handle (the type system is open). std::variant provides no such forward compatibility: it is closed (limited to a fixed list of types).
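To make the open/closed distinction concrete, here is a minimal sketch of both designs side by side. It reuses dummy_computation from the question; every other name (fp_animal, fp_call, v_dog, v_cat, v_animal, sound_visitor, demo_total) is made up for illustration, and the variant half assumes a C++17 compiler. It only shows the shape of each approach and makes no claim about how either one would score in the benchmark.

```cpp
#include <cassert>
#include <variant>

// Same filler as in the question.
int dummy_computation() { return 64 / 8; }

// (1) Open set, C style: every object carries a function pointer.
// This is roughly what a virtual call does through the vtable, and
// code compiled later can add new animals without touching fp_call.
struct fp_animal {                        // hypothetical name
    int (*make_sound)(const fp_animal*);  // set by the "constructor"
    const char* name;
    const char* sound;
};

int fp_dog_sound(const fp_animal*) { return dummy_computation(); }
int fp_cat_sound(const fp_animal*) { return dummy_computation(); }

fp_animal fp_make_dog() { return {fp_dog_sound, "leslie", "bark"}; }
fp_animal fp_make_cat() { return {fp_cat_sound, "garfield", "meow"}; }

// Indirect call: the callee is only known at run time.
int fp_call(const fp_animal* a) { return a->make_sound(a); }

// (2) Closed set, C++ style: the alternatives are fixed at compile
// time, which is the std::variant counterpart of the if-chain.
struct v_dog { const char* name = "leslie";   const char* sound = "bark"; };
struct v_cat { const char* name = "garfield"; const char* sound = "meow"; };
using v_animal = std::variant<v_dog, v_cat>;

struct sound_visitor {
    int operator()(const v_dog&) const { return dummy_computation(); }
    int operator()(const v_cat&) const { return dummy_computation(); }
};

int v_call(const v_animal& a) { return std::visit(sound_visitor{}, a); }

// Usage: exercise both designs once each for a dog and a cat.
int demo_total() {
    fp_animal d = fp_make_dog();
    fp_animal c = fp_make_cat();
    v_animal vd = v_dog{};
    v_animal vc = v_cat{};
    return fp_call(&d) + fp_call(&c) + v_call(vd) + v_call(vc);
}
```

Adding a new animal to (1) only needs a new function and a new "constructor", possibly in another library. Adding one to (2) means editing the v_animal type list and the visitor, so everything that uses them has to be recompiled.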