Branch on null vs null object performance

Which is more efficient: using a null object, or branching on nullptr? Example in C++:

void (*callback)() = [](){}; // Could be a class member

void doDoStuff()
    {
    // Some code
    callback();  // Always OK. Defaults to nop
    // More code
    }

vs

void (*callback)() = nullptr; // Could be a class member

void doDoStuff()
    {
    // Some code
    if(callback != nullptr)  // Check if we should do something or not
        {callback();}
    // More code
    }

The null object will always do an indirect function call, assuming the compiler cannot inline it. The nullptr version will always do a branch and, if there is something to do, an indirect function call as well.

Would replacing callback with a pointer to an abstract base class affect the decision?

What about the likelihood of callback being set to something other than nullptr? I guess that if callback is most likely nullptr, then it is faster with the additional branch, right?

3

There are 3 best solutions below

1
Acorn On BEST ANSWER

Which is more efficient: using a null object, or branching on nullptr?

Assuming a given codegen result (that is, the machine code the compiler actually emits), it will depend on your target. And, if you don't assume that, it will also depend on your code, your compiler and probably the moon phase.

Would replacing callback with a pointer to an abstract base class affect the decision?

In most scenarios that will also end up being dynamic dispatch, so the trade-off stays the same: a virtual call is still an indirect call.
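To make the abstract base class variant concrete, here is a minimal sketch of both approaches. All names (Callback, NopCallback, CountingCallback, doStuffNullObject, doStuffBranch) are hypothetical and only serve as illustration; whether the compiler can devirtualize either call depends on the surrounding code.

```cpp
#include <cassert>

// Hypothetical interface, not part of the original question.
struct Callback {
    virtual ~Callback() = default;
    virtual void run() = 0;
};

// Null object: a valid instance whose run() does nothing.
struct NopCallback final : Callback {
    void run() override {}
};

// A callback with an observable side effect, so calls can be counted.
struct CountingCallback final : Callback {
    int calls = 0;
    void run() override { ++calls; }
};

// Variant 1: null object. Always one virtual call, no null check.
inline void doStuffNullObject(Callback& cb) {
    cb.run();
}

// Variant 2: nullable pointer. Always one branch, plus a virtual call
// when a callback is actually set.
inline void doStuffBranch(Callback* cb) {
    if (cb != nullptr) {
        cb->run();
    }
}
```

In both variants the call that does real work goes through the vtable, so the only difference is still "extra branch" vs. "extra call to an empty function".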

What about the likelihood of callback being set to something other than nullptr? I guess that if callback is most likely nullptr, then it is faster with the additional branch, right?

That would depend on the ratio and also on the hardware (whether a branch is faster than an indirect function call, which in turn also depends on the ratio).


Just to make you aware of how subtle things are: let's assume you are talking about modern x86_64 desktop processors, with the call not being inlined and with callback being nullptr 99% of the time. People told you to measure.

So you did and, in this case, let's assume you found out the branch seems faster. Great! So we should start using branches everywhere, right?

Not really, because your branch predictor will likely have made the check effectively free (the branch is almost never taken, so it is almost always predicted correctly), which means you were not really comparing branches vs. indirect calls at all.

0
Xirema On

You have to test the code and find out

There's simply no way to know for certain whether this code is faster or slower without testing it—and the results of that test might not be applicable to your specific application or environment, even if the test has the same compiler and target as your build.
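As a starting point for such a test, here is a rough timing sketch, not a rigorous benchmark. Everything in it is an assumption made for illustration: the helper names (makeCallbacks, timeBranch, timeNullObject), the use of a vector of callbacks instead of a single member pointer, and the randomized order, which is there so the branch predictor cannot simply learn a fixed pattern. Treat the measured times with suspicion and compare them against a profile of your real code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using Callback = void (*)();

// Counter so the non-trivial callback has an observable side effect.
static std::uint64_t g_calls = 0;

static void nop() {}
static void work() { ++g_calls; }

// Builds n callbacks. Each slot is "unset" with probability nullRatio and then
// holds unsetValue (nullptr for the branch variant, &nop for the null object).
// The same seed yields the same set/unset pattern for both variants.
static std::vector<Callback> makeCallbacks(std::size_t n, double nullRatio,
                                           Callback unsetValue, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution unset(nullRatio);
    std::vector<Callback> cbs(n);
    for (std::size_t i = 0; i < n; ++i) {
        cbs[i] = unset(rng) ? unsetValue : &work;
    }
    return cbs;
}

struct Result {
    std::uint64_t calls;               // how often work() actually ran
    std::chrono::nanoseconds elapsed;  // wall-clock time of the loop
};

// Variant with a null check before every call.
static Result timeBranch(const std::vector<Callback>& cbs) {
    g_calls = 0;
    const auto start = std::chrono::steady_clock::now();
    for (Callback cb : cbs) {
        if (cb != nullptr) {
            cb();
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return {g_calls,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}

// Variant that calls unconditionally; unset slots point at nop().
static Result timeNullObject(const std::vector<Callback>& cbs) {
    g_calls = 0;
    const auto start = std::chrono::steady_clock::now();
    for (Callback cb : cbs) {
        cb();
    }
    const auto stop = std::chrono::steady_clock::now();
    return {g_calls,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

A typical run would build both vectors with the same seed and nullRatio (for example 0.99, 0.5 and 0.01), call each timing function several times, and compare the elapsed values across ratios rather than trusting a single number.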

Compilers are very good at optimizing code, especially with respect to behavior that is semi-predictable, and CPUs are often built with branch prediction or similar technologies that reduce the cost of branching. So the answer depends almost entirely on the specific use case of your code, the target device for your code, and the conditions under which it's operating.

0
Nicol Bolas On

None of these questions can be answered a priori. They will depend on a myriad of factors, including but not limited to the specifics of the hardware, the state of the cache at the time of the call, the code around such a call (which may hide any latencies introduced), the quality of the compiler's inlining/devirtualization (which, again, depends on the specific details of all the code) and so forth.

Micro-optimizations of this sort should only be attempted when you have the exact code you intend to optimize, as well as certain knowledge that the code in question is a performance problem worthy of such optimization.