Understanding branch prediction and how predictors are selected


I don't think this is a duplicate: this question is about how to write code that works well with the branch predictor, as well as about validating my general understanding of how it works.

So, here's my general understanding of how branch prediction (BP) works, and please tell me if any part is incorrect:

When a branch is encountered, the CPU attempts to guess whether it will be taken.

This guess depends on which predictor is used, and predictors base their guess on one or more of the following (a toy sketch of the local dynamic case follows the list):

  • The past taken/not-taken history of this particular branch (local dynamic)
  • The recent taken/not-taken history of branches in general (global dynamic)
  • A naive assumption that uses no runtime history, e.g. "backward branches are taken, forward branches are not" (static prediction)
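
To make the local dynamic case concrete, here's the toy model I have in my head: one 2-bit saturating counter per branch, indexed by the branch instruction's address. This is just my own illustrative sketch in Rust, not any particular CPU's real implementation, and the address used is made up:

struct TwoBitPredictor {
    // One 2-bit saturating counter per table entry;
    // values 0-1 predict "not taken", 2-3 predict "taken".
    counters: Vec<u8>,
}

impl TwoBitPredictor {
    fn new(entries: usize) -> Self {
        Self { counters: vec![1; entries] } // start weakly "not taken"
    }

    // Entries are keyed by (a hash of) the branch instruction's address,
    // which is why every execution of the same compiled branch shares state.
    fn index(&self, branch_addr: usize) -> usize {
        branch_addr % self.counters.len()
    }

    fn predict(&self, branch_addr: usize) -> bool {
        self.counters[self.index(branch_addr)] >= 2
    }

    // Once the branch actually resolves, nudge the counter toward the outcome.
    fn update(&mut self, branch_addr: usize, taken: bool) {
        let i = self.index(branch_addr);
        if taken {
            self.counters[i] = (self.counters[i] + 1).min(3);
        } else {
            self.counters[i] = self.counters[i].saturating_sub(1);
        }
    }
}

fn main() {
    let mut bp = TwoBitPredictor::new(1024);
    let branch_addr = 0x4012a8; // made-up address standing in for some conditional branch
    for i in 0..20 {
        let actually_taken = i % 16 == 15; // a branch that fires once per 16 iterations
        let guess = bp.predict(branch_addr);
        println!("iteration {i:2}: predicted {guess}, actual {actually_taken}");
        bp.update(branch_addr, actually_taken);
    }
}

After a short warm-up this predictor settles on "not taken" and is only wrong on the rare taken iterations, which is the behaviour I'm assuming for the rest of the question.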

Also, from what I understand, prediction is keyed on the branch instruction itself (its address), not on any of the data it operates on.

As an example of what I mean, please consider the following vector append snippet:

fn append<T>(&mut self, element: T) {
    // Grow the backing buffer when it's full.
    if self.length == self.capacity {
        self.double_capacity();
    }
    self[self.length] = element;
    self.length += 1;
}

Whether the if will be taken is guessed from what has happened across all prior executions of this compiled branch, rather than from the branch history of any given vector instance.

There's also the question of monomorphization, which, to my understanding, would produce a separately compiled version of this function for each concrete element type, effectively giving each generic instantiation its own copy of the branch and therefore its own predictor state.
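
If that's right, then something like the following (my own illustrative example using std's Vec, and ignoring inlining, which could duplicate the branch at each call site) would end up with one copy of the capacity-check branch per element type:

fn main() {
    let mut ints: Vec<i32> = Vec::new();
    let mut strings: Vec<String> = Vec::new();

    // These two calls go through two distinct monomorphized copies of push
    // (one for i32, one for String), so the capacity-check branch inside push
    // exists at two different instruction addresses with separate predictor state.
    ints.push(1);
    strings.push(String::from("hello"));

    // By contrast, every Vec<i32> in the program shares the same compiled copy
    // of Vec::<i32>::push, and therefore the same branch and predictor history.
    let mut more_ints: Vec<i32> = Vec::new();
    more_ints.push(2);
}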

Here's my main question regarding BP:

It seems like in this snippet a static "not taken" prediction would be close to optimal: the branch is taken at most once per doubling window, so the worst-case accuracy would be 1 - 1/default_capacity (about 94% for a default capacity of 16), and it would only get more accurate as the vectors grow larger.
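
As a sanity check on that number, here's a quick count of how often the length == capacity branch would be taken over n appends, assuming doubling growth from an initial capacity of 16 (my assumption about the growth policy):

fn main() {
    for n in [17u64, 1_000, 1_000_000] {
        let mut capacity = 16u64;
        let mut length = 0u64;
        let mut taken = 0u64; // how often the capacity check fires

        for _ in 0..n {
            if length == capacity {
                taken += 1;
                capacity *= 2; // double_capacity()
            }
            length += 1;
        }

        // A static "not taken" guess is wrong exactly `taken` times.
        let accuracy = 100.0 * (1.0 - taken as f64 / n as f64);
        println!("n = {n:>9}: branch taken {taken} times, static guess right {accuracy:.4}% of the time");
    }
}

For 17 appends the guess is right about 94% of the time, and for a million appends the branch is only taken 16 times, which matches my intuition that accuracy improves as the vector grows.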

But how does the CPU pick whether to use a static, local dynamic, or global dynamic predictor here?

Is it just based on runtime analysis, or are there some intrinsics introduced by the programmer/compiler to hint at what should be used?

