I frequently see sample GLSL code like this:
vec4 sum = texture2D(texture, uv) * 4.0;
sum += texture2D(texture, uv - halfpixel.xy * offset);
sum += texture2D(texture, uv + halfpixel.xy * offset);
sum += texture2D(texture, uv + vec2(halfpixel.x, -halfpixel.y) * offset);
sum += texture2D(texture, uv - vec2(halfpixel.x, -halfpixel.y) * offset);
Wouldn't this introduce a dependency chain in the compiled code? Does it matter even if it does?
Edit:
I hope I summarize this correctly. Modern CPUs and GPUs have schedulers that can issue certain independent instructions during the same clock cycle (such as ADD or FADD on Intel). This is only possible if the operations are not part of a dependency chain, where each operation needs the result of the previous one and so has to wait for it.
This is an example of a dependency chain, where each addition needs to be calculated before the scheduler can move on to the next statement:
sum2 = a + b;
sum2 += c; // needs to know a+b
sum2 += d; // needs to know sum2+c
sum2 += e; // needs to know sum2+d
sum2 += f; // needs to know sum2+e
This code is equivalent (though it requires at least two additional accumulators, and with floats it might give a different answer than the code above), but there is no dependency chain until the last line, which contains two dependent additions. So on a CPU/GPU that allows it, the first three lines can be scheduled to execute simultaneously:
a1 = a + b;
a2 = c + d;
a3 = e + f;
sum1 = a1 + a2 + a3;
In the case of the code at the top of my question, I don't know if it matters. If texture look-ups cannot be done in parallel, then probably not. If they can be, then maybe? This is why I'm asking.
Let's consider your addition example:
sum2 = a + b;
sum2 += c; // needs to know a+b
Now, a question should be asked: what "needs to know a+b"? Because this expression contains several steps. Yes, it does a += operation with sum2, but before it can do that, it must first evaluate the expressions c and sum2.
Evaluating c is basically a no-op from the hardware's perspective, but that doesn't matter. After all, the statement c += sum2; would be just as dependent on a+b. This is because evaluating sum2 requires having first evaluated all of the expressions used to generate sum2 at that point in the program.
It is the evaluation of sum2 that makes the statement dependent, not the evaluation of c. Technically, if you were doing a pure assignment to sum2 (=) rather than +=, it wouldn't be dependent (since it's overwriting the previous value), so evaluation alone doesn't create a dependency, but never mind that now.
The point is, when we look at all of those texture expressions, what we can see is that the texture2D calls are independent of the summation operations and of each other. As such, the implementation is free to evaluate them in any order it wants.
Does that matter? No; texture fetches are not cheap, and reordering these things won't make them any less expensive. The cost of four sequential vector additions is the least of your concerns.
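To make the independence of the fetches visible in the source, the snippet from the question could be written so that every fetch lands in its own variable before any addition happens. This is only a sketch: the function name sampleTaps and the parameter name tex are made up for illustration, while uv, halfpixel and offset mirror the names in the question. A compiler is free to produce the same schedule from either form, so treat this as a readability aid rather than an optimization.

```glsl
// Hypothetical helper (name and signature are assumptions, not from the question).
vec4 sampleTaps(sampler2D tex, vec2 uv, vec2 halfpixel, float offset)
{
    // Five fetches; none of them reads the result of another,
    // so the implementation may issue them in any order.
    vec4 center = texture2D(tex, uv);
    vec4 s0 = texture2D(tex, uv - halfpixel.xy * offset);
    vec4 s1 = texture2D(tex, uv + halfpixel.xy * offset);
    vec4 s2 = texture2D(tex, uv + vec2(halfpixel.x, -halfpixel.y) * offset);
    vec4 s3 = texture2D(tex, uv - vec2(halfpixel.x, -halfpixel.y) * offset);

    // Only the additions depend on one another. Pairing them shortens
    // the chain, at the price of possibly different float rounding
    // compared with adding the samples one by one.
    return center * 4.0 + ((s0 + s1) + (s2 + s3));
}
```

Whether the pairing changes anything measurable depends on the driver and hardware; the fetches are expected to dominate the cost in both versions.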