GLSL clamps indices on array access

In GLSL I iterate over a float buffer that contains the coordinates and a couple of properties of the elements to render. I was curious about shaders (I don't have much experience with them) and wanted to obsessively optimise this one. Looking at how it gets compiled (I'm using WebGL + Spector.js), I noticed that every array access inside the loop has its index clamped to the bounds of the buffer.

I understand this is heavily machine-dependent and is done to prevent out-of-bounds accesses, but is there a way to avoid these checks, or to guarantee to the compiler that out-of-bounds accesses can't happen (e.g. by adding a condition to the loop)? I'm mostly curious about the (small) performance impact of these operations: 6 clamps, plus int-to-float and float-to-int casts, per iteration!

Any other optimisation tips are welcome; I'm very new to this and super interested in it. I thought about passing the data as an array of vec3s instead, to reduce the array accesses to 2 per element. I'm not sure whether that would actually improve things, though!

Original code:

// maxRelevantIndex is a uniform, elementSize is a const = 6, elements is of size 60
for (int i = 0; i < maxRelevantIndex; i += elementSize) {
    float x1 = elements[i];
    float y1 = elements[i + 1];
    float x2 = elements[i + 2];
    float y2 = elements[i + 3];
    float br = elements[i + 4];
    float color = elements[i + 5];
    ...
}

Decompiled:

for (int _uAV = 0;
(_uAV < _uU);
(_uAV += 6)) {
    float _uk = _uV[int(clamp(float(_uAV), 0.0, 59.0))];
    float _ul = _uV[int(clamp(float((_uAV + 1)), 0.0, 59.0))];
    float _um = _uV[int(clamp(float((_uAV + 2)), 0.0, 59.0))];
    float _un = _uV[int(clamp(float((_uAV + 3)), 0.0, 59.0))];
    float _uAW = _uV[int(clamp(float((_uAV + 4)), 0.0, 59.0))];
    float _uAC = _uV[int(clamp(float((_uAV + 5)), 0.0, 59.0))];
    ...
}

Update/Attempts:

I've tried both changing the indexing to use uints and adding && i < 60 to the loop condition; neither got rid of the clamps and casts. I did end up converting my data to an array of vec4s, which gave a small performance improvement, although I don't know whether that's due to the avoided clamps/casts or simply due to fewer array lookups.
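For reference, here is a minimal sketch of what the vec4 layout could look like, assuming GLSL ES 3.00 (WebGL 2) and two vec4s per element with the last two components left unused. The names MAX_ELEMENTS, packedElements and elementCount are hypothetical and not from my real shader, and the loop body is only a placeholder. I would expect the translator to still clamp each dynamic index, so this should bring it down to two clamped lookups per element rather than six, but I haven't verified that on every platform.

```glsl
#version 300 es
precision highp float;

// Hypothetical names for illustration: MAX_ELEMENTS, packedElements, elementCount.
const int MAX_ELEMENTS = 10; // 60 floats / 6 floats per element

// Two vec4s per element:
//   packedElements[2 * i]     = (x1, y1, x2, y2)
//   packedElements[2 * i + 1] = (br, color, unused, unused)
uniform vec4 packedElements[MAX_ELEMENTS * 2];
uniform int elementCount; // number of elements actually in use

out vec4 fragColor;

void main() {
    float acc = 0.0;

    for (int i = 0; i < elementCount; ++i) {
        // Two dynamically indexed lookups per element instead of six.
        vec4 coords = packedElements[2 * i];
        vec4 props  = packedElements[2 * i + 1];

        float x1 = coords.x;
        float y1 = coords.y;
        float x2 = coords.z;
        float y2 = coords.w;
        float br = props.x;
        float color = props.y;

        // Placeholder for the real per-element work.
        acc += br * color * (abs(x2 - x1) + abs(y2 - y1));
    }

    fragColor = vec4(vec3(acc), 1.0);
}
```

The trade-off is that the buffer grows from 60 to 80 floats because of the two padding components per element. Packing three vec4s per two elements would avoid the padding, at the cost of more awkward indexing.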
