I am writing a compute shader that runs some function for every pixel in a framebuffer. To simplify this question, I'm using a workgroup size of 1.
@compute
@workgroup_size(1, 1)
fn main(@builtin(global_invocation_id) gid: vec3u) { ... }
Dispatching via pass.dispatch_workgroups(screen_width, screen_height, 1).
That means my main function is called with gid = (x, y, 0) for all 0 <= x < screen_width and 0 <= y < screen_height. Specifically, the bottom right pixel will have gid = (screen_width - 1, screen_height - 1, 0), right?
I have a depth buffer from a previous pass. I want to convert those depth values to world coordinates. And for that I need to convert gid into clip space coordinates. Here is what I want to do:
let depth = textureLoad(depth_buffer, gid.xy, 0);
let clip_space = ?????;
let world_space_hc = inverse_view_proj_matrix * vec4(clip_space, depth, 1.0);
let world_space = world_space_hc.xyz / world_space_hc.w;
What to write in place of ???? is my question! My naive approach was:
let uv = vec2f(gid.xy) / screen_size;
let clip_space = vec2(uv.x, 1.0 - uv.y) * 2.0 - 1.0;
The second line is correct, I think. And this seems to work. But I am worried that this is slightly incorrect. In particular, the bottom right pixel with gid.xy = (screen_width - 1, screen_height - 1) will NOT map to the clip space corner (1, -1) with this algorithm. And that seems wrong.
I've been reading the WebGPU spec (in particular the coordinate system and rasterization parts), but I'm still not 100% clear. I think that corner pixels "look along" edges of the view frustum exactly. This would mean the correct conversion is:
let uv = vec2f(gid.xy) / (screen_size - vec2f(1.0));
Is that correct?
A third option I could see is:
let uv = (vec2f(gid.xy) + vec2f(0.5)) / screen_size;
I'd greatly appreciate an answer finally explaining this to me.
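I think it's easier to visualize texture coordinates (and clip space coordinates) with a small example. Imagine you have a 3x2 texture (or canvas).
These are the clip space coordinates, assuming your viewport setting matches the texture size (the default): x runs from -1 at the left edge of the texture to +1 at the right edge, and y runs from -1 at the bottom edge to +1 at the top edge.
Looking further, the coordinates of each texel refer to its center, which sits half a texel in from the texel's edges.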
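As a quick check of the 3x2 example, here is a small sketch of the pixel-center mapping. The helper name pixel_to_ndc is made up for illustration, and the Y flip assumes you want row 0 at the top, which is how WebGPU maps clip space to the framebuffer.

```wgsl
// Hypothetical helper (name is an assumption, not part of WGSL or the spec):
// maps an integer pixel index to the NDC position of that pixel's center.
// `size` is the texture size in pixels.
fn pixel_to_ndc(pixel: vec2u, size: vec2f) -> vec2f {
    // Pixel centers sit at integer + 0.5 in framebuffer coordinates.
    let uv = (vec2f(pixel) + vec2f(0.5)) / size;
    // Flip Y: framebuffer Y grows downward, NDC Y grows upward.
    return vec2f(uv.x, 1.0 - uv.y) * 2.0 - 1.0;
}

// For size = vec2f(3.0, 2.0):
//   pixel (0, 0) -> (-2/3,  0.5)   top-left texel
//   pixel (1, 0) -> ( 0.0,  0.5)
//   pixel (2, 1) -> ( 2/3, -0.5)   bottom-right texel
```

Note that no texel center lands on -1 or +1. Those values are the outer edges of the texture, not positions of corner pixels.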
(Compare the spec on rasterization, which specifies the pixel centers as the relevant points.)
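So, assuming your depth texture covers the entire clip space, use the pixel centers: add 0.5 to gid.xy before dividing by screen_size, which is the third option from the question.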
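A minimal sketch of how that could look in the shader from the question. The binding slots and the way screen_size is obtained are assumptions for illustration; the question does not show them.

```wgsl
// Assumed bindings; group/binding numbers are placeholders.
@group(0) @binding(0) var depth_buffer: texture_depth_2d;
@group(0) @binding(1) var<uniform> inverse_view_proj_matrix: mat4x4f;

@compute @workgroup_size(1, 1)
fn main(@builtin(global_invocation_id) gid: vec3u) {
    // Assumption: the depth buffer has the same size as the framebuffer.
    let screen_size = vec2f(textureDimensions(depth_buffer));

    let depth = textureLoad(depth_buffer, gid.xy, 0);

    // Use the pixel center (gid + 0.5), not the pixel's top-left corner.
    let uv = (vec2f(gid.xy) + vec2f(0.5)) / screen_size;
    // Y is flipped here, as in the question, so that row 0 is the top of the image.
    let clip_space = vec2f(uv.x, 1.0 - uv.y) * 2.0 - 1.0;

    let world_space_hc = inverse_view_proj_matrix * vec4f(clip_space, depth, 1.0);
    let world_space = world_space_hc.xyz / world_space_hc.w;
    // ... use world_space ...
}
```

With this mapping the corner pixels land half a pixel inside the clip space boundary, which is expected: the boundary is the outer edge of those pixels, not their centers.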
Whether or not you flip Y is up to you. There's no inherent mapping from invocation ids to anything.