Is there false-sharing when writing to unique elements of an output array?

Is there false sharing on sum[] in the following snippet, which computes the row-wise sums of a sparse CSR matrix? Independent threads update distinct elements of the array, but those elements could be mapped to the same cache line.

If yes, how can we avoid it, assuming sum[] is pre-allocated and cannot be redefined so that its elements map to distinct cache lines?

#pragma omp parallel for
for (int i = 0; i < N; i++) {
    float row_sum = 0.;
    for (int k = rowOffsets[i]; k < rowOffsets[i+1]; k++) {
        row_sum += values[k];
    }
    sum[i] = row_sum;
}

PierU On

This is precisely a case of potential false sharing. However, it is really not bad in practice: 1) there is only a single write to sum[] at the end of each outer iteration, and 2) the default OpenMP scheduling (static in common implementations) gives each thread one large contiguous chunk of iterations, so cache-line conflicts can occur only near the boundaries between chunks.
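If you would rather not depend on the default (which the OpenMP specification leaves implementation-defined), one option is to request a static schedule explicitly, with a chunk size that is a multiple of the number of floats per cache line. The following is only a sketch: the function name csr_row_sums, the FLOATS_PER_LINE constant, and the factor 64 are illustrative choices, and the 64-byte cache line is an assumption about the target CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines, i.e. 16 floats per line. */
#define FLOATS_PER_LINE 16

/* Row-wise sums of a CSR matrix (hypothetical helper, not from the question).
 * Each thread is handed chunks of 64 * FLOATS_PER_LINE consecutive rows, so
 * two threads can touch the same cache line of sum[] only at chunk boundaries. */
void csr_row_sums(int N, const int *rowOffsets, const float *values, float *sum)
{
    assert(N >= 0 && rowOffsets != NULL && sum != NULL);

    #pragma omp parallel for schedule(static, 64 * FLOATS_PER_LINE)
    for (int i = 0; i < N; i++) {
        float row_sum = 0.f;   /* private accumulator, kept off the shared array */
        for (int k = rowOffsets[i]; k < rowOffsets[i + 1]; k++) {
            row_sum += values[k];
        }
        sum[i] = row_sum;      /* the single shared write for row i */
    }
}
```

Chunk boundaries coincide with cache-line boundaries only if sum[] itself starts on a cache-line boundary. If it does not, each boundary line is shared by at most two threads, which is usually negligible next to the work done per chunk.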

You could reduce the false sharing further by delaying the writes to sum[] until each thread has finished its share of the loop:

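// Needs <stdlib.h> for malloc/free.
for (int i = 0; i < N; i++) sum[i] = 0.f;
#pragma omp parallel
{
    // Thread-private buffer covering all N rows; each thread fills only its own rows.
    float* tmp = (float*)malloc(N*sizeof(float));
    for (int i = 0; i < N; i++) tmp[i] = 0.f;
    #pragma omp for
    for (int i = 0; i < N; i++) {
        for (int k = rowOffsets[i]; k < rowOffsets[i+1]; k++) {
            tmp[i] += values[k];
        }
    }
    // Merge the private buffers into sum[], one thread at a time.
    #pragma omp critical
    for (int i = 0; i < N; i++) sum[i] += tmp[i];
    free(tmp);
}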

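But frankly, you don't need this kind of complication in the above case: every thread allocates and then merges a full N-element buffer, which costs more than the few boundary conflicts it avoids. It can help in some other cases, though.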