I am running a Monte Carlo experiment that evaluates the absolute loss function described below. Since it is very computationally intensive, I would like to optimise my code and improve its speed further. My main code is in MATLAB, but I evaluate the function in C through MATLAB's MEX interface.
The mathematical problem is as follows: I have a matrix D with dimensions (M times N). Usually M is around 20,000 and N takes values around {10, 30, 144}.
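Effectively, I need to obtain a column vector L with dimensions (M times 1) defined as
$$L_m = \frac{1}{M} \sum_{i=1}^{M} \sum_{n=1}^{N} \left| D_{i,n} - D_{m,n} \right|, \qquad m = 1, \dots, M.$$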
My C function looks like this:
void absolute_loss(double *D, double *L, mwSize cols, mwSize rows)
{
    double aux;
    int i;
    int j;
    int k;
    for (i = 0; i < rows; i++) {
        for (j = 0; j < rows; j++) {
            aux = 0;
            for (k = 0; k < cols; k++) {
                aux = aux + fabs(D[j + rows * k] - D[i + rows * k]);
            }
            L[i] = L[i] + aux;
        }
    }
    for (i = 0; i < rows; i++) {
        L[i] /= rows;
    }
}
Enable compiler optimizations. @Jesper Juhl

If able, use float types and float functions. Sometimes up to 4x faster. For me, 8% faster.

Use restrict to let the compiler know the referenced data does not overlap. Otherwise the compiler must assume L[i] = ...; may change D[], and that prevents some optimizations.

For referenced data, use const where able.

Use consistent indexing types.

Change index increment. @DevSolar

Index type: for me, size_t and unsigned were about the same. unsigned short was 5% faster.

Notes:

I'd expect the following, or the like, at the start of the function.

Tip: rather than rows, cols, i, j, use M, N, m, n to match the formula. I am not sure you have it right.

Candidate re-write that takes advantage of variable length arrays and sample usage:
My time: 4.906 seconds.
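To illustrate how several of the tips above might combine, here is a minimal sketch. It is not the code timed above. It assumes C99, keeps double (as MATLAB passes it) rather than float, uses flat pointers rather than variable length arrays, zeroes L on entry instead of relying on the caller, and reorders the loops so the innermost loop walks one contiguous column, which is only one possible reading of the index-increment tip. The parameter order (M, N, D, L) is my own choice.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Sketch only: D is column-major (MATLAB layout), so element (m, n)
 * is D[m + M * n]. M = number of rows, N = number of columns.
 * restrict promises the compiler that D and L do not overlap. */
void absolute_loss(size_t M, size_t N,
                   const double *restrict D, double *restrict L)
{
    assert(M > 0);                       /* avoid dividing by zero below */

    for (size_t m = 0; m < M; m++) {
        L[m] = 0.0;                      /* do not rely on the caller */
    }

    for (size_t n = 0; n < N; n++) {
        const double *col = D + n * M;   /* column n is contiguous */
        for (size_t m = 0; m < M; m++) {
            const double dm = col[m];
            double sum = 0.0;
            for (size_t i = 0; i < M; i++) {
                sum += fabs(col[i] - dm);    /* unit-stride access */
            }
            L[m] += sum;
        }
    }

    for (size_t m = 0; m < M; m++) {
        L[m] /= (double)M;
    }
}
```

Because the summation order differs from the original triple loop, results may differ in the last few bits. Any speed-up is compiler and machine dependent, so measure with optimizations enabled before adopting it.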