I frequently come up with a problem, where I want to modify, say, a column in an array based on information on another array, but I don't know how to do these sort of problems, efficiently avoiding for loops.
To give an example, here is a reproducible R code for a maximization and mass transfer problem using 3D arrays. The code works fine, but the nested loop at the bottom is too slow for my purposes.
The question is, whether there is a way to improve its speed, e.g., by using mapply. I haven't been able to do so this far.
# Initialize parameters
N = 11
I = 3
P = 10
# Initialize some random-ish inputs
prob = runif(P)
prob = prob/sum(prob)
alpha = array(0, c(N,I))
for (i in 1:I) {
alpha[, i] = (1-seq(0, 1, 1/(N-1)))/i
}
f0 = array(runif(N*I), c(N,I))
f0 = f0/(rep(1,N) %*% t(colSums(f0)))
obj = array(runif(N*N*I*P), c(N,N,I,P))
# Initialize the arrays needed in the solution
maxmat = array(0, c(N, I,P))
inds = array(0, c(I*N, 3, P))
p0 = array(0, c(N,I))
arr = array(0, c(N,I))
# Solve a maximization problem
maxmat = apply(obj, c(1,3,4), function(x) which.max(x))
# How do I improve the performance of the below code?
for (i in 1:I) {
for (p in 1:P) {
p_tmp = 0
p_tmp = prob[p]*rowsum(f0[,i], maxmat[, i,p])
p0[as.numeric(attributes(p_tmp)$dimnames[[1]]),i] = p0[as.numeric(attributes(p_tmp)$dimnames[[1]]),i]+ p_tmp[,1]
arr[,i] = arr[,i] + prob[p]*alpha[maxmat[,i, p],i]*f0[,i]
}
}
This loop can be fully vectorized using
outerandrowSums(with thedimsargument) and adjusting themaxmatindices to correspond withi. The vectorized solution is ~18 times faster.Check that the results are the same:
Benchmark with
N = 110,I = 30, andP = 100:A little explanation about
c(fp, numeric(N*I))andc(mm, 1:(I*N)): appendingN*Izeros (whose groups are1:(I*N)) to the end offpensures thatrowsumreturns a vector of lengthN*Iso we don't have to initializep0or mess withdimnames.