R big.matrix operators


I am after a little help finding the position of a particular value within a big.matrix object. Take the following matrix:

library(bigmemory)
X <- as.big.matrix(matrix(1:30, 10, 3))

I want to find the row and column number where a particular value occurs, say find_numb = 13 (this will later be used to subset another matrix). On a standard matrix I do:

as.matrix(X) == find_numb #Convert big.matrix and find location of value

This returns a TRUE/FALSE matrix, which is great.

Now, when I do the same directly on the big.matrix, i.e. X == find_numb, I get the following error:

Error in X == find_numb : 
  comparison (1) is possible only for atomic and list types

This seems like a simple problem, but I do not fully understand the error (I am still learning R and programming in general), so I apologize for not understanding the atomic and list type definitions well enough to solve this myself.

The above is of course a simplified example: the actual matrix is around 500 GB (hence big.matrix), and I want to search for a vector of different numbers and find their individual locations, e.g.

find_numb <- sample(1:10000, 2000)

I have the mcapply function drafted to do this; I am just stuck on this issue when trying to find the initial locations of each value.

Thank you for any help and guidance.

Best answer, by F. Privé:

With the R package {bigstatsr}, you can do:

library(bigstatsr)
X <- as_FBM(matrix(sample(30, 5000, replace = TRUE), 50, 100))

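# tutorial for big_apply(): https://privefl.github.io/bigstatsr/articles/big-apply.html
# X is processed by blocks of columns; `ind` holds the column indices of the current block
test <- big_apply(X, function(X, ind) {
  res <- which(X[, ind, drop = FALSE] == 13, arr.ind = TRUE)
  res[, 2] <- ind[res[, 2]]  # map block-local column positions back to columns of X
  res
}, a.combine = "rbind")

# Check against the in-memory result (X[] loads the whole matrix, so only do this on small data)
test2 <- which(X[] == 13, arr.ind = TRUE)
all.equal(test, test2)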
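
The question asks for the locations of a whole vector of values rather than a single one. The same block-wise pattern can probably be adapted with %in%, as in the minimal sketch below. The values in find_numb and the extra `vals` argument are illustrative assumptions, not something from the question's real data; `vals` is passed to the function through the `...` argument of big_apply(). Because %in% drops the matrix dimensions, arrayInd() is used to turn linear positions back into row/column pairs.

```r
library(bigstatsr)

set.seed(1)
X <- as_FBM(matrix(sample(30, 5000, replace = TRUE), 50, 100))
find_numb <- c(3, 13, 27)  # hypothetical values to look up

hits <- big_apply(X, function(X, ind, vals) {
  block <- X[, ind, drop = FALSE]        # ordinary matrix holding the current column block
  pos <- which(block %in% vals)          # linear positions within the block
  idx <- arrayInd(pos, dim(block))       # convert to (row, block-local column)
  # map block-local columns back to columns of X and keep the matched value
  cbind(row = idx[, 1], col = ind[idx[, 2]], value = block[pos])
}, a.combine = "rbind", vals = find_numb)

head(hits)
```

Keeping the matched value in a third column means the result can be split per searched number afterwards, e.g. with split(as.data.frame(hits), hits[, "value"]).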

Disclaimer: I'm the author of the package.
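
If converting the data to an FBM is not an option, a similar block-wise search can be sketched with {bigmemory} alone. The error in the question arises because X is not an ordinary atomic matrix, so == cannot be applied to it directly, whereas subsetting with X[, cols] returns a regular in-memory matrix on which the comparison works. The sketch below reads a few columns at a time; the `block_size` variable and the values in find_numb are hypothetical choices for illustration and would need tuning for a 500 GB matrix.

```r
library(bigmemory)

X <- as.big.matrix(matrix(1:30, 10, 3))
find_numb <- c(13, 27)  # hypothetical values to look up
block_size <- 2         # hypothetical: number of columns read per chunk

starts <- seq(1, ncol(X), by = block_size)
hits <- do.call(rbind, lapply(starts, function(start) {
  cols <- start:min(start + block_size - 1, ncol(X))
  block <- X[, cols, drop = FALSE]     # regular matrix for this chunk only
  pos <- which(block %in% find_numb)   # linear positions within the chunk
  idx <- arrayInd(pos, dim(block))     # convert to (row, chunk-local column)
  cbind(row = idx[, 1], col = cols[idx[, 2]], value = block[pos])
}))

hits
```

On this toy matrix the result has two rows: 13 at row 3, column 2 and 27 at row 7, column 3. Since each chunk is independent, the lapply() call is also the natural place to swap in a parallel apply function.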