How to compute distance.matrix for the spatialRF::rf_spatial function


I am using the package spatialRF in R to perform a regression task. In the example provided with the package, the authors have precomputed the distance.matrix and pass it to the function spatialRF::rf. Here is an example:

library(spatialRF)

#loading training data
data(block.data)

#names of the response variable and the predictors
dependent.variable.name <- "ntl"
predictor.variable.names <- colnames(block.data)[2:4]

#coordinates of the cases
xy <- block.data[, c("x", "y")]

#distance matrix
distance.matrix <- dist(subset(block.data, select = -c(x, y)))

#random seed for reproducibility
random.seed <- 1

model.non.spatial <- spatialRF::rf(
  data = block.data,
  dependent.variable.name = dependent.variable.name,
  predictor.variable.names = predictor.variable.names,
  distance.matrix = distance.matrix,
  distance.thresholds = 0,
  xy = xy,
  seed = random.seed,
  verbose = FALSE)

When running the spatialRF::rf function I get this error: Error in `diag<-`(`*tmp*`, value = NA) : only matrix diagonals can be replaced

My dataset:

block.data = structure(list(ntl = c(11.4058170318604, 13.7000455856323, 16.0420398712158, 
17.4475727081299, 26.263370513916, 30.658130645752, 19.8927211761475, 
20.917688369751, 23.7149887084961, 25.2641334533691), pop = c(111.031448364258, 
145.096557617188, 166.351989746094, 193.804962158203, 331.787200927734, 
382.979248046875, 237.971466064453, 276.575958251953, 334.015289306641, 
345.376617431641), tirs = c(35.392936706543, 34.4172630310059, 
33.7765464782715, 35.3224639892578, 40.4262886047363, 39.6619148254395, 
38.6306076049805, 36.752326965332, 37.2010040283203, 36.1100578308105
), agbh = c(1.15364360809326, 0.177780777215958, 0.580717206001282, 
0.647109687328339, 3.84336423873901, 5.6310133934021, 2.10894227027893, 
3.9533429145813, 2.7016019821167, 4.36041164398193), lc = c(40L, 
40L, 40L, 126L, 50L, 50L, 50L, 50L, 40L, 50L)), class = "data.frame", row.names = c(NA, 
-10L))

For reference, in the example in the link I provided, the distance matrix and the dataset the authors use are the same.

Answer by Nikos

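The solution is to convert the output of dist() to a full matrix with as.matrix(). dist() returns an object of class "dist" rather than a matrix, which is why replacing its diagonal fails with the error above: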

wd = "path/"

block.data = read.csv(paste0(wd, "block.data.csv"))

#names of the response variable and the predictors
dependent.variable.name <- "ntl"
predictor.variable.names <- colnames(block.data)[4:7]

#coordinates of the cases
xy <- block.data[, c("x", "y")]

block.data$x <- NULL
block.data$y <- NULL

#distance matrix
distance.matrix <- as.matrix(dist(block.data))
# inspect the range of distances to help choose the thresholds below
# (the minimum is always 0 because the diagonal holds each case's distance to itself)
min(distance.matrix)
max(distance.matrix)

#distance thresholds (same units as distance.matrix)
distance.thresholds <- c(0, 20, 30, 50, 70)
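
To see why the conversion matters, here is a minimal base-R sketch. The coordinates are hypothetical (they are not taken from the question's data). The sketch also assumes that the matrix is meant to hold spatial distances between the cases, which is why it is built from the x and y columns rather than from the predictor values. That assumption should be checked against the spatialRF documentation for your use case.

```r
# Hypothetical projected coordinates for five cases (illustration only)
xy <- data.frame(
  x = c(0, 3, 6, 0, 3),
  y = c(0, 4, 8, 10, 0)
)

# dist() returns a compact object of class "dist", not a matrix
d <- dist(xy)
class(d)
is.matrix(d)

# Replacing the diagonal of a non-matrix fails, which reproduces
# the kind of error reported in the question
msg <- tryCatch(
  {
    diag(d) <- NA
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

# as.matrix() expands the "dist" object into a full symmetric
# n x n matrix with zeros on the diagonal
distance.matrix <- as.matrix(d)
is.matrix(distance.matrix)
dim(distance.matrix)

# The diagonal is always 0, so use the off-diagonal values
# when looking for sensible distance thresholds
off.diagonal <- distance.matrix[upper.tri(distance.matrix)]
range(off.diagonal)
```

The number of rows and columns of the resulting matrix equals the number of cases, so it lines up row by row with the data frame passed as data.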