How to perform a correlation test in a loop, while removing a row from the dataframe in every iteration?

I am trying to remove one row from my data frame in every iteration of a for loop and run a correlation test on the reduced data frame. However, I am not getting the result I expect. Please help. Each row of the data frame below holds one individual's values for the variables named in the columns.

rnpo <- data.frame(h.move.ten = c(25.85, 51.375, 26.007, 35.249, 30.841),
                   move.ten = c(3.231, 0.000, 4.334, 4.745, 0.000),
                   reor.ten = c(0.000, 3.626, 1.181, 2.027, 2.457),
                   hbob.ten = c(3.398, 17.934, 7.050, 1.075, 0.943))

store.cor <- numeric(nrow(rnpo))

for (i in 1:nrow(rnpo)) {
  droprow <- rnpo[-i,]
  store.cor[i] <- cor(droprow)
}

This is the code that I am trying to use.

Alternatively, I am trying to use:

store.cor <- numeric(nrow(rnpo))
data.ind <- 1:nrow(rnpo) 
store.cor <- sapply(data.ind, function(x) cor(rnpo[-x]))
calc.cor <- function(x,vec) {
  cor(vec[-x])
}
store.cor <- sapply(data.ind, calc.cor, vec=rnpo)
store.cor

Here, a column is dropped in every iteration instead of a row. How can I fix this?

Answer by L Tyrone

Here is a generic solution on made-up example data that you can adapt to your needs. This approach creates a copy of the dataset, removes one row from that copy on each pass, and collects the result of every cor() call in an object that starts out as an empty list:

# set.seed() is only needed to make this example reproducible;
# you won't need it for your own dataset
set.seed(1)

# Create example dataframe
rnpo <- data.frame(a = 1:20,
                   b = sample(1:20, 20))

# Create copy of your data to remove rows from
droprow <- rnpo
# Create empty list to store results 
store.cor <- list()

# Use while() to iterate over "droprow" and perform cor()
while(nrow(droprow) > 1) {
  
  store.cor <- rbind(store.cor, cor(droprow))
  droprow <- droprow[-1,]
  
}

head(store.cor)
  a         b        
a 1         0.3383459
b 0.3383459 1        
a 1         0.2690311
b 0.2690311 1        
a 1         0.2281195
b 0.2281195 1

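Note: I placed the line that drops a row last in the while() loop so that cor() is also run on the full dataset. You can switch the order if that is not what you want. Rows are removed cumulatively, so each pass works on a dataset one row smaller than the previous one, and each pass appends one 2 x 2 block to "store.cor". rbind() keeps the row names a and b unchanged for every block, as the head() output shows, so identify each calculation by its position: rows 1 and 2 are the full dataset, rows 3 and 4 are the result after the first row is dropped, and so on.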
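
The while() loop above removes rows cumulatively. If the goal is instead to leave out exactly one row per iteration, as the for loop in the question attempts, the following is a minimal sketch using the question's data. It relies on the fact that indexing a data frame with a single index and no comma, as in rnpo[-x], selects columns, whereas rnpo[-x, ] selects rows. A full cor() matrix also does not fit into one element of a numeric vector, so the matrices are kept in a list. The names loo.cor, loo.pair and loo.p are made up for this illustration, and the column pair used for the single-pair examples is an arbitrary choice.

```r
# Example data from the question
rnpo <- data.frame(
  h.move.ten = c(25.85, 51.375, 26.007, 35.249, 30.841),
  move.ten   = c(3.231, 0.000, 4.334, 4.745, 0.000),
  reor.ten   = c(0.000, 3.626, 1.181, 2.027, 2.457),
  hbob.ten   = c(3.398, 17.934, 7.050, 1.075, 0.943)
)

# No comma: the data frame is indexed like a list, so a COLUMN is dropped.
# With a comma: the first index refers to rows, so a ROW is dropped.
dim(rnpo[-1])    # 5 rows, 3 columns
dim(rnpo[-1, ])  # 4 rows, 4 columns

rows <- seq_len(nrow(rnpo))

# One full correlation matrix per left-out row, kept in a list
# (loo.cor is a hypothetical name)
loo.cor <- lapply(rows, function(i) cor(rnpo[-i, ]))
names(loo.cor) <- paste0("without.row.", rows)

# For a single pair of columns, one number per iteration is enough
# (the pair move.ten / reor.ten is an arbitrary choice)
loo.pair <- vapply(rows, function(i) {
  cor(rnpo[-i, "move.ten"], rnpo[-i, "reor.ten"])
}, numeric(1))

# cor.test() takes two vectors; only the p-value is kept here
loo.p <- vapply(rows, function(i) {
  cor.test(rnpo[-i, "move.ten"], rnpo[-i, "reor.ten"])$p.value
}, numeric(1))
```

loo.cor[["without.row.2"]] then gives the correlation matrix computed without the second individual, and loo.pair[2] is the matching coefficient for the chosen pair. With only four rows left per iteration, the p-values from cor.test() rest on very little data and should be read with care.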