Loop/iterate over many columns in R for Cox proportional hazards regression


I have a large number of events to analyse, each in a separate column as shown in my example below. I would like to fit a model for each column labelled IDXX, using a for loop in R.

However, the code always returns the error "Time and status are different lengths".

library(survival)
library(survminer)
data = read.csv("data.csv")

# dummy data.csv
#ID  Sex  Age ID1 ID2 ID3 ID4 Time Xfactor
#1    1    55   0   1   0   0   1     12
#2    2    56   0   0   0   0   2     13
#3    2    61   0   0   0   1   3      1
#4    2    62   0   0   1   0   4      3
#5    1    40   0   0   0   0   1      4
 
#   time = time to death
#   event/death = a value of 1 in the individual IDs. Each ID is to be investigated separately, i.e., a different model has to be generated for each ID.
list <- c("ID1", "ID2", "ID3", "ID4")
covariates <- c("Sex", "Age")

for (i in 1:4){
univ_formulas <- sapply(covariates,
                        function(x) as.formula(paste('Surv(Time, ', list[i], ')~', x)))
                        
univ_models <- lapply( univ_formulas, function(x){coxph(x, data = data)})

univ_results <- lapply(univ_models,
                       function(x){ 
                          x <- summary(x)
                          p.value<-signif(x$wald["pvalue"], digits=2)
                          wald.test<-signif(x$wald["test"], digits=2)
                          beta<-signif(x$coef[1], digits=2) # coefficient beta
                          HR <-signif(x$coef[2], digits=2) # exp(beta)
                          HR.confint.lower <- signif(x$conf.int[,"lower .95"], 2)
                          HR.confint.upper <- signif(x$conf.int[,"upper .95"],2)
                          HR <- paste0(HR, " (", 
                                       HR.confint.lower, "-", HR.confint.upper, ")")
                          res<-c(beta, HR, wald.test, p.value)
                          names(res)<-c("beta", "HR (95% CI for HR)", "wald.test", 
                                        "p.value")
                          return(res)
                         })
res <- t(as.data.frame(univ_results, check.names = FALSE))
print(as.data.frame(res))  # values are not auto-printed inside a for loop

res.cox <- coxph(Surv(Time, **list[i]**) ~ Sex + Age + Xfactor, data = data)

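# Inside a for loop, the summary is only shown if printed explicitly
survivalsummary <- summary(res.cox)
print(survivalsummary)

# paste0() avoids the space that paste() would insert before the extension
csvexport <- paste0(list[i], ".csv")
write.csv(survivalsummary$coefficients, csvexport)

plotoutput <- ggsurvplot(survfit(res.cox, data = data), data = data, palette = "#2E9FDF")
p_plot <- plotoutput$plot
pdfexport <- paste0(list[i], ".pdf")
# Pass the plot explicitly instead of relying on the last plot drawn
ggsave(pdfexport, plot = p_plot, device = "pdf")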
}

I suspect that the list[i] inside Surv() in the multivariable coxph() call is causing the problem. (I marked it with asterisks above to highlight it; the site doesn't render bold inside monospaced text, and the asterisks are not in my actual code.) How can I solve this? I've searched StackOverflow and other sites without success.
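
For reference, a likely cause is that list[i] evaluates to a length-one character string such as "ID1", so Surv() receives a status vector of length 1 alongside a Time vector with one entry per row. A minimal sketch of one possible workaround follows, building the multivariable formula from strings in the same way the univariate formulas above are built. The names event_cols, multi_covariates, build_formula and multi_formulas are hypothetical and not part of the original code; this is an assumption about what would resolve the error, not a tested fix for the full script.

```r
# Hypothetical names; column names are taken from the dummy data above
event_cols <- c("ID1", "ID2", "ID3", "ID4")
multi_covariates <- c("Sex", "Age", "Xfactor")

# Paste the column name into the formula text so that Surv() later sees
# the column itself rather than the character string "ID1"
build_formula <- function(event_col, covariates) {
  rhs <- paste(covariates, collapse = " + ")
  as.formula(paste0("Surv(Time, ", event_col, ") ~ ", rhs))
}

# One formula per event column, named by column
multi_formulas <- lapply(event_cols, build_formula, covariates = multi_covariates)
names(multi_formulas) <- event_cols

print(multi_formulas[["ID2"]])
```

Inside the loop, the model call would then become coxph(multi_formulas[[list[i]]], data = data). Another option, under the same assumption, is to index the column directly with Surv(data$Time, data[[list[i]]]).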
