I need to perform a Mann-Whitney test across all genes in R. The input is a text file in which the first row contains the sample names, the second row contains the cohort labels (1 or 2), and all remaining rows contain the gene expression values. I need to do this with a function.
The final output should be a table of results with genes in the rows and the following columns: mean expression of cohort 1, mean expression of cohort 2, fold change (FC), and Mann-Whitney p-value.
This is what I tried with demo data, but it does not work. I get only G4 as a gene in the rows, and "NaN" instead of values in the columns for mean expression of cohort 1, mean expression of cohort 2, FC, and Mann-Whitney p-value:
data <- read.table(text = "
Cohort Gene S1 S2 S3 S4 S5
1 G1 1389 1097 1501 4630 2011
2 G2 1023 880 492 4411 1233
1 G3 2847 2717 2814 4145 5433
2 G4 20612 18123 17679 4099 8567
", header = TRUE)
#separate cohort 1 and 2
cohort1<-data[data$Cohort != "2", 1]
#head(cohort1)
cohort2<-data[data$Cohort != "1", 1]
geneNames <- data$Gene
row.names(data) <- data$Gene
df <- data.frame()
for (Gene in 1:length(geneNames)){
  if (sum(cohort1) | sum(cohort2) > 0){
    mwt <- wilcox.test(x = cohort1, y = cohort2, paired = T, exact = F, conf.int = F)
  } else if (sum(cohort1) | sum(cohort2) == 0){
    mwt <- data.frame("p.value" = NA, "conf.int" = NA)
  }
  table <- data.frame("Gene" = geneNames[Gene],"Mean_Cohort1" = mean(cohort1),
                      "Mean_Cohort2" = mean(cohort2),"FC" = mean(cohort1)/mean(cohort2), "MW_Pvalue" = mwt$p.value)
  output <- rbind(df, table)
}
Can anybody help me out with this?
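
For reference, here is a minimal sketch of how the function described above might look. It assumes the layout from the description (samples in columns, cohort labels in the second row), not the demo layout, where Cohort is attached to each gene. The name mw_by_gene, the tab separator, the row labels "Gene" and "Cohort" in the first column, and the cohort assignment of S1 to S5 in the demo file are all assumptions made for illustration. It differs from the attempt above in that paired is left at its default (FALSE), because paired = T runs a signed-rank test and not a Mann-Whitney test, and in that results are collected for every gene. In the loop above, output <- rbind(df, table) binds to the empty df each time, so only the last gene survives.

```r
# Hypothetical helper: per-gene Mann-Whitney test from a file whose
# first row holds sample names and second row holds cohort labels (1 or 2).
# Assumes the first column holds row labels (e.g. "Gene", "Cohort", "G1", ...).
mw_by_gene <- function(path, sep = "\t") {
  raw <- read.table(path, header = TRUE, sep = sep,
                    row.names = 1, check.names = FALSE)

  # First data row = cohort labels, one per sample column
  cohort <- as.integer(unlist(raw[1, ]))

  # Remaining rows = expression values, genes in rows
  expr <- as.matrix(raw[-1, , drop = FALSE])
  storage.mode(expr) <- "double"

  g1 <- expr[, cohort == 1, drop = FALSE]
  g2 <- expr[, cohort == 2, drop = FALSE]

  # Unpaired Wilcoxon rank-sum (Mann-Whitney) test per gene;
  # NA if the test cannot be run (e.g. a cohort has no samples)
  pvals <- vapply(seq_len(nrow(expr)), function(i) {
    tryCatch(
      wilcox.test(g1[i, ], g2[i, ], exact = FALSE)$p.value,
      error = function(e) NA_real_
    )
  }, numeric(1))

  m1 <- unname(rowMeans(g1))
  m2 <- unname(rowMeans(g2))

  data.frame(Gene = rownames(expr),
             Mean_Cohort1 = m1,
             Mean_Cohort2 = m2,
             FC = m1 / m2,
             MW_Pvalue = pvals,
             stringsAsFactors = FALSE)
}

# Demo file: expression values taken from the demo data above,
# cohort labels per sample are made up for illustration
demo_lines <- c(
  paste(c("Gene", "S1", "S2", "S3", "S4", "S5"), collapse = "\t"),
  paste(c("Cohort", 1, 1, 1, 2, 2), collapse = "\t"),
  paste(c("G1", 1389, 1097, 1501, 4630, 2011), collapse = "\t"),
  paste(c("G2", 1023, 880, 492, 4411, 1233), collapse = "\t"),
  paste(c("G3", 2847, 2717, 2814, 4145, 5433), collapse = "\t"),
  paste(c("G4", 20612, 18123, 17679, 4099, 8567), collapse = "\t")
)
demo_file <- tempfile(fileext = ".txt")
writeLines(demo_lines, demo_file)

res <- mw_by_gene(demo_file)
print(res)
```

With only a handful of samples per cohort the p-values cannot get very small, so the demo is only useful for checking that the table has the expected shape. If many genes are tested, the p-values would normally also be adjusted for multiple testing, for example with p.adjust.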