I'd like some feedback on a very simple script I wrote. It also fails with Error: unexpected '}' in "}", and I wonder if this is a Unicode issue? Any feedback would be appreciated!!
My data is here: joint.sexdiff.csv
I'm a really really new coder, and R is the one I want to use for future stats, so I'm constantly learning and appreciate all the feedback!
# Statistical test pipeline when trying to compare two groups ----
# Description: When trying to compare two groups, t-test is often used. Unpaired two-samples t-test can be used only under certain conditions:
# (1) when the two groups of samples (A and B), being compared, are normally distributed
# This can be checked using Shapiro-Wilk test
# (2) When the variances of the two groups are equal
# This can be checked using F-test
# Given a sample dataframe, a pipeline will be used to automatically determine whether a parametric or non-parametric test should be used, and get the p-value
# Packages to install and call
install.packages("dplyr")
install.packages("ggplot2")
library(dplyr)
library(ggplot2)
# Sample data named "joint.sexdiff.csv"
data1<-read.csv("*path*")
# Some descriptive stats
group_by(data1, Sex) %>%
  summarise(
    count = n(),
    mean = mean(JS.W, na.rm = TRUE)
  )
# Data visualization using ggplot2
bplot <- ggplot(data1, aes(x = Sex, y = JS.W, colour = Sex)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter() +
  ylab("Joint space width (mm)") +
  xlab("") +
  ggtitle("Male vs Female joint space width (JSW)")
bplot
# Now we have the data we need, we can run a two group comparison
# Use Shapiro-Wilk test to test if the joint space widths fall within a normal distribution, if p>0.05, we can assume normality
# If the distribution is not normal, we need to use a non-parametric test
# Non-parametric unpaired two-samples Wilcoxon test is used to compare two independent groups of samples that are not normally distributed
# http://www.sthda.com/english/wiki/unpaired-two-samples-wilcoxon-test-in-r
# The process is as follows:
# Shapiro test -> if normal distribution or p>0.05, do F-test -> if variances are equal or p>0.05, var.equal=TRUE will be assumed on the t-test
# -> if variances are unequal or p<0.05, var.equal=FALSE will be assumed
# IMPORTANT -- your t-test p-values will change depending on whether you assume the variances to be equal or not
# Shapiro -> if distribution is not normally distributed or p<0.05, use Wilcoxon test
# The test blocks of code will check your data and run an independent samples t-test or Wilcoxon test depending on whether the sample is normally distributed and the variances are equal
# Output = p.value
shapJSW<-shapiro.test(data1$JS.W)
if (shapJSW$p.value>0.05){
ftestJSW<-var.test(JS.W ~ Sex, data = data1)
if (ftestJSW$p.value>0.05){
ttestJSW<-t.test(JS.W ~ Sex, data = data1, var.equal = TRUE)
print(ttestJSW$p.value)
} else {
ttestJSW<-t.test(JS.W ~ Sex, data = data1, var.equal = FALSE)
print(ttestJSW$p.value)
} else {
wtestJSW<-wilcox.test(JS.W ~ Sex, data = data1)
print(wtestJSW)
wtestJSW<-wilcox.test(JS.W ~ Sex, data = data1)
print(wtestJSW)
}
}
Sprinkle some of your wisdom to me! :) <3
The error is a missing } after printing the p-value of the t-test with unequal variances: the inner if/else has to be closed before the outer else begins, and the extra } at the very end then has to go. The testing part can be rewritten like this:
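A minimal sketch of one way to do it, with the braces balanced and the pipeline wrapped in a function. The function name compare_groups and the simulated data frame demo are made up for illustration; the column names JS.W and Sex and the 0.05 thresholds are taken from the question. The Shapiro-Wilk test is run on the pooled values, as in the question's code.

```r
# Hypothetical helper: pick and run a two-group test, return its p-value.
# Assumes `data` has a numeric column JS.W and a two-level column Sex,
# as in the question.
compare_groups <- function(data) {
  shap <- shapiro.test(data$JS.W)
  if (shap$p.value > 0.05) {
    # Normality not rejected: the F-test decides between the
    # equal-variance t-test and the Welch t-test
    ftest <- var.test(JS.W ~ Sex, data = data)
    if (ftest$p.value > 0.05) {
      result <- t.test(JS.W ~ Sex, data = data, var.equal = TRUE)
    } else {
      result <- t.test(JS.W ~ Sex, data = data, var.equal = FALSE)
    }  # closes the inner if/else (the brace missing in the question)
  } else {
    # Normality rejected: fall back to the Wilcoxon rank-sum test
    result <- wilcox.test(JS.W ~ Sex, data = data)
  }
  result$p.value
}

# Simulated data for illustration only (not the joint.sexdiff.csv values)
set.seed(1)
demo <- data.frame(
  Sex  = rep(c("F", "M"), each = 20),
  JS.W = c(rnorm(20, mean = 4.0, sd = 0.5), rnorm(20, mean = 4.5, sd = 0.5))
)

p <- compare_groups(demo)
print(p)
```

With the real data, compare_groups(data1) should work the same way once data1 has been read in. Indenting each nested block as above makes an unclosed brace much easier to spot.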