I have a dataset in which each row is an observation and the columns are variables of mixed type (numeric and factor). I would like to cluster the variables themselves, not the observations, using the observations as the information that groups them.
I have tried the following, but transposing the data frame with t() coerces everything to a single type, so after the transpose I can no longer tell which variables were numeric and which were factors.
What is a good way to approach this problem?
Thanks
#install.packages("clustMixType")
library(clustMixType)
# Create a hypothetical dataset
set.seed(42) # Ensure reproducibility
hypothetical_dataset <- data.frame(
  scale(rnorm(100)), scale(rnorm(100)), scale(rnorm(100)),
  scale(rnorm(100)), scale(rnorm(100)),
  factor(sample(0:1, 100, replace = TRUE)),
  factor(sample(0:1, 100, replace = TRUE)),
  factor(sample(0:1, 100, replace = TRUE))
)
names(hypothetical_dataset) <- c(
  "Numeric_Var1", "Numeric_Var2", "Numeric_Var3",
  "Numeric_Var4", "Numeric_Var5",
  "Binary_Factor_Var1", "Binary_Factor_Var2", "Binary_Factor_Var3"
)
str(hypothetical_dataset)
# Transpose dataset, as I would like to use observations to group the predictors
transposed_dataset <- t(hypothetical_dataset)
transposed_dataset_df <- as.data.frame(transposed_dataset)
rownames(transposed_dataset_df) <- c(
  "Numeric_Var1", "Numeric_Var2", "Numeric_Var3",
  "Numeric_Var4", "Numeric_Var5", "Binary_Factor_Var1",
  "Binary_Factor_Var2", "Binary_Factor_Var3"
)
kproto(transposed_dataset_df, k = 6, diss = "gower")
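
One direction that might avoid the transpose altogether, sketched here under assumptions rather than as a confirmed answer: keep the data frame as it is, compute a pairwise association between every two columns with a measure suited to their types, and run hierarchical clustering on 1 minus that association. The helper names `pair_assoc` and `cluster_variables`, the small `mixed_df` example, and the choice of measures (absolute Pearson correlation, Cramér's V, correlation ratio) are all illustrative assumptions. None of this is part of clustMixType, and it uses base R only.

```r
# Sketch: cluster the COLUMNS of a mixed-type data frame without transposing it.
# All helper names below are hypothetical and exist only for this illustration.

# Association in [0, 1] between two columns, chosen by their types
pair_assoc <- function(x, y) {
  if (is.numeric(x) && is.numeric(y)) {
    abs(cor(x, y))                                    # absolute Pearson correlation
  } else if (is.factor(x) && is.factor(y)) {
    tab  <- table(x, y)
    chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
    sqrt(as.numeric(chi2) / (sum(tab) * (min(dim(tab)) - 1)))  # Cramér's V
  } else {
    if (is.factor(x)) { tmp <- x; x <- y; y <- tmp }  # make x numeric, y factor
    sqrt(summary(lm(x ~ y))$r.squared)                # correlation ratio (eta)
  }
}

# Build the variable-by-variable association matrix, then cluster on 1 - association
cluster_variables <- function(df, k) {
  p <- ncol(df)
  assoc <- matrix(1, p, p, dimnames = list(names(df), names(df)))
  for (i in seq_len(p - 1)) {
    for (j in (i + 1):p) {
      assoc[i, j] <- assoc[j, i] <- pair_assoc(df[[i]], df[[j]])
    }
  }
  hc <- hclust(as.dist(1 - assoc), method = "average")
  list(assoc = assoc, hclust = hc, clusters = cutree(hc, k = k))
}

# Small self-contained example with two numeric and two factor columns
set.seed(42)
mixed_df <- data.frame(
  Numeric_Var1       = rnorm(100),
  Numeric_Var2       = rnorm(100),
  Binary_Factor_Var1 = factor(sample(0:1, 100, replace = TRUE)),
  Binary_Factor_Var2 = factor(sample(0:1, 100, replace = TRUE))
)

res <- cluster_variables(mixed_df, k = 2)
print(res$clusters)   # named integer vector: one cluster label per variable
```

The same call should work with `hypothetical_dataset` in place of `mixed_df`, since the columns keep their original types. The three measures sit on a common 0 to 1 scale but are not strictly comparable, so the resulting clusters are best treated as exploratory.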