Clustering in R with k-prototypes: data frame transposing issue?


I have a dataset whose variables are of mixed type (numeric and factor). I would like to cluster the variables themselves, rather than the observations, using the observations as the basis for grouping.

I have tried the following, but once I transpose the data frame, it becomes difficult to distinguish which variables are numeric and which are factors.

What is the best way to approach this problem?

Thanks

#install.packages("clustMixType")
library(clustMixType)


# Create a hypothetical dataset: 5 scaled numeric variables and 3 binary factors

set.seed(42) # Ensure reproducibility
hypothetical_dataset <- data.frame(
  scale(rnorm(100)), scale(rnorm(100)), scale(rnorm(100)), 
  scale(rnorm(100)), scale(rnorm(100)),
  factor(sample(0:1, 100, replace = TRUE)),
  factor(sample(0:1, 100, replace = TRUE)),
  factor(sample(0:1, 100, replace = TRUE))
)
names(hypothetical_dataset) <- c(
  "Numeric_Var1", "Numeric_Var2", "Numeric_Var3", 
  "Numeric_Var4", "Numeric_Var5",
  "Binary_Factor_Var1", "Binary_Factor_Var2", "Binary_Factor_Var3"
)
str(hypothetical_dataset)


# Transpose the dataset so that each variable becomes a row,
# as I would like to use the observations to group the predictors
transposed_dataset <- t(hypothetical_dataset)
transposed_dataset_df <- as.data.frame(transposed_dataset)

# t() already carries the column names over as row names;
# they are set explicitly here only to make that visible
rownames(transposed_dataset_df) <- names(hypothetical_dataset)

# Attempt to cluster the variables (now rows) with k-prototypes
kproto(transposed_dataset_df, k = 6, diss = "gower")
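
To illustrate why the variable types become hard to distinguish after transposing, here is a minimal sketch on a small made-up data frame (`toy`, `num` and `fac` are hypothetical names, not part of the dataset above). It relies only on base R: `t()` converts a data frame to a matrix first, and a matrix can hold a single type, so mixing numeric and factor columns yields a character matrix.

```r
# Hypothetical toy data: one numeric variable and one binary factor
toy <- data.frame(
  num = c(0.5, -1.2, 0.3),
  fac = factor(c(0, 1, 1))
)

# t() goes through as.matrix(); with a factor column present,
# every value is coerced to character
transposed <- t(toy)
is.character(transposed)  # TRUE

# Converting back gives one column per original observation; each column
# now holds a number and a factor level stored as text, so the original
# numeric/factor distinction is no longer recorded in the column types
transposed_df <- as.data.frame(transposed)
sapply(transposed_df, class)
```

The class reported in the last line depends on the R version (character columns by default from R 4.0.0, factors in earlier versions), but in neither case is it numeric.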
