How to choose optimal k for clustering of mixed variables?

33 Views Asked by Ying At 11 January 2024 at 12:39

I am using Gower distance with hierarchical clustering to cluster mixed data (both numerical and categorical variables). Apart from reading k off the dendrogram, is there a way to find the optimal k using the within-cluster sum of squares (WSS)?

I've used the 'pam' function to look for the optimal k via the average silhouette width, but the value keeps increasing as k grows. Are there other functions that accept a dissimilarity matrix and compute the within-cluster sum of squares (WSS)?
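To make the WSS idea concrete, here is a minimal sketch of a WSS-like quantity computed directly from a dissimilarity matrix. It relies on the identity that, for Euclidean distances, a cluster's sum of squares equals the sum of its squared pairwise distances divided by twice the cluster size. Gower dissimilarities are not Euclidean in general, so this should be read as an analogue of WSS rather than the quantity itself. The toy data frame `df`, the helper `within_diss`, and the choice of average linkage are all assumptions made for illustration.

```r
library(cluster)  # daisy()

# Toy mixed data (hypothetical); replace with your own data frame
set.seed(1)
df <- data.frame(
  x = c(rnorm(30, 0), rnorm(30, 5)),
  g = factor(rep(c("a", "b"), each = 30))
)
gower_distance <- daisy(df, metric = "gower")

# Hypothetical helper: WSS-like dispersion from pairwise dissimilarities.
# For each cluster: sum of squared pairwise dissimilarities / (2 * cluster size).
within_diss <- function(d, labels) {
  m <- as.matrix(d)
  sum(sapply(unique(labels), function(cl) {
    idx <- which(labels == cl)
    sum(m[idx, idx, drop = FALSE]^2) / (2 * length(idx))
  }))
}

# Hierarchical clustering on the Gower dissimilarities (linkage is an assumption)
hc <- hclust(as.dist(gower_distance), method = "average")

# Dispersion for each candidate k, then an elbow-style plot
ks <- 1:10
wss <- sapply(ks, function(k) within_diss(gower_distance, cutree(hc, k = k)))
plot(ks, wss, type = "b", xlab = "k", ylab = "Within-cluster dispersion")
```

Like ordinary WSS, this curve tends to fall as k grows, so it is meant for spotting an elbow rather than for picking the minimum.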

The code using the 'pam' function:

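library(cluster)  # provides pam()

# Average silhouette width for k = 2..20; index 1 stays NA because k = 1 has no silhouette
sil_width <- rep(NA_real_, 20)
for (i in 2:20) {
  sil_width[i] <- pam(gower_distance, diss = TRUE, k = i)$silinfo$avg.width
}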
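Since the clustering itself is hierarchical, the silhouette can also be computed on the dendrogram cuts rather than on a separate PAM fit. The sketch below assumes average linkage and uses a hypothetical toy data frame `df` in place of the real data. `silhouette()` from the cluster package accepts a vector of cluster labels together with a dissimilarity object.

```r
library(cluster)  # daisy(), silhouette()

# Toy mixed data (hypothetical); replace with your own data frame
set.seed(1)
df <- data.frame(
  x = c(rnorm(30, 0), rnorm(30, 5)),
  g = factor(rep(c("a", "b"), each = 30))
)
gower_distance <- daisy(df, metric = "gower")

# Linkage method is an assumption; use whichever one built your dendrogram
hc <- hclust(as.dist(gower_distance), method = "average")

# Average silhouette width of the hierarchical partition at each k
ks <- 2:10
avg_sil <- sapply(ks, function(k) {
  sil <- silhouette(cutree(hc, k = k), dist = gower_distance)
  mean(sil[, "sil_width"])
})

# Candidate k: the cut with the largest average silhouette width
best_k <- ks[which.max(avg_sil)]
```

If the average width still rises all the way to the largest k tried, that may indicate weak cluster structure rather than a genuinely large k, so it is worth inspecting the whole curve and not only `best_k`.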
