Hierarchical Clustering for binary variables

Asked by Lisa Guo on 07 June 2023 at 07:23 (200 views)

I have a dataset of 2,000 rows and 60 binary (dummy) columns. Each column encodes the answer to a survey-like question.

I'd like to apply hierarchical clustering to identify different types of profiles according to answers to questions.

I've heard about the Hamming distance and the Jaccard distance: which one is better for this kind of data? What about the linkage method? As far as I know, Ward's method only works with Euclidean distance. Finally, how do I choose the correct number of clusters?
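To make the difference between the two distances concrete, here is a minimal pure-Python sketch. The three respondent vectors are made up for illustration and are not taken from my data; the function names are just placeholders.

```python
def hamming(a, b):
    """Share of positions where two 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jaccard(a, b):
    """1 - |both 1| / |at least one 1|; positions where both are 0 are ignored."""
    both = sum(x and y for x, y in zip(a, b))
    either = sum(x or y for x, y in zip(a, b))
    # Convention assumed here: two all-zero vectors are at distance 0.
    return 0.0 if either == 0 else 1 - both / either


# Hypothetical respondents answering 6 yes/no questions
r1 = [1, 1, 0, 0, 0, 0]
r2 = [1, 0, 0, 0, 0, 0]
r3 = [0, 0, 0, 0, 0, 0]

print(hamming(r1, r2), jaccard(r1, r2))  # one mismatch out of 6 vs. one shared "yes" out of two
print(hamming(r2, r3), jaccard(r2, r3))  # same Hamming distance, but Jaccard is at its maximum
```

So Hamming treats a shared 0 as agreement, while Jaccard only counts shared 1s. Which one fits presumably depends on whether two "no" answers mean the respondents are similar. Also, for 0/1 data the squared Euclidean distance equals the number of mismatching positions, so Ward on the raw dummies would behave like an unnormalized Hamming distance.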

I've plotted a dendrogram, but I don't know where to cut it to choose the correct number of clusters.
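One heuristic I am aware of is to cut inside the largest gap between consecutive merge heights. Below is a rough sketch of it, assuming the merge heights are available in ascending order (in SciPy they are the third column of the matrix returned by `scipy.cluster.hierarchy.linkage`). The heights in the example are invented, and the helper name is hypothetical.

```python
def clusters_from_largest_gap(heights):
    """Suggest a cluster count from the merge heights of a dendrogram.

    heights: merge distances in ascending order (n - 1 values for n points).
    Cutting inside the widest gap between consecutive heights leaves one
    cluster per merge above the cut, plus one.
    """
    gaps = [hi - lo for lo, hi in zip(heights, heights[1:])]
    # Index i of the widest gap, i.e. between heights[i] and heights[i + 1]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    cut = (heights[i] + heights[i + 1]) / 2
    n_clusters = len(heights) - i  # (merges above the cut) + 1
    return n_clusters, cut


# Invented merge heights for a 10-point toy dendrogram
heights = [0.10, 0.12, 0.15, 0.18, 0.20, 0.55, 0.60, 0.70, 0.90]
k, cut = clusters_from_largest_gap(heights)
print(k, cut)
```

This is only a heuristic and assumes the heights are monotone, which is not guaranteed for centroid or median linkage. The resulting threshold could then be passed to `scipy.cluster.hierarchy.fcluster` with `criterion="distance"`, but I don't know whether this is a sound way to pick the cut.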

I expect to find around 5 clusters and want to be able to interpret them. My idea is to first fit a simple logistic regression for each cluster, then perhaps study the SHAP values.
