How to create a dissimilarity matrix with daisy function in R?

Question

Asked by Joep_S on 21 January 2020 at 08:19

I want to run a cluster analysis with the pam function in R, using daisy to create the dissimilarity matrix. My data has two columns, ID and Disease. Both are factors with many levels (400 and 1800, respectively). How can I create the dissimilarity matrix I need to cluster the data with pam?

Example data frame:

set.seed(1)
df <- data.frame(ID = rep(sample(c("a","b","c","d","e","f","g"),10,replace = TRUE),70),
                 disease = sample(c("flu","headache","pain","inflammation","depression","infection","chest pain"),100,replace = TRUE))

df <- unique(df)

Can I run the daisy function on this data frame or do I have to convert it into another format?

**jay.sf** · Accepted Answer · 2020-01-21T08:56:45.580000

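Since "Dissimilarities will be computed between the rows of x" (?daisy), you may want to run daisy on a contingency table of your data frame rather than on the data frame itself. The table has one row per ID and one 0/1 column per disease, so each row describes one ID.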

(df.tab <- table(df))
#    disease
# ID  chest pain depression flu headache infection inflammation pain
#   a          1          1   1        1         1            1    1
#   b          1          1   1        1         1            1    1
#   c          1          1   0        0         1            1    1
#   d          1          1   1        0         1            0    1
#   e          0          1   1        1         1            1    0
#   f          0          1   1        1         1            0    1
#   g          1          1   1        1         1            1    0 

library(cluster)    
daisy(df.tab, metric="euclidean")
# Dissimilarities :
#          a        b        c        d        e        f
# b 0.000000
# c 1.414214 1.414214
# d 1.414214 1.414214 1.414214
# e 1.414214 1.414214 2.000000 2.000000
# f 1.414214 1.414214 2.000000 1.414214 1.414214
# g 1.000000 1.000000 1.732051 1.732051 1.000000 1.732051
# 
# Metric :  euclidean 
# Number of objects : 7
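
To get from the dissimilarity object to the clustering the question asks about, the result of daisy can be passed to pam directly. The following is a minimal sketch, not a tuned analysis. It assumes k = 2 clusters, an arbitrary value chosen for illustration since the question does not say how many clusters are wanted. It also rebuilds the 0/1 matrix from the table printed above so that the block runs on its own; the names m, d and fit are illustrative.

```r
library(cluster)

# Incidence matrix copied from the table printed above
# (rows = ID, columns = disease).
m <- matrix(
  c(1, 1, 1, 1, 1, 1, 1,   # a
    1, 1, 1, 1, 1, 1, 1,   # b
    1, 1, 0, 0, 1, 1, 1,   # c
    1, 1, 1, 0, 1, 0, 1,   # d
    0, 1, 1, 1, 1, 1, 0,   # e
    0, 1, 1, 1, 1, 0, 1,   # f
    1, 1, 1, 1, 1, 1, 0),  # g
  nrow = 7, byrow = TRUE,
  dimnames = list(ID = letters[1:7],
                  disease = c("chest pain", "depression", "flu", "headache",
                              "infection", "inflammation", "pain"))
)

# Dissimilarities between the rows (IDs), as in the answer above.
d <- daisy(m, metric = "euclidean")

# k = 2 is an assumption for illustration only; choose k for your own data,
# for example by comparing average silhouette widths across several values.
fit <- pam(d, k = 2)

fit$clustering  # cluster membership per ID
fit$medoids     # IDs chosen as cluster medoids
```

Because d has class "dissimilarity" (which inherits from "dist"), pam treats it as a dissimilarity object rather than as raw data, so diss = TRUE does not need to be set explicitly.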
