In R, I have a data.frame showing the distance between pairs of nodes:
dl <- data.frame(
  a = c('a','a','a','b','b','c'),
  b = c('b','c','d','c','d','d'),
  dist = c(1,2,3,2,1,2)
)
I want to convert this to a distance matrix, with the diagonal set to zero and the upper triangle set to NA, since the distances are symmetrical:
dm <- as.matrix(data.frame(
  a = c(0, 1, 2, 3),
  b = c(NA, 0, 2, 1),
  c = c(NA, NA, 0, 2),
  d = c(NA, NA, NA, 0),
  row.names = c('a','b','c','d')
))
My real data is very large, so computational efficiency is key. The only solutions I can come up with involve either looping or using igraph to convert the list to a graph and then converting that graph to a matrix, and neither is ideal given the size of my data. The input is a data.frame because node ids are text while distances are numeric, and the desired output is a matrix because speed is key.
Here are some base R options:
Use xtabs, which cross-tabulates dist over the two node columns.
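A minimal sketch of the xtabs route, not necessarily the exact code originally posted. The names nodes, pairs and tab are made up for illustration, and it assumes each pair appears once, with the alphabetically earlier node in column a, as in the example data.

```r
# Example data from the question
dl <- data.frame(
  a = c('a','a','a','b','b','c'),
  b = c('b','c','d','c','d','d'),
  dist = c(1,2,3,2,1,2)
)

# All node ids, sorted; as.character() guards against factor columns
nodes <- sort(unique(c(as.character(dl$a), as.character(dl$b))))

# Shared factor levels force a square table, including nodes that
# appear in only one of the two columns
pairs <- data.frame(
  row  = factor(dl$b, levels = nodes),
  col  = factor(dl$a, levels = nodes),
  dist = dl$dist
)
tab <- xtabs(dist ~ row + col, data = pairs)

# Strip the xtabs/table classes to get a plain numeric matrix
dm <- matrix(as.vector(tab), nrow = length(nodes),
             dimnames = list(nodes, nodes))

# Cells without a pair are 0 after xtabs; blank out the upper triangle
dm[upper.tri(dm)] <- NA
dm
```

Note that xtabs sums duplicate pairs and reports a missing pair as 0, so a missing distance cannot be told apart from a true zero distance with this approach.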
Use as.dist, which keeps only the lower triangle of a square matrix.
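A sketch of how as.dist could be applied here, under the same assumptions as the example data (each pair listed once, earlier node in column a). The scratch matrix m and the name nodes are illustrative only, not taken from the original answer.

```r
# Example data from the question
dl <- data.frame(
  a = c('a','a','a','b','b','c'),
  b = c('b','c','d','c','d','d'),
  dist = c(1,2,3,2,1,2)
)

nodes <- sort(unique(c(as.character(dl$a), as.character(dl$b))))
n <- length(nodes)

# Square scratch matrix; as.dist() reads only the cells below the diagonal
m <- matrix(0, n, n, dimnames = list(nodes, nodes))
m[cbind(match(dl$b, nodes), match(dl$a, nodes))] <- dl$dist

# "dist" object holding n * (n - 1) / 2 values
d <- as.dist(m)

# Expand to a full matrix only when one is needed;
# as.matrix() on a dist object is symmetric with a zero diagonal
dm <- as.matrix(d)
dm[upper.tri(dm)] <- NA
dm
```

If downstream code accepts a dist object (hclust does, for example), keeping d and skipping the as.matrix step stores roughly half as many values as the full matrix.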
Use matrix to build the result directly.
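One way this might look, as a sketch rather than the original code: preallocate an NA matrix and fill it with a single vectorised assignment through a two-column index matrix. The names nodes, i and j are illustrative; pmax/pmin are an added safeguard so that pairs given in either orientation land below the diagonal.

```r
# Example data from the question
dl <- data.frame(
  a = c('a','a','a','b','b','c'),
  b = c('b','c','d','c','d','d'),
  dist = c(1,2,3,2,1,2)
)

nodes <- sort(unique(c(as.character(dl$a), as.character(dl$b))))
n <- length(nodes)

# Integer positions of each endpoint within the sorted node ids
i <- match(dl$a, nodes)
j <- match(dl$b, nodes)

# Preallocate with NA so the upper triangle needs no further work
dm <- matrix(NA_real_, n, n, dimnames = list(nodes, nodes))

# Larger index as row, smaller as column: always the lower triangle
dm[cbind(pmax(i, j), pmin(i, j))] <- dl$dist
diag(dm) <- 0
dm
```

There is no loop and no intermediate table here, only one match per column and one assignment, which is why this variant is a reasonable candidate for large inputs. If a pair occurs more than once, the last occurrence wins.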