I am trying to prototype a way to display a tree data structure (or, at least a work-around version of it) with a dendrogram in R which represents a history workflow. I've gotten the dendrogram to work, however, is it possible to display customized information about each of the nodes/where it branches?
Here is my code:
library(plotly)
library(ggplot2)
library(ggdendro)
library(shiny)
find_lca <- function(parents, depth, node1, node2) {
# Find the depths of the nodes
depth1 <- depth[node1]
depth2 <- depth[node2]
# Make sure node1 is at a higher depth
if (depth1 < depth2) {
temp <- node1
node1 <- node2
node2 <- temp
}
# Adjust the depth of node1
depth_diff <- depth1 - depth2
while (depth_diff > 0) {
node1 <- parents[node1]
depth_diff <- depth_diff - 1
}
# Check if node1 and node2 are already the same
if (node1 == node2) {
return(node1)
}
# Move up both nodes until they have a common parent
while (parents[node1] != parents[node2]) {
node1 <- parents[node1]
node2 <- parents[node2]
}
# Return the lowest common ancestor
return(parents[node1])
}
calculate_distance <- function(parent.of.index, distance.from.root, child_a_index, child_b_index) {
# Calculate the distance between root and n1
dist_n1_root <- distance.from.root[[child_a_index]]
# Calculate the distance between root and n2
dist_n2_root <- distance.from.root[[child_b_index]]
# Calculate the distance between n1 and n2
dist_n1_n2 <- distance.from.root[[find_lca(parent.of.index, distance.from.root, child_a_index, child_b_index)]]
# Calculate the final distance using the formula
final_distance <- dist_n1_root + dist_n2_root - 2 * dist_n1_n2
return(final_distance)
}
state <- list(1, 2, 3, 4, 5)
actions = list( list(label = "Action 1", variables = state[1]),
list(label = "Action 2", variables = state[2]),
list(label = "Action 3", variables = state[3]),
list(label = "Action 4", variables = state[4]),
list(label = "Action 5", variables = state[5])
)
parent.of.index = c(-1, 1, 1, 3, 3)
depth = c(0, 1, 1, 2, 2)
dendro_data <- data.frame(
child_b = c()
)
for (i in 2:length(actions)) {
for (j in 1:(i-1)) {
child_a <- i
child_b <- j
#child_b is ALWAYS smaller/higher/earlier than child_a
distance <- calculate_distance(parent.of.index, depth, child_a, child_b)
dendro_data[i, j] = distance
}
}
dendro_data <- dendro_data[-1, ]
colnames(dendro_data) <- state[-length(state)]
temp = as.vector(na.omit(unlist(dendro_data)))
NM = unique(c(colnames(dendro_data), row.names(dendro_data)))
mydist = structure(temp, Size = length(NM), Labels = NM,
Diag = FALSE, Upper = FALSE, method = "euclidean", #Optional
class = "dist")
model <- hclust(mydist)
dhc <- as.dendrogram(model)
data <- dendro_data(dhc, type = "triangle")
p <- ggplot(segment(data)) +
geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
scale_y_reverse(expand = c(0.2, 0)) +
theme_dendro()
ggplotly(p)
Yes the code is messy I'm aware, I just cobbled it together as fast as possible for prototyping.
The way the data is processed is that I have a list which contains the index of the parent of the node at the current index (which is labeled parent), and there is a list of the actions that the current node has taken, which contains a dataframe actions made up of the label for the current action label and the actions which have made up the history so far. For simplicity's sake, the "history so far" is just a list of numbers named state
Currently, the plot renders like this:

If possible, I want to render the label and the state variable in the black box when I hover over it, how can I do that?
Data:
Created on 2024-03-22 with reprex v2.0.2