I have a phylogenetic tree with many tips and internal nodes, and a list of node ids from that tree stored in a separate table. I want to add a new column, children, to that table. To get the descendants (nodes and tips) of a node, I am using phangorn::Descendants(tree, NODEID, type = 'all'). Wrapping that call in length() gives the number of descendants. For example,
phangorn::Descendants(tree, 12514, type = 'all')
[1] 12515 12517 12516 5345 5346 5347 5343 5344
length(phangorn::Descendants(tree, 12514, type = 'all'))
[1] 8
I would like to take the 'nodes' column in my data frame and apply length(phangorn::Descendants(tree, NODEID, type = 'all')) to each node in it, creating a new column with the result for each input node.
Here is an example:
tests <- data.frame(nodes=c(12551, 12514, 12519))
length(phangorn::Descendants(tree, 12519, type = 'all'))
[1] 2
length(phangorn::Descendants(tree, 12514, type = 'all'))
[1] 8
length(phangorn::Descendants(tree, 12551, type = 'all'))
[1] 2
tests$children <- length(phangorn::Descendants(tree, tests$nodes, type = 'all'))
tests
nodes children
1 12551 3
2 12514 3
3 12519 3
As shown above, children is 3 in every row, which is the number of rows in the data.frame and not the number of descendants calculated above. It should be:
tests
nodes children
1 12551 2
2 12514 8
3 12519 2
If you have any tips or ideas on how I can get this to behave as expected, that would be great. I have a feeling I have to use apply(), or index into the result before calling length(). Thank you in advance.
You're super close! Here's one quick solution using sapply. There are other alternatives, but this one follows the structure of your question.
Generating some data
Note that I'm storing all the relevant nodes in the targetNodes object. This is equivalent to the tests data frame in your question.
Using sapply
Now, let's use sapply to repeat the same operation, length(phangorn::Descendants(tree, node, type = 'all')), for each of the relevant nodes in targetNodes.
I'm saving the output of sapply by creating a new column in targetNodes.
Good luck!
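Here is a minimal sketch of the steps above. It assumes the ape and phangorn packages are installed. The random tree from ape::rtree() and the node ids 11, 12 and 13 are hypothetical stand-ins for your own tree and nodes. As far as I can tell, Descendants() returns a list with one element per node when you pass it several nodes at once, which would explain why length() gave you 3 in every row.

```r
library(ape)       # rtree() builds a random example tree
library(phangorn)  # Descendants()

set.seed(1)
# Hypothetical stand-in for your tree: 10 tips, internal nodes numbered 11 to 19
tree <- rtree(10)

# Hypothetical node ids; swap in your own column of node ids
targetNodes <- data.frame(nodes = c(11, 12, 13))

# Passing the whole vector gives back a list (one element per node), so
# length() counts the nodes you asked about, not their descendants
all_desc <- Descendants(tree, targetNodes$nodes, type = "all")
length(all_desc)

# sapply() calls the function once per node and collects one count per node
targetNodes$children <- sapply(
  targetNodes$nodes,
  function(node) length(Descendants(tree, node, type = "all"))
)

targetNodes
```

If you prefer to avoid the anonymous function, lengths(all_desc) should give the same counts, since it takes the length of each list element. With your data, you would replace targetNodes with tests and use your own tree.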