I would like to create a stacked barplot in which the x-axis is a bin representing the number of unique sample_ids that share a clone, and the y-axis is the number of clones in each bin (I don't want to show the clone_id itself, only how many clones fall in each bin category). Each clone's layer in the stack should also be divided by the status of the cells that make up the clone. For example, if clone A is made up of 10 cells, 4 with status red and 6 with status blue, I'd like the clone A layer of the stack to be colored 40% red and 60% blue. The next layer, clone B, might be roughly 66% red and 33% blue, and so on. I haven't been able to find an example of what I'm looking for, but if anyone knows a way to split-color each layer of the barplot individually, I'd appreciate the advice!
Data example:
new_df <- structure(list(clone_id = c(101, 101, 101, 101, 102, 102, 103, 103, 103, 103, 104, 104, 104, 104, 104, 104, 104, 104),
sample_id = c(201, 201, 202, 202, 203, 204, 205, 206, 206, 206, 207, 207, 207, 207, 207, 207, 207, 208),
status = c("red", "red", "blue", "blue", "red", "blue", "red", "blue", "blue", "blue", "red", "red", "red", "red", "red", "red", "red", "blue"),
bin_id = c(4, 4, 4, 4, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8, 8, 8),
perc_red = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25, 0.875, 0.875, 0.875, 0.875, 0.875, 0.875, 0.875, 0.875),
perc_blue = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125)),
class = "data.frame", row.names = c(NA, -18L))
Stacked barplot code:
library(ggplot2)
# Use the example data frame defined above (new_df)
ggplot(new_df, aes(fill=clone_id, y=clone_id, x=bin_id)) +
geom_bar(position="stack", stat="identity")
I can make a stacked barplot as shown here, but without that extra level of detail, and the y-axis is wrong: it stacks the clone_id values themselves, whereas I just want the number of clones in each bin.
I haven't been able to find any examples of the plot type I'm looking for and am at a loss for how to split the color of each stacked layer.
Edit: I've simplified the data frame above.
I want to show how much of each layer in the stacked barplot belongs to each of two fixed status categories (red and blue in this example). I am looking for something like this crude PowerPoint example.
What you have described sounds to me like a mosaic plot. I'm not sure what the expected outcome is for the data you have provided (i.e. what the plot is supposed to look like), but here is my best guess:
Example data:
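One possible way to get split-colored layers is to draw each clone as a unit-height rectangle with geom_rect() and divide its width by status, so the bar height in each bin equals the number of clones. This is only a sketch under a few assumptions: it uses bin_id exactly as given in new_df, it assumes dplyr and ggplot2 are installed, and the helper columns n_cells, frac, left, right, layer and x are names made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Same values as new_df in the question, written compactly
new_df <- data.frame(
  clone_id  = rep(c(101, 102, 103, 104), times = c(4, 2, 4, 8)),
  sample_id = c(201, 201, 202, 202, 203, 204, 205, 206, 206, 206,
                rep(207, 7), 208),
  status    = c("red", "red", "blue", "blue", "red", "blue",
                "red", "blue", "blue", "blue", rep("red", 7), "blue"),
  bin_id    = rep(c(4, 2, 4, 8), times = c(4, 2, 4, 8))
)

rects <- new_df %>%
  # cells per clone and status
  count(bin_id, clone_id, status, name = "n_cells") %>%
  group_by(clone_id) %>%
  # share of each status within the clone, and where its segment starts/ends
  mutate(frac  = n_cells / sum(n_cells),
         right = cumsum(frac),
         left  = right - frac) %>%
  group_by(bin_id) %>%
  # position of the clone within its bin's stack (1 = bottom layer)
  mutate(layer = dense_rank(clone_id)) %>%
  ungroup() %>%
  # evenly spaced x position for each bin
  mutate(x = as.integer(factor(bin_id)))

p <- ggplot(rects) +
  geom_rect(aes(xmin = x - 0.45 + 0.9 * left,
                xmax = x - 0.45 + 0.9 * right,
                ymin = layer - 1, ymax = layer,
                fill = status),
            colour = "black") +
  scale_x_continuous(breaks = sort(unique(rects$x)),
                     labels = sort(unique(rects$bin_id))) +
  scale_fill_manual(values = c(red = "red", blue = "blue")) +
  labs(x = "bin_id", y = "Number of clones")

print(p)
```

Each clone occupies one unit of height, so the y-axis reads directly as a clone count, and the black outline separates the layers. If the split should run vertically inside each layer instead, the same frac column could be passed to geom_col() as the y aesthetic with fill = status.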
Created on 2024-03-20 with reprex v2.1.0
Is this what you're trying to do? If not, what changes would you make to get the 'right' answer?