R ridgeline plot with multiple PDFs overlaid at the same level


How can I create a ridgeline plot where multiple densities are overlaid at the same ordinate and distinguished by color?

My real dataset already contains the PDF values. For a minimal reproducible example, I use a dummy function that generates distributions and extracts their PDFs. The dataframe has a model name (for grouping), x values paired with PDF ordinates, and an id field that separates the different ridgeline levels (i.e., the position on the ridgeline y axis).

Make example dataframe

makedfs <- function(name, id, mu, sig) {
  # Draw log-normal samples and estimate their density
  vals <- exp(rnorm(1000, mean = mu, sd = sig))
  pdf <- density(vals)
  # Repeat the model name and ridgeline level for every density grid point
  model <- rep(name, length(pdf$x))
  prox <- rep(id, length(pdf$x))
  df <- data.frame(model, prox, pdf$x, pdf$y)
  colnames(df) <- c("name", "id", "x", "pdf")

  return(df)
}

df1 <- makedfs("model1", 0, log(1), 1)
df2 <- makedfs("model2", 0, log(0.5), 2)
df3 <- makedfs("model1", 1, log(0.2), 0.8)
df4 <- makedfs("model2", 1, log(1), 1)

df <- rbind(df1, df2, df3, df4)
head(df,5)

    name id          x          pdf
1 model1  0 -0.6541933 0.0003544569
2 model1  0 -0.5999428 0.0007800386
3 model1  0 -0.5456924 0.0016274229
4 model1  0 -0.4914420 0.0032231582
5 model1  0 -0.4371915 0.0060682580

A quick plot for the first two models looks like this:

plot(df1$x, df1$pdf, type ="l", col=1, xlim=c(-6,6), xlab = "x", ylab = "pdf")
lines(df2$x, df2$pdf, col=2)
legend("topleft", c("df1", "df2"), col = 1:2, lty = 1)


Ridgeline not working

I expected to see the above curves at y=0 on this ridgeline plot, but there is something wrong with the lines and fills for all PDF curves.

library(ggplot2)
library(ggridges)  # provides geom_ridgeline()

p <- ggplot(df, aes(x=x, y=id, height = pdf, group = name, fill = name)) +
  geom_ridgeline(alpha = 0.5, scale = 1) +
  scale_y_continuous(limits = c(0, 5)) +
  scale_x_continuous(limits = c(-6, 6))
p


How can I produce the expected ridgeline plot?

Best answer, by stefan

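IMHO the issue is the grouping. With group = name, each group contains the rows for both id levels, so each model's two curves end up being drawn as a single shape instead of two separate ridgelines. Instead of grouping by name alone, you have to group by both name and id, e.g. using interaction():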

set.seed(123)

library(ggplot2)
library(ggridges)

ggplot(df, aes(x=x, y=id, height = pdf, group = interaction(name, id), fill = name)) +
  geom_ridgeline(alpha = 0.5, scale = .5) +
  scale_y_continuous(limits = c(0, 5)) +
  scale_x_continuous(limits = c(-6, 6))
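
To see why the grouping matters, it can help to count how many baselines (id values) end up in each group. The following is only a minimal sketch: make_curve is a hypothetical helper that builds a data frame with the same columns as the question's df (name, id, x, pdf) from dnorm() rather than from density(), and the curve column is an assumed name for an explicit grouping variable. The plotting step is guarded so that the checks still run when ggplot2 or ggridges is not installed.

```r
# Hypothetical stand-in for the question's data: same columns (name, id, x, pdf),
# but the PDF values come from dnorm() on a fixed grid instead of density().
make_curve <- function(name, id, mu, sig) {
  x <- seq(-6, 6, length.out = 200)
  data.frame(name = name, id = id, x = x, pdf = dnorm(x, mean = mu, sd = sig))
}

df <- rbind(
  make_curve("model1", 0, 0, 1),
  make_curve("model2", 0, -1, 2),
  make_curve("model1", 1, -1.5, 0.8),
  make_curve("model2", 1, 0, 1)
)

# Grouping by name alone gives 2 groups for 4 curves
n_by_name <- length(unique(df$name))

# Grouping by name and id gives one group per curve
df$curve <- interaction(df$name, df$id)
n_by_curve <- nlevels(df$curve)

# Number of distinct baselines (id values) inside each group;
# a group that spans more than one baseline cannot be drawn as one ridgeline
ids_per_name  <- tapply(df$id, df$name,  function(v) length(unique(v)))
ids_per_curve <- tapply(df$id, df$curve, function(v) length(unique(v)))
print(ids_per_name)
print(ids_per_curve)

# Optional plot using the explicit grouping column (assumes both packages are installed)
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggridges", quietly = TRUE)) {
  library(ggplot2)
  library(ggridges)
  p <- ggplot(df, aes(x = x, y = id, height = pdf, group = curve, fill = name)) +
    geom_ridgeline(alpha = 0.5, scale = 0.5)
  print(p)
}
```

Storing the interaction in its own column is just a design choice for inspecting the groups; passing interaction(name, id) directly inside aes(), as above, should behave the same way.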