geom_abline doesn't recognize scale_color_manual colors


I have disparate datasets I'm bringing together with ggplot. I can get the colors for lines and legend to work with geom_line, but not geom_abline. For example, this works:

library(ggplot2)

df1 <- data.frame(year=c(1990, 1991, 1992, 1993, 1994),
                  varA=c(10.1, 12.2, 9.7, 11.2, 10.5))

df2 <- data.frame(year=c(1990, 1991, 1992, 1993, 1994),
                  varB=c(900, 780, 889, 910, 820))


cols <- c("varA" = "#D55E00", "varB" = "#0072B2", "varC" = "#D55E00")

g1 <- ggplot() +
  geom_line(data=df1,
            aes(x=year, y=varA, color="varA"), 
            linewidth=1) +
  geom_line(data=df2,
            aes(x=year, y=varB/100, color="varB"), 
            linewidth=1) +
  scale_color_manual(name=NULL,
                     values=cols,
                     labels=c("Variable A", "Variable B")
  )

g1


But when I try to add geom_abline, it doesn't recognize the named color:

varC_coef <- coef(lm(varA*0.2 ~ year, data = df1))

g2 <- ggplot() +
  geom_line(data=df1,
            aes(x=year, y=varA, color="varA"), 
            linewidth=1) +
  geom_line(data=df2,
            aes(x=year, y=varB/100, color="varB"), 
            linewidth=1) +
  geom_abline(intercept=varC_coef[1], 
              slope=varC_coef[2], 
              colour="varC") +
  scale_color_manual(name=NULL,
                     values=cols,
                     labels=c("Variable A", "Variable B", "Variable C")
  )

g2

I get this error:

> g2
Error in `geom_abline()`:
! Problem while converting geom to grob.
ℹ Error occurred in the 3rd layer.
Caused by error:
! Unknown colour name: varC
Run `rlang::last_error()` to see where the error occurred.
> 

Any ideas what the problem/solution is?

TIA

Update:

Applying some of the suggestions so far from comments, I've gotten this far. I changed it to not use scale_color_manual at all and added scale_y_continuous so the geom_abline shows up on the graph. Even though I explicitly set "show.legend = TRUE", it's not showing up in the legend. ??

g2 <- ggplot() +
  geom_line(data=df1,
            aes(x=year, y=varA, color="#D55E00"), 
            linewidth=1) +
  geom_line(data=df2,
            aes(x=year, y=varB/100, color="#0072B2"), 
            linewidth=1) +
  geom_abline(intercept=varC_coef[1], 
              slope=varC_coef[2], 
              colour="black",
              show.legend=TRUE) +
  # scale_color_manual(name=NULL,
  #                    values=cols,
  #                    labels=c("Variable A", "Variable B", "Variable C")
  # ) +
  scale_y_continuous(limits = c(0, 13))

g2


Answer by E Maas (best answer)

It appears that geom_abline isn't the best choice for this particular application. To summarize the very helpful comments: geom_abline does not inherit aesthetics from the plot defaults, and it does not rescale the y-axis to include itself if it falls outside the range defined by the earlier geom_lines. In addition, colour="varC" was set outside aes(), so it was treated as a literal colour name rather than as a value for scale_color_manual to look up in the named vector "cols". That is what produced the "Unknown colour name: varC" error.
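As a minimal sketch of the difference between setting a colour and mapping one: a string inside aes() is treated as data and translated by the scale, whereas a string outside aes() has to be a valid colour on its own. The toy data frame `df` and the names `p_mapped`, `p_set`, `mapped_colour` and `set_colour` are made up for illustration and are not from the question.

```r
library(ggplot2)

# Hypothetical toy data, not from the question
df <- data.frame(x = 1:3, y = c(2, 4, 3))
cols <- c("varA" = "#D55E00")

# Mapped: "varA" inside aes() is a data value that the manual scale translates
p_mapped <- ggplot(df, aes(x, y)) +
  geom_line(aes(colour = "varA")) +
  scale_colour_manual(values = cols)

# Set: a string outside aes() bypasses the scale and must be a real colour
p_set <- ggplot(df, aes(x, y)) +
  geom_line(colour = "black")

# layer_data() returns the layer's data after the scales have been applied
mapped_colour <- unique(layer_data(p_mapped)$colour)
set_colour <- unique(layer_data(p_set)$colour)
```

Replacing "black" with "varA" in `p_set` should reproduce the "Unknown colour name" error when the plot is drawn.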

In order to get the outcome I needed, I switched geom_abline to geom_smooth. This does inherit aesthetics, contributes to the y-axis range, and allows seamless inclusion into the legend. This code works:

cols <- c("varA" = "#D55E00", "varB" = "#0072B2", "varC" = "#882255")

g3 <- ggplot() +
  geom_line(data=df1,
            aes(x=year, y=varA, color="varA"), 
            linewidth=1) +
  geom_line(data=df2,
            aes(x=year, y=varB/100, color="varB"), 
            linewidth=1) +
  geom_smooth(data=df1,
              aes(color="varC",
                  x=year,
                  y=varA*0.2),
              se=FALSE,
              method=lm) +
  scale_color_manual(name=NULL,
                     values=cols,
                     labels=c("Variable A", "Variable B","Variable C")
  )

g3


Answer by chemdork123

Since color is already mapped inside aes() for the geom_line() layers, you will need to map color inside aes() for geom_abline() as well so that the name appears in the legend. The problem here is that when you supply slope or intercept as direct arguments to geom_abline(), the layer's data, mapping and show.legend are overridden. That is also why show.legend=TRUE had no effect in your update.

To get ggplot2 to use the color aesthetic, slope, intercept, and color need to be included inside aes():

geom_abline(data=..., aes(slope=..., intercept=..., color=...))

This means you should create a new data frame and pass it to geom_abline() to get this to work.

abdf <- data.frame(
  slope=0,
  intercept=varC_coef[1])

Note that I changed slope to 0 here. As pointed out in the comments, with the fitted slope of about -0.004 the line lies below the visible area of the plot, and the axes do not automatically rescale for geom_abline() because an abline extends indefinitely in both directions and so does not contribute to the axis limits.
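The arithmetic behind that can be checked directly. The sketch below reuses df1, df2 and varC_coef from the question; `fitted_y` and `plotted_range` are names introduced here for illustration.

```r
df1 <- data.frame(year = c(1990, 1991, 1992, 1993, 1994),
                  varA = c(10.1, 12.2, 9.7, 11.2, 10.5))
df2 <- data.frame(year = c(1990, 1991, 1992, 1993, 1994),
                  varB = c(900, 780, 889, 910, 820))

varC_coef <- coef(lm(varA * 0.2 ~ year, data = df1))

# Height of the fitted line at the first and last year
fitted_y <- unname(varC_coef[1] + varC_coef[2] * range(df1$year))

# Range of the values that the two geom_line() layers put on the y-axis
plotted_range <- range(c(df1$varA, df2$varB / 100))

fitted_y        # roughly 2.1 to 2.2
plotted_range   # 7.8 to 12.2
```

With slope set to 0, the line is instead drawn at the intercept itself (about 10.1), so it is a visible stand-in rather than the fitted line.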

I changed the color in cols so that the line is black. In the code, you'll also want to set key_glyph to "path", because the default key for geom_abline() draws a diagonal line across the legend keys.

cols <- c("varA" = "#D55E00", "varB" = "#0072B2", "varC" = "black")

ggplot() +
  geom_line(
    data=df1, linewidth=1,
    aes(x=year, y=varA, color="varA")) +
  geom_line(
    data=df2, linewidth=1,
    aes(x=year, y=varB/100, color="varB")) +
  geom_abline(
    data=abdf, linewidth=1, key_glyph="path",
    aes(slope=slope, intercept=intercept, color="varC")) +
  scale_color_manual(
    name=NULL, values=cols,
    labels=c("Variable A", "Variable B", "Variable C"))
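If you would rather keep the fitted slope, one option is to widen the y-axis so the line falls inside the panel, for example with expand_limits(). This is a sketch under that assumption, not part of the code above; the names `abdf_fit` and `p_fit` are introduced here for illustration.

```r
library(ggplot2)

df1 <- data.frame(year = c(1990, 1991, 1992, 1993, 1994),
                  varA = c(10.1, 12.2, 9.7, 11.2, 10.5))
df2 <- data.frame(year = c(1990, 1991, 1992, 1993, 1994),
                  varB = c(900, 780, 889, 910, 820))

varC_coef <- coef(lm(varA * 0.2 ~ year, data = df1))
cols <- c("varA" = "#D55E00", "varB" = "#0072B2", "varC" = "black")

# Keep the fitted slope instead of forcing it to 0
abdf_fit <- data.frame(
  slope = unname(varC_coef[2]),
  intercept = unname(varC_coef[1]))

p_fit <- ggplot() +
  geom_line(
    data = df1, linewidth = 1,
    aes(x = year, y = varA, color = "varA")) +
  geom_line(
    data = df2, linewidth = 1,
    aes(x = year, y = varB / 100, color = "varB")) +
  geom_abline(
    data = abdf_fit, linewidth = 1, key_glyph = "path",
    aes(slope = slope, intercept = intercept, color = "varC")) +
  # The abline does not train the y scale, so extend the axis explicitly
  expand_limits(y = 0) +
  scale_color_manual(
    name = NULL, values = cols,
    labels = c("Variable A", "Variable B", "Variable C"))

p_fit
```

The trade-off is that the two data series get compressed into the upper part of the panel, which is the same effect as the scale_y_continuous(limits = c(0, 13)) call in the question's update.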


Bonus: "Grammar of Graphics" Answer

All this aside, a way to build this plot that is more aligned with the philosophy of ggplot2 and the grammar of graphics is to combine your datasets first, then plot with a single geom_line() call. The full code would be the following:

df2$varB <- df2$varB/100  # doesn't preserve the original data, but more straightforward
df <- merge(df1, df2)  # joins on the shared "year" column
dfnew <- tidyr::pivot_longer(
  data=df, cols=-year,
  names_to="variable", values_to="val")

varC_coef <- coef(lm(varA*0.2 ~ year, data = df1))
abdf <- data.frame(slope=0, intercept=varC_coef[1], variable="varC")
cols <- c("varA" = "#D55E00", "varB" = "#0072B2", "varC" = "black")

g3 <- ggplot(dfnew, aes(x=year, y=val, color=variable)) +
  geom_line(linewidth=1) +
  geom_abline(
    data=abdf,
    aes(color=variable, slope=slope, intercept=intercept),
    linewidth=1, key_glyph="path") +
  scale_color_manual(
    values=cols,
    labels=c("Variable A", "Variable B", "Variable C"))