How to add a linear fit to a ggplot graph when the axes are logarithmic?


I have the following data frame, and I calculate a fit line (p) based on the logarithms of X and Y.

library(ggplot2)

df <- data.frame (Xvalue=c(0.05, 1, 300, 500, 800),
                 Yvalue=c(90, 100, 103, 105, 92))

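# Fit a straight line in log-log space (natural logs)
fit_df <- lm(log(Yvalue) ~ log(Xvalue), data = df)
p <- coef(fit_df)
# Full coefficient table: estimates, standard errors, t values and p-values
Standarderror <- summary(fit_df)$coefficients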

Next, I want to show the scatter plot of the data together with the fitted line, using a logarithmic scale on both the X and Y axes. However, this does not work: the following code generates only the scatter plot, not the fit line.

ggplot() +
  scale_x_log10() +
  scale_y_log10() +
  geom_point(data = df, aes(x = Xvalue, y = Yvalue), colour = "blue", size = 5) +
  geom_abline(intercept = 95.5375152316, slope = 0.007552, color = "red")


The problem must be in geom_abline, because when I remove the logarithmic scales, the line appears (of course the line is not correct here, but the point is that ggplot can draw it).

ggplot() +
  geom_point(data = df, aes(x = Xvalue, y = Yvalue), colour = "blue", size = 5) +
  geom_abline(intercept = 95.5375152316, slope = 0.007552, color = "red")


I have a few questions about this code:

  1. What is the problem with geom_abline? How can I show the fit line on logarithmic axes?
  2. How can I add a ribbon around the fit line to show how closely my data follow it? I assume the standard error of the fit is added above and below the line, right?
  3. General question: I want to repeat this procedure for several more datasets and show the final result in one graph. For this purpose, is it better to keep the data separate (as I did here) and add a layer per dataset in ggplot, or is there a smarter way?

There is 1 best solution below

Allan Cameron On

You don't need to precalculate the intercept and slope. Instead, use geom_smooth, making sure you set method = 'lm'.
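As for why the original geom_abline call drew nothing: as far as I understand, when scale_x_log10() and scale_y_log10() are in use, geom_abline draws its line in the transformed (log10) coordinates, so an intercept of 95.5 asks for a line near y = 10^95.5, far outside the panel. The sketch below assumes that behaviour; it refits the model with log10 and passes those coefficients directly. The names fit10, b and p_abline are only illustrative.

```r
library(ggplot2)

df <- data.frame(Xvalue = c(0.05, 1, 300, 500, 800),
                 Yvalue = c(90, 100, 103, 105, 92))

# Refit in base-10 logs so the coefficients are in the same units
# as the log10 axes (the slope is the same as with natural logs)
fit10 <- lm(log10(Yvalue) ~ log10(Xvalue), data = df)
b <- unname(coef(fit10))  # b[1] = intercept, b[2] = slope, both in log10 space

p_abline <- ggplot(df, aes(Xvalue, Yvalue)) +
  geom_point(colour = "blue", size = 5) +
  # Assumption: intercept and slope are interpreted on the log10 scale here
  geom_abline(intercept = b[1], slope = b[2], colour = "red") +
  scale_x_log10() +
  scale_y_log10()

p_abline
```

That said, geom_smooth saves you from handling the coefficients by hand.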

The easiest way to do this is probably to plot log(Xvalue) against log(Yvalue). That way, your formula for the lm is simply y ~ x. To give yourself logarithmic axes, you can make breaks at the log of the values you want to see on each axis, and simply label them as the non-logged values.

library(ggplot2)

df <- data.frame (Xvalue=c(0.05, 1, 300, 500, 800),
                  Yvalue=c(90, 100, 103, 105, 92))

ggplot(df, aes(log(Xvalue), log(Yvalue))) + 
  geom_point(size = 5, colour = 'blue') + 
  geom_smooth(formula = y ~ x, 
              method = 'lm', color = 'red', linewidth = 0.5) +
  scale_x_continuous('Xvalue', breaks = log(c(0.1, 1, 10, 100, 1000)),
                     labels = c(0.1, 1, 10, 100, 1000)) +
  scale_y_continuous('Yvalue', breaks = log(seq(70, 115, 5)),
                     labels = seq(70, 115, 5))
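On the ribbon question: geom_smooth draws a shaded band by default (se = TRUE). As far as I know this is a 95% confidence interval for the fitted line (level = 0.95), not simply plus or minus one standard error. A possible alternative to relabelling the breaks by hand is to keep scale_x_log10() and scale_y_log10(); since scale transformations are applied before the statistic is computed, the lm fit should then happen on the log10 values. A minimal sketch, with p_smooth as an illustrative name:

```r
library(ggplot2)

df <- data.frame(Xvalue = c(0.05, 1, 300, 500, 800),
                 Yvalue = c(90, 100, 103, 105, 92))

p_smooth <- ggplot(df, aes(Xvalue, Yvalue)) +
  geom_point(size = 5, colour = "blue") +
  # se = TRUE and level = 0.95 are the defaults, written out for clarity;
  # change level to widen or narrow the band
  geom_smooth(formula = y ~ x, method = "lm", se = TRUE, level = 0.95,
              colour = "red", linewidth = 0.5) +
  # The log10 scales transform the data before lm is fitted
  scale_x_log10() +
  scale_y_log10()

p_smooth
```

With only five points the band will be wide, which is itself a fair picture of how uncertain the fit is.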

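For the third question (several datasets in one graph), one common approach is to stack the datasets into a single data frame with a column identifying the source, then map colour to that column so geom_smooth fits one line per group. The sketch below is only an illustration: df2 is a made-up placeholder derived from the original data, and the names dataset, all_df and p_multi are hypothetical.

```r
library(ggplot2)

df1 <- data.frame(Xvalue = c(0.05, 1, 300, 500, 800),
                  Yvalue = c(90, 100, 103, 105, 92))

# Hypothetical second dataset, invented purely for illustration
df2 <- transform(df1, Yvalue = Yvalue * 1.2)

# Stack the datasets and record which one each row came from
all_df <- rbind(cbind(df1, dataset = "A"),
                cbind(df2, dataset = "B"))

p_multi <- ggplot(all_df, aes(Xvalue, Yvalue,
                              colour = dataset, fill = dataset)) +
  geom_point(size = 3) +
  # One lm fit and one ribbon per value of `dataset`
  geom_smooth(formula = y ~ x, method = "lm", alpha = 0.2) +
  scale_x_log10() +
  scale_y_log10()

p_multi
```

Keeping everything in one long data frame means adding another dataset only requires another rbind, not another pair of layers.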