I wanted to generate some predicted probabilities and was recommended two functions: ggeffects::ggpredict() and marginaleffects::predictions(). However, some of the results look similar (conf.high) while others look different (conf.low). What is going on here? Any insights would be helpful.
Here's an example:
logit <- glm(vs ~ mpg + cyl + disp + hp + drat + wt + qsec + am + gear + carb, mtcars, family = "poisson")
library(dplyr)  # provides %>% and as_tibble()
library(ggeffects)
ggpredict(logit, terms=c("mpg")) %>% as_tibble()
library(marginaleffects)
predictions(logit, by=c("mpg")) %>% as_tibble()
The two functions do different things by default.
ggpredict() with the terms argument computes predictions for each row of a data frame where mpg takes on values between 10 and 32, and all other variables are held at their means, including am, which is set to 0.41 even though it is presumably binary/categorical.
predictions() with the by argument computes predictions for each row in the actually observed data frame, and then takes the average of these predictions for each unique value of mpg.
In other words, ggpredict() reports predictions for “synthetic” data: hypothetical units which are exactly average along all dimensions except one. predictions() reports averages of predictions made on actually observed data.
It is easy to reproduce the same results as ggpredict() with predictions(), by using the newdata argument and the datagrid() function.
PS: this is not a logit model; the glm() call uses family = "poisson".
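To make the difference concrete, here is a minimal sketch of both computations using base R's predict(), followed by the newdata route through predictions(). The names mod, mpg_values, grid, observed and avg are made up for illustration, and the mpg grid is an assumption (ggpredict() chooses its own representative values), so treat this as a sketch rather than an exact replica of either package's output.

```r
# Same model as in the question (a Poisson GLM, despite being named "logit" there)
mod <- glm(vs ~ mpg + cyl + disp + hp + drat + wt + qsec + am + gear + carb,
           data = mtcars, family = "poisson")

# Assumption: an illustrative grid of mpg values between 10 and 32
mpg_values <- seq(10, 32, by = 2)

# (1) "Synthetic" data: one row per mpg value, every other predictor at its mean
predictors <- setdiff(all.vars(formula(mod)), "vs")
grid <- as.data.frame(lapply(mtcars[predictors], mean))  # a single row of means
grid <- grid[rep(1, length(mpg_values)), ]               # repeat that row
grid$mpg <- mpg_values                                   # let only mpg vary
rownames(grid) <- NULL
grid$estimate <- predict(mod, newdata = grid, type = "response")

# (2) Observed data: predict for every actual row, then average by mpg
observed <- data.frame(mpg = mtcars$mpg,
                       estimate = predict(mod, type = "response"))
avg <- aggregate(estimate ~ mpg, data = observed, FUN = mean)

# The same synthetic grid can be handed to predictions() through newdata
if (requireNamespace("marginaleffects", quietly = TRUE)) {
  library(marginaleffects)
  p_grid <- predictions(mod, newdata = grid[predictors])
  # datagrid() builds a comparable grid from the model's data
  p_datagrid <- predictions(mod, newdata = datagrid(model = mod, mpg = mpg_values))
}
```

Here grid$estimate corresponds to what ggpredict() describes, and avg to what predictions(by = "mpg") describes. Depending on the installed marginaleffects version, datagrid() may not hold 0/1 or integer columns such as am at their exact means, so it is worth printing the grid it returns and comparing it with the hand-built one.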