Why is the confidence interval different when using tidy() on output of aggte() than what aggte() itself displays?


168 Views Asked by tiny At 19 April 2023 at 16:13

I am using the did package to estimate treatment effects under the conditional parallel trends assumption. There is only one treated group (and 7 never-treated control groups), so there is only one treatment adoption time.

I use the aggte() function with type = "simple" to compute the ATT, and R displays confidence bands when the resulting object is printed. However, if I try to access the estimate, standard error and confidence bands by calling tidy() on the same object, tidy() returns different confidence bands. I also tried setting type = "group", but the confidence bands still differ.

Any help is appreciated!

library(did)
data(mpdta)
mw.attgt.X <- att_gt(yname = "lemp",
                     gname = "first.treat",
                     idname = "countyreal",
                     tname = "year",
                     xformla = ~lpop,
                     data = mpdta)

mw.attgt.X

aggte(mw.attgt.X, type = "simple")
tidy(aggte(mw.attgt.X, type = "simple"))


aggte(mw.attgt.X, type = "group")
tidy(aggte(mw.attgt.X, type = "group"))

You can see the discrepancy in the confidence intervals.


**thus__** · Answer 1 · 2023-04-20T14:00:01.177000

The tidy function originally comes from the {broom} package, which now lives in the tidymodels universe. Its goal is to tidy up model objects (hence broom, for sweeping and cleaning). The model object is reshaped into a data frame so that, if you are working programmatically with, say, 100 models, they can be combined into a single data frame and analysed together.
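As a rough illustration of that workflow, here is a minimal sketch that tidies two lm fits on the built-in mtcars data and stacks them into one data frame. The models and the `model` label column are made up for this example and have nothing to do with did.

```r
library(broom)

# Two toy models on a built-in data set (illustrative only).
models <- list(
  small = lm(mpg ~ wt, data = mtcars),
  large = lm(mpg ~ wt + hp, data = mtcars)
)

# tidy() turns each fit into a data frame with one row per coefficient;
# the added label column records which model each row came from.
tidied <- do.call(rbind, lapply(names(models), function(nm) {
  cbind(model = nm, as.data.frame(tidy(models[[nm]])))
}))

print(tidied)
```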

It looks like the did package reexports the tidy generic function from broom and adds its own methods. Taking a look at the source code of the tidy method can be quite informative. The function definition is quite verbose so I'm going to truncate it to what really matters for your use case.

did:::tidy.AGGTEobj
#> function (x, ...) 
#> {
#>     . . . TRUNCATED . . .
#>     if (x$type == "simple") {
#>         out <- data.frame(type = x$type, estimate = x$overall.att, 
#>             std.error = x$overall.se, conf.low = x$overall.se - 
#>                 stats::qnorm(1 - x$DIDparams$alp/2) * x$overall.se, 
#>             conf.high = x$overall.se + stats::qnorm(1 - x$DIDparams$alp/2) * 
#>                 x$overall.se, point.conf.low = x$overall.se - 
#>                 stats::qnorm(1 - x$DIDparams$alp/2) * x$overall.se, 
#>             point.conf.high = x$overall.se + stats::qnorm(1 - 
#>                 x$DIDparams$alp/2) * x$overall.se)
#>     }
#>     out
#> }
#> <bytecode: 0x116f88558>
#> <environment: namespace:did>

Created on 2023-04-20 with reprex v2.0.2

Note how conf.low and conf.high are calculated: both are centred on x$overall.se rather than on x$overall.att. Compare this to how the AGGTEobj itself is printed. After some sleuthing you can see that the did:::summary.AGGTEobj function is called. Again, truncating for brevity.

Here you can see how the confidence bands are calculated. I'm no expert in difference-in-differences, but the two methods clearly calculate them differently: the summary method centres the overall interval on object$overall.att.

did:::summary.AGGTEobj
#> function (object, ...) 
#> {
#>     . . .  TRUNCATED . . . 
#>     pointwise_cval <- qnorm(1 - alp/2)
#>     overall_cband_upper <- object$overall.att + pointwise_cval * 
#>         object$overall.se
#>     overall_cband_lower <- object$overall.att - pointwise_cval * 
#>         object$overall.se
#>     out1 <- cbind.data.frame(object$overall.att, object$overall.se, 
#>         overall_cband_lower, overall_cband_upper)
#>     out1 <- round(out1, 4)
#>     overall_sig <- (overall_cband_upper < 0) | (overall_cband_lower > 
#>         0)
#>     overall_sig[is.na(overall_sig)] <- FALSE
#>     overall_sig_text <- ifelse(overall_sig, "*", "")
#>     out1 <- cbind.data.frame(out1, overall_sig_text)
#>     . . . TRUNCATED . . .
#>         cband_text1a <- paste0(100 * (1 - object$DIDparams$alp), 
#>             "% ")
#>         cband_text1b <- ifelse(object$DIDparams$bstrap, ifelse(object$DIDparams$cband, 
#>             "Simult. ", "Pointwise "), "Pointwise ")
#>         cband_text1 <- paste0("[", cband_text1a, cband_text1b)
#>         cband_lower <- object$att.egt - object$crit.val.egt * 
#>             object$se.egt
#> 
#>     . . . TRUNCATED . . . 
#>     }
#> }
#> <bytecode: 0x1107b1758>
#> <environment: namespace:did>

Created on 2023-04-20 with reprex v2.0.2
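To see what the two formulas do in practice, here is a small base-R sketch that plugs the same numbers into both. The values of `att` and `se` are invented for illustration and are not output from did; `alp` plays the role of `DIDparams$alp`.

```r
# Illustrative numbers only (not output from did).
att <- -0.04   # stands in for overall.att
se  <- 0.01    # stands in for overall.se
alp <- 0.05    # stands in for DIDparams$alp
crit <- stats::qnorm(1 - alp / 2)

# Formula used in summary.AGGTEobj: interval centred on the point estimate.
summary_ci <- c(lower = att - crit * se, upper = att + crit * se)

# Formula used in tidy.AGGTEobj for type = "simple", as printed above:
# interval centred on the standard error.
tidy_ci <- c(lower = se - crit * se, upper = se + crit * se)

# Same width, different midpoints.
print(rbind(summary = summary_ci, tidy = tidy_ci))
```

The two intervals have identical width, but the one built with the tidy formula has its midpoint at the standard error instead of at the estimate.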

Personally, I think the difference is worth a bug report.
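In the meantime, one possible workaround is to rebuild the overall interval yourself from the fields that both functions read. The sketch below assumes the object exposes `overall.att`, `overall.se` and `DIDparams$alp`, as the source listings above suggest; `overall_ci` is a hypothetical helper name, not part of did.

```r
# Hypothetical helper (not part of did): recompute the pointwise interval for
# the overall ATT with the same formula that summary.AGGTEobj uses.
overall_ci <- function(x) {
  crit <- stats::qnorm(1 - x$DIDparams$alp / 2)
  data.frame(
    estimate  = x$overall.att,
    std.error = x$overall.se,
    conf.low  = x$overall.att - crit * x$overall.se,
    conf.high = x$overall.att + crit * x$overall.se
  )
}

# Usage with the object from the question (requires the did package):
# overall_ci(aggte(mw.attgt.X, type = "simple"))
```

This only covers the overall estimate; the group-level rows returned for type = "group" would need separate handling.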
