Difference-in-Differences Aggregation Weights for lincom
Source: R/lincom_weights.R
lincom_weights_did.Rd
Constructs the contrast matrix \(R\) and its influence function matrix inf_func_R for standard DiD aggregation types. The output is designed to be passed directly to lincom.
Usage
lincom_weights_did(
  fit,
  type = c("dynamic", "group", "simple", "calendar"),
  min_e = -Inf,
  max_e = Inf,
  fit_idx = NULL
)
Arguments
- fit
A ddml_attgt or ddml_rep object whose underlying fits are ddml_attgt.
- type
Aggregation type: "dynamic" (default), "group", "simple", or "calendar".
- min_e, max_e
Event-time range filter (dynamic only). Cells with event time outside [min_e, max_e] are excluded.
- fit_idx
Integer index of the fit (ensemble type) to use for computing the weighting leverage, or NULL (default) for all ensemble types. When NULL, dinf_dR is a 4D array.
Value
A list with elements:
- R
A \((C \times q)\) contrast matrix, where \(C\) is the number of GT cells.
- inf_func_R
An \((n \times C \times q)\) array of influence functions for the contrast matrix \(R\). Slice [,,k] contains the IFs for column \(k\) of \(R\).
- dinf_dR
An \((n \times q \times q)\) array of weighting leverage when fit_idx is an integer; when fit_idx is NULL, a 4D array (see Arguments). Each slice is the constant matrix \(V'V\), where \(V_{g,k} = \sum_{c \in \mathcal{K}_k,\, g_c = g} (\theta_c - \gamma_k) / S_k\). See D89 §5.3.
- labels
Character vector of length \(q\) naming the aggregated quantities.
Details
Let \(\theta^{(g,t)}_0\) denote the group-time average treatment effect on the treated (GT-ATT) for cohort \(g\) in period \(t\), as estimated by ddml_attgt. Each aggregation type defines a summary parameter as a weighted average of GT-ATTs over a subset of post-treatment cells (\(t \geq g\)).
Dynamic (type = "dynamic"): aggregates
by event time \(e = t - g\) (Callaway and Sant'Anna,
2021, eq. 9). For each \(e\):
$$\tau_0(e) = \sum_{g \in \mathcal{G}} \mathbf{1}\{g + e \leq T\}\, \Pr(G = g \mid G + e \leq T)\, \theta^{(g,\, g+e)}_0.$$
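The dynamic weights above can be illustrated with a minimal base-R sketch. The cohort shares p_g, the horizon T_max, and the helper dynamic_weights are hypothetical names with made-up values; this is not the package's internal code, only the formula evaluated by hand for \(e \geq 0\).

```r
# Hypothetical cohort shares P(G = g); never-treated units are left out
# because they never satisfy G + e <= T.
p_g <- c("3" = 0.2, "4" = 0.3)
T_max <- 4
g_vals <- as.numeric(names(p_g))

# Weight on cohort g at event time e: 1{g + e <= T} * Pr(G = g | G + e <= T).
dynamic_weights <- function(e) {
  eligible <- (g_vals + e <= T_max)  # indicator 1{g + e <= T}
  w <- p_g * eligible
  w / sum(w)                         # renormalise over eligible cohorts
}

w_e0 <- dynamic_weights(0)  # both cohorts are observed at e = 0
w_e1 <- dynamic_weights(1)  # only cohort 3 is observed at e = 1
print(w_e0)
print(w_e1)
```

At \(e = 1\) cohort 4 would need period 5, which lies beyond \(T = 4\), so all weight falls on cohort 3.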
Group (type = "group"): aggregates by
cohort \(g\). For each \(g\):
$$\theta(g) = \frac{1}{|\mathcal{T}_g|} \sum_{t \in \mathcal{T}_g} \theta^{(g,t)}_0,$$
where \(\mathcal{T}_g = \{t : g \leq t \leq T\}\) is the set of observed post-treatment periods for cohort \(g\), so the weights are uniform within each cohort.
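As a rough illustration of how the group aggregation maps onto a \((C \times q)\) contrast matrix, the base-R sketch below builds one column per cohort with uniform weights on that cohort's post-treatment cells. The cell ordering, the object names (cells, R_group, theta), and the GT-ATT values are assumptions made for this example; the ordering used by lincom_weights_did may differ.

```r
# Hypothetical grid of GT cells: cohorts g in {3, 4}, periods t = 1..4 (C = 8).
cells <- expand.grid(t = 1:4, g = c(3, 4))
post <- cells$t >= cells$g            # post-treatment cells, t >= g
g_vals <- sort(unique(cells$g))

# One column per cohort: weight 1 / |T_g| on that cohort's post-treatment cells.
R_group <- sapply(g_vals, function(g) {
  in_k <- post & cells$g == g
  in_k / sum(in_k)
})
colnames(R_group) <- paste0("g=", g_vals)

# Made-up GT-ATT values in the same cell order; the aggregate is R' theta.
theta <- c(0, 0, 0.1, 0.3, 0, 0, 0, 0.2)
theta_g <- drop(crossprod(R_group, theta))
print(theta_g)
```

Each column sums to one, so every aggregated quantity is a proper weighted average of the GT-ATTs.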
Calendar (type = "calendar"): aggregates
by time period \(t\). For each \(t\):
$$\theta(t) = \sum_{g:\, g \leq t} \frac{P(G = g)}{\sum_{g':\, g' \leq t} P(G = g')}\, \theta^{(g,t)}_0.$$
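The calendar weights can be sketched the same way. The shares p_g and the helper calendar_weights are hypothetical and serve only to evaluate the formula above by hand.

```r
# Hypothetical cohort shares P(G = g) for the treated cohorts.
p_g <- c("3" = 0.2, "4" = 0.3)
g_vals <- as.numeric(names(p_g))

# Weight on cohort g in period t: P(G = g) / sum of P(G = g') over g' <= t.
calendar_weights <- function(t) {
  treated <- (g_vals <= t)   # cohorts already treated by period t
  w <- p_g * treated
  w / sum(w)
}

w_t3 <- calendar_weights(3)  # only cohort 3 is treated in period 3
w_t4 <- calendar_weights(4)  # both cohorts are treated in period 4
print(w_t3)
print(w_t4)
```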
Simple (type = "simple"): a single
weighted average across all post-treatment cells:
$$\theta_{ATT} = \sum_{(g,t):\, t \geq g} \frac{P(G = g)}{\sum_{(g',t'):\, t' \geq g'} P(G = g')}\, \theta^{(g,t)}_0.$$
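For the simple aggregation, every post-treatment cell receives a weight proportional to its cohort share, so cohorts with more post-treatment periods contribute more cells. The sketch below uses a hypothetical cell grid, made-up shares, and made-up GT-ATT values; it is not the package implementation.

```r
# Hypothetical grid of GT cells and cohort shares.
cells <- expand.grid(t = 1:4, g = c(3, 4))
p_g <- c("3" = 0.2, "4" = 0.3)
post <- cells$t >= cells$g

# Cell weight: P(G = g) on post-treatment cells, normalised over all such cells.
w_simple <- ifelse(post, p_g[as.character(cells$g)], 0)
w_simple <- w_simple / sum(w_simple)

# Made-up GT-ATT values in the same cell order.
theta <- c(0, 0, 0.1, 0.3, 0, 0, 0, 0.2)
theta_att <- sum(w_simple * theta)
print(w_simple)
print(theta_att)
```

Here cohort 3 has two post-treatment cells with weight 2/7 each, and cohort 4 has one cell with weight 3/7.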
The influence function for the estimated weights is
derived via the quotient rule and passed to
lincom as inf_func_R.
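To make the quotient-rule step concrete, consider a ratio weight \(w_g = p_g / S\) with \(p_g = P(G = g)\) and \(S = \sum_{g' \in \mathcal{K}} p_{g'}\). Assuming the shares are estimated by sample means, the quotient rule gives the influence function \((\mathbf{1}\{G_i = g\} - w_g \mathbf{1}\{G_i \in \mathcal{K}\}) / S\). The sketch below checks this on simulated cohorts; the set K, the variable names, and the simulated data are illustrative assumptions, and the package's internal computation may be organised differently.

```r
set.seed(1)
# Simulated cohort labels; Inf marks never-treated units.
G <- sample(c(3, 4, Inf), 500, replace = TRUE, prob = c(0.3, 0.3, 0.4))

K <- c(3, 4)                 # cohorts entering the aggregate (assumed)
in_K <- G %in% K
S_hat <- mean(in_K)          # estimate of S
w_hat <- sapply(K, function(g) mean(G == g)) / S_hat

# Quotient rule: IF_i(w_g) = (1{G_i = g} - w_g * 1{G_i in K}) / S.
if_w <- sapply(seq_along(K), function(j) {
  ((G == K[j]) - w_hat[j] * in_K) / S_hat
})

# Sanity checks: each IF averages to zero, and the IFs of weights that
# sum to one cancel within every observation.
print(colMeans(if_w))
print(range(rowSums(if_w)))
```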
References
Callaway B, Sant'Anna P H C (2021). "Difference-in-Differences with multiple time periods." Journal of Econometrics, 225(2), 200-230.
Examples
# \donttest{
set.seed(42)
n <- 200; T_ <- 4
X <- matrix(rnorm(n * 2), n, 2)
G <- sample(c(3, 4, Inf), n, replace = TRUE,
            prob = c(0.3, 0.3, 0.4))
y <- matrix(rnorm(n * T_), n, T_)
fit <- ddml_attgt(y, X, t = 1:T_, G = G,
                  learners = list(what = ols),
                  sample_folds = 2,
                  silent = TRUE)
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
w <- lincom_weights_did(fit, type = "dynamic")
dyn <- lincom(fit, R = w$R,
              inf_func_R = w$inf_func_R,
              dinf_dR = w$dinf_dR,
              labels = w$labels)
summary(dyn)
#> RAL estimation: Linear Combination
#> Obs: 200
#>
#>      Estimate Std. Error z value Pr(>|z|)
#> e=-3  -0.0144     0.3091   -0.05     0.96
#> e=-2  -0.1743     0.1809   -0.96     0.34
#> e=0    0.0443     0.1879    0.24     0.81
#> e=1    0.0890     0.2664    0.33     0.74
# }