
Constructs the contrast matrix \(R\) and its influence function matrix inf_func_R for standard DiD aggregation types. The output is designed to be passed directly to lincom.

Usage

lincom_weights_did(
  fit,
  type = c("dynamic", "group", "simple", "calendar"),
  min_e = -Inf,
  max_e = Inf,
  fit_idx = NULL
)

Arguments

fit

A ddml_attgt or ddml_rep object whose underlying fits are ddml_attgt.

type

Aggregation type: "dynamic" (default), "group", "simple", or "calendar".

min_e, max_e

Event-time range filter (dynamic only). Cells with event time outside [min_e, max_e] are excluded.

fit_idx

Integer index of the fit (ensemble type) to use for computing the weighting leverage, or NULL (default) for all ensemble types. When NULL, dinf_dR is a 4D array.

Value

A list with elements:

R

A \((C \times q)\) contrast matrix, where \(C\) is the number of GT cells and \(q\) is the number of aggregated quantities.

inf_func_R

An \((n \times C \times q)\) array of influence functions for the contrast matrix \(R\). Slice [,,k] contains the IFs for column \(k\) of \(R\).

dinf_dR

An \((n \times q \times q)\) array of weighting leverage. Each slice is the constant matrix \(V'V\) where \(V_{g,k} = \sum_{c \in \mathcal{K}_k,\, g_c = g} (\theta_c - \gamma_k) / S_k\). See D89 §5.3.

labels

Character vector of length \(q\) naming the aggregated quantities.

Details

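Let \(\theta^{(g,t)}_0\) denote the GT-ATT from ddml_attgt. Each aggregation type defines a summary parameter as a weighted average of GT-ATTs over a subset of cells. The group, calendar, and simple aggregations use only post-treatment cells (\(t \geq g\)); the dynamic aggregation also reports pre-treatment event times (\(e < 0\)), as the example output shows.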

Dynamic (type = "dynamic"): aggregates by event time \(e = t - g\) (Callaway and Sant'Anna, 2021, eq. 9). For each \(e\):

$$\tau_0(e) = \sum_{g \in \mathcal{G}} \mathbf{1}\{g + e \leq T\}\, \Pr(G = g \mid G + e \leq T)\, \theta^{(g,\, g+e)}_0.$$
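
As a rough illustration of the dynamic weights, the base-R sketch below computes \(\Pr(G = g \mid G + e \leq T)\) on a toy grid of cells. The cohort shares, the cell grid, and the helper dynamic_weights are hypothetical and are not part of the package; this is not how lincom_weights_did is implemented internally.

```r
# Hypothetical toy inputs for illustration; not produced by ddml_attgt.
T_max <- 4
p_g <- c(`3` = 0.2, `4` = 0.4)               # assumed cohort shares P(G = g)
cells <- expand.grid(g = c(3, 4), t = 1:T_max)
cells$e <- cells$t - cells$g                  # event time e = t - g

# Hypothetical helper: weights over all cells for one event time e.
dynamic_weights <- function(e, cells, p_g) {
  in_e <- cells$e == e                        # cohorts observed at event time e
  w <- numeric(nrow(cells))
  p <- p_g[as.character(cells$g[in_e])]
  w[in_e] <- p / sum(p)                       # P(G = g | G + e <= T)
  w
}

w0 <- dynamic_weights(0, cells, p_g)  # both cohorts contribute at e = 0
w1 <- dynamic_weights(1, cells, p_g)  # only g = 3 is observed at e = 1
```

At \(e = 1\) the cohort \(g = 4\) would need period 5, which lies beyond \(T = 4\), so all weight falls on \(g = 3\).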

Group (type = "group"): aggregates by cohort \(g\). For each \(g\):

$$\theta(g) = \frac{1}{|\mathcal{T}_g|} \sum_{t \in \mathcal{T}_g} \theta^{(g,t)}_0,$$

where \(\mathcal{T}_g = \{t : t \geq g\}\) and the weights reduce to uniform within each cohort.

Calendar (type = "calendar"): aggregates by time period \(t\). For each \(t\):

$$\theta(t) = \sum_{g:\, g \leq t} \frac{P(G = g)}{\sum_{g':\, g' \leq t} P(G = g')}\, \theta^{(g,t)}_0.$$
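
The calendar weights can be arranged as columns of a \((C \times q)\) contrast matrix. The sketch below does this in base R for three toy post-treatment cells; the cohort shares and the object names (cells, R_cal) are assumptions made for illustration and need not match the cell ordering or labels that lincom_weights_did returns.

```r
# Hypothetical toy inputs for illustration; not produced by ddml_attgt.
p_g <- c(`3` = 0.2, `4` = 0.4)                        # assumed cohort shares P(G = g)
cells <- data.frame(g = c(3, 3, 4), t = c(3, 4, 4))   # post-treatment cells, t >= g
periods <- sort(unique(cells$t))

# One column per calendar period; rows index the GT cells.
R_cal <- sapply(periods, function(tt) {
  p <- unname(p_g[as.character(cells$g)])
  w <- ifelse(cells$t == tt, p, 0)   # keep only cells in period tt
  w / sum(w)                         # normalise over cohorts with g <= tt
})
colnames(R_cal) <- paste0("t=", periods)
```

Each column sums to one, and a period in which only one cohort is treated puts all of its weight on that cohort's cell.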

Simple (type = "simple"): a single weighted average across all post-treatment cells:

$$\theta_{ATT} = \sum_{(g,t):\, t \geq g} \frac{P(G = g)}{\sum_{(g',t'):\, t' \geq g'} P(G = g')}\, \theta^{(g,t)}_0.$$
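
A small numeric check of the simple aggregation, written in base R with made-up inputs. The GT-ATT values in theta and the cohort shares are hypothetical; the point is only to show that each cell is weighted by its cohort share, so a cohort observed in more post-treatment periods receives more total weight.

```r
# Hypothetical toy inputs for illustration; not produced by ddml_attgt.
p_g <- c(`3` = 0.2, `4` = 0.4)                        # assumed cohort shares P(G = g)
cells <- data.frame(g = c(3, 3, 4), t = c(3, 4, 4))   # post-treatment cells, t >= g
theta <- c(0.10, 0.20, 0.40)                          # made-up GT-ATT estimates

p <- unname(p_g[as.character(cells$g)])
w_simple <- p / sum(p)                # one weight per cell, summing to one
att_simple <- sum(w_simple * theta)   # weighted average across all cells
```

Here the weights are 0.25, 0.25, and 0.5, so cohort 3 carries half of the total weight across its two cells even though its share is half that of cohort 4.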

The influence function for the estimated weights is derived via the quotient rule and passed to lincom as inf_func_R.
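
The quotient-rule step can be sketched generically as follows. With \(\hat{w}_g = \hat{p}_g / \hat{S}\) and \(\hat{S} = \sum_{g'} \hat{p}_{g'}\), the influence function of \(\hat{p}_g\) is \(\mathbf{1}\{G_i = g\} - p_g\), and the quotient rule gives the influence function of \(\hat{w}_g\). The code below is a base-R illustration under these assumptions, with hypothetical object names; it does not reproduce the package's internals or the \((n \times C \times q)\) layout of inf_func_R.

```r
# Generic quotient-rule sketch; names and setup are assumptions.
set.seed(1)
G <- sample(c(3, 4, Inf), 200, replace = TRUE, prob = c(0.3, 0.3, 0.4))
cohorts <- c(3, 4)                               # treated cohorts only

ind <- sapply(cohorts, function(g) as.numeric(G == g))  # n x 2 indicators
p_hat <- colMeans(ind)                           # estimated P(G = g)
S <- sum(p_hat)                                  # denominator of the weights
w_hat <- p_hat / S

psi_p <- sweep(ind, 2, p_hat)                    # IF of p_hat: 1{G_i = g} - p_hat_g
psi_S <- rowSums(psi_p)                          # IF of the denominator S
# Quotient rule: d(p / S) = (dp * S - p * dS) / S^2
psi_w <- (psi_p * S - outer(psi_S, p_hat)) / S^2 # n x 2 IF of the weights
```

Two sanity checks follow from the construction: each column of psi_w averages to zero over observations, and each row sums to zero because the weights are constrained to sum to one.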

References

Callaway B, Sant'Anna P H C (2021). "Difference-in-Differences with multiple time periods." Journal of Econometrics, 225(2), 200-230.

Examples

# \donttest{
set.seed(42)
n <- 200; T_ <- 4
X <- matrix(rnorm(n * 2), n, 2)
G <- sample(c(3, 4, Inf), n, replace = TRUE,
            prob = c(0.3, 0.3, 0.4))
y <- matrix(rnorm(n * T_), n, T_)
fit <- ddml_attgt(y, X, t = 1:T_, G = G,
                learners = list(what = ols),
                sample_folds = 2,
                silent = TRUE)
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
#> Warning: One of the crossfitting subsamples only uses 28 observations for training. Consider increasing ``sample_folds`` if possible.
w <- lincom_weights_did(fit, type = "dynamic")
dyn <- lincom(fit, R = w$R,
              inf_func_R = w$inf_func_R,
              dinf_dR = w$dinf_dR,
              labels = w$labels)
summary(dyn)
#> RAL estimation: Linear Combination 
#> Obs: 200
#> 
#>      Estimate Std. Error z value Pr(>|z|)
#> e=-3  -0.0144     0.3091   -0.05     0.96
#> e=-2  -0.1743     0.1809   -0.96     0.34
#> e=0    0.0443     0.1879    0.24     0.81
#> e=1    0.0890     0.2664    0.33     0.74
# }