Implementation of the categorical instrumental variable estimator.
Arguments
- y
The outcome variable, a numerical vector.
- D
A matrix of endogenous variables.
- Z
A matrix of instruments, where the first column corresponds to the categorical instrument.
- X
An optional matrix of control variables.
- K
The number of support points of the estimated instrument \(\hat{m}_K\), an integer greater than 2.
- regularize_EDZ
A boolean indicating whether \(E[D|Z]\) should be regularized directly rather than \(E[D - X^\top\pi\vert Z]\).
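To make the role of K more concrete, the following is a minimal sketch of what "an instrument with K support points" can look like. It assumes a simplified stand-in for the first stage: the category-wise means of D are clustered with stats::kmeans(), and each observation receives the mean of D within its cluster. This is an illustration only and is not claimed to be the algorithm used by kcmeans::kcmeans(); the names cat_means, group_of_obs, and m_hat are hypothetical.

```r
# Simplified sketch: collapse a 20-category instrument into K support points.
# ASSUMPTION: clustering category means with stats::kmeans() is a stand-in
# chosen for illustration, not the package's actual estimation routine.
set.seed(123)
nobs <- 800
Z <- sample(1:20, nobs, replace = TRUE)  # observed categorical instrument
D <- (Z %% 2) + rnorm(nobs)              # first stage driven by a latent binary instrument

K <- 2                                   # number of support points (illustrative choice)

# Estimate E[D | Z = z] separately for every observed category
cat_means_tab <- tapply(D, Z, mean)
cat_labels <- as.integer(names(cat_means_tab))
cat_means <- as.numeric(cat_means_tab)

# Group the categories into K clusters with similar first-stage means
km <- kmeans(cat_means, centers = K, nstart = 10)

# Map each observation to its cluster and use the within-cluster mean of D
# as an instrument that takes exactly K distinct values
group_of_obs <- km$cluster[match(Z, cat_labels)]
m_hat <- ave(D, group_of_obs)

# Cross-tabulate the latent instrument against the estimated grouping
print(table(latent = Z %% 2, estimated_group = group_of_obs))
```

In this simulated design the 20 observed categories only matter through their parity, so a grouping with two support points should line up closely with the latent instrument.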
Value
civ returns an object of S3 class civ. An object of class civ is a list containing the following components:
- coef
A vector of second-stage coefficient estimates.
- iv_fit
Object of class ivreg from the IV regression of y on D and X using the estimated \(\hat{F}_K\) as an instrument for D. See also AER::ivreg() for details.
- kcmeans_fit
Object of class kcmeans from the K-Conditional-Means regression of D on Z and X. See also kcmeans::kcmeans() for details.
- K
Pass-through of selected user-provided arguments. See above.
References
Fox J, Kleiber C, Zeileis A (2023). "ivreg: Instrumental-Variables Regression by '2SLS', '2SM', or '2SMM', with Diagnostics". R package.
Wiemann T (2023). "Optimal Categorical Instruments." https://arxiv.org/abs/2311.17021
Examples
# Simulate data from a simple IV model with 800 observations
nobs <- 800 # sample size
Z <- sample(1:20, nobs, replace = TRUE) # observed instrument
Z0 <- Z %% 2 # underlying latent instrument
U_V <- matrix(rnorm(2 * nobs, 0, 1), nobs, 2) %*%
  chol(matrix(c(1, 0.6, 0.6, 1), 2, 2)) # first and second stage errors
D <- Z0 + U_V[, 2] # endogenous variable
y <- D + U_V[, 1] # outcome variable
# Compute the categorical instrumental variable estimator with K = 3
civ_fit <- civ(y, D, Z, K = 3)
summary(civ_fit)
#>
#> Call:
#> AER::ivreg(formula = y ~ D | m_hat)
#>
#> Residuals:
#>       Min        1Q    Median        3Q       Max
#> -2.713529 -0.696460 -0.005709  0.664267  3.179906
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.04118    0.04866  -0.846    0.398
#> D            0.98494    0.07025  14.020   <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 0.9855 on 798 degrees of freedom
#> Multiple R-Squared: 0.7, Adjusted R-squared: 0.6996
#> Wald test: 196.6 on 1 and 798 DF, p-value: < 2.2e-16
#>
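As a point of comparison for the estimate above, the simulated design has a known coefficient of 1 on D and a known latent instrument Z0. The sketch below, which uses base R only, computes the just-identified IV estimate as the ratio cov(Z0, y) / cov(Z0, D) alongside the OLS slope. It is an infeasible "oracle" benchmark, since Z0 is not observed in practice; the seed and the names beta_oracle and beta_ols are assumptions added for illustration.

```r
# Oracle benchmark: IV using the latent instrument Z0, which is only
# available because the data are simulated.
set.seed(42)  # ASSUMPTION: seed added here for reproducibility of the sketch
nobs <- 800
Z <- sample(1:20, nobs, replace = TRUE)  # observed instrument
Z0 <- Z %% 2                             # underlying latent instrument
U_V <- matrix(rnorm(2 * nobs, 0, 1), nobs, 2) %*%
  chol(matrix(c(1, 0.6, 0.6, 1), 2, 2))  # correlated first and second stage errors
D <- Z0 + U_V[, 2]                       # endogenous variable
y <- D + U_V[, 1]                        # outcome variable; true coefficient on D is 1

# With one instrument and one endogenous variable, the IV estimator
# reduces to a ratio of sample covariances
beta_oracle <- cov(Z0, y) / cov(Z0, D)

# OLS slope for contrast; it is biased because D is correlated with the
# second-stage error
beta_ols <- cov(D, y) / var(D)

print(c(oracle_iv = beta_oracle, ols = beta_ols))
```

Because the errors are positively correlated, the OLS slope should sit above the IV estimate, while the oracle IV estimate should be close to 1 up to sampling noise.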