Implementation of the categorical instrumental variable estimator.
Arguments
- y
 The outcome variable, a numerical vector.
- D
 A matrix of endogenous variables.
- Z
 A matrix of instruments, where the first column corresponds to the categorical instrument.
- X
 An optional matrix of control variables.
- K
 The number of support points of the estimated instrument \(\hat{m}_K\), an integer greater than 2.
- regularize_EDZ
 A boolean indicating whether \(E[D|Z]\) should be regularized directly rather than \(E[D - X^\top\pi\vert Z]\).
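As a rough illustration of what "support points" of an estimated instrument means, the sketch below collapses 20 observed categories into a small number of groups and assigns each observation its group mean of D. It is only a sketch under stated assumptions: `stats::kmeans()` applied to category-wise means is a stand-in and is not claimed to be the algorithm used by `kcmeans::kcmeans()`, and the names `n_groups`, `cat_means`, and `m_hat_sketch` are hypothetical. In particular, `n_groups` need not follow the same counting convention as the argument K.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
nobs <- 800
Z <- sample(1:20, nobs, replace = TRUE)  # observed categorical instrument
D <- (Z %% 2) + rnorm(nobs)              # endogenous variable driven by a latent binary instrument

n_groups <- 2  # hypothetical number of groups for this sketch

# Mean of D within each observed category of Z
cat_means <- tapply(D, Z, mean)

# Stand-in clustering of the category means (assumption: not the kcmeans algorithm)
groups <- setNames(
  kmeans(as.numeric(cat_means), centers = n_groups, nstart = 10)$cluster,
  names(cat_means)
)

# Map each observation to its group, then to the mean of D within that group
obs_group <- groups[as.character(Z)]
m_hat_sketch <- ave(D, obs_group)

# The sketched instrument takes as many distinct values as there are groups
length(unique(m_hat_sketch))
```

The point of the reduction is that the sketched instrument varies only across groups of categories rather than across all 20 categories.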
Value
civ returns an object of S3 class civ. An object of
class civ is a list containing the following components:
- coef
 A vector of second-stage coefficient estimates.
- iv_fit
 Object of class ivreg from the IV regression of y on D and X using the estimated \(\hat{F}_K\) as an instrument for D. See also AER::ivreg() for details.
- kcmeans_fit
 Object of class kcmeans from the K-Conditional-Means regression of D on Z and X. See also kcmeans::kcmeans() for details.
- K
 Pass-through of selected user-provided arguments. See above.
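To make the second-stage regression behind iv_fit more concrete: with one endogenous variable, one instrument, and no controls, the IV slope reduces to the ratio of the covariance between instrument and outcome to the covariance between instrument and endogenous variable. The base R sketch below computes that ratio and checks it against two explicit least-squares stages. It does not call civ(); the latent binary variable `Z0` is used as a hypothetical stand-in for the estimated instrument, and `beta_iv`, `beta_2sls`, and `beta_ols` are illustrative names.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
nobs <- 800
Z <- sample(1:20, nobs, replace = TRUE)  # observed instrument
Z0 <- Z %% 2                             # stand-in for the estimated instrument (assumption)
U_V <- matrix(rnorm(2 * nobs), nobs, 2) %*%
  chol(matrix(c(1, 0.6, 0.6, 1), 2, 2))  # correlated first- and second-stage errors
D <- Z0 + U_V[, 2]                       # endogenous variable
y <- D + U_V[, 1]                        # outcome variable; the coefficient on D is 1

# Just-identified IV slope as a ratio of covariances
beta_iv <- cov(Z0, y) / cov(Z0, D)

# The same slope from two explicit stages:
# first regress D on the instrument, then regress y on the fitted values
D_hat <- fitted(lm(D ~ Z0))
beta_2sls <- unname(coef(lm(y ~ D_hat))["D_hat"])

# OLS of y on D for comparison; it ignores the correlation between D and the error
beta_ols <- unname(coef(lm(y ~ D))["D"])

c(iv = beta_iv, two_stage = beta_2sls, ols = beta_ols)
```

The two-stage point estimate matches the covariance ratio, but the standard errors from the manual second stage are not valid, which is one reason to rely on a dedicated routine such as AER::ivreg() for inference.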
References
Fox J, Kleiber C, Zeileis A (2023). "ivreg: Instrumental-Variables Regression by '2SLS', '2SM', or '2SMM', with Diagnostics". R package.
Wiemann T (2023). "Optimal Categorical Instruments." https://arxiv.org/abs/2311.17021
Examples
# Simulate data from a simple IV model with 800 observations
nobs <- 800 # sample size
Z <- sample(1:20, nobs, replace = TRUE) # observed instrument
Z0 <- Z %% 2 # underlying latent instrument
U_V <- matrix(rnorm(2 * nobs, 0, 1), nobs, 2) %*%
  chol(matrix(c(1, 0.6, 0.6, 1), 2, 2)) # first and second stage errors
D <- Z0 + U_V[, 2] # endogenous variable
y <- D + U_V[, 1] # outcome variable
# Fit the categorical instrumental variable estimator with K = 3
civ_fit <- civ(y, D, Z, K = 3)
summary(civ_fit)
#> 
#> Call:
#> AER::ivreg(formula = y ~ D | m_hat)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -2.713529 -0.696460 -0.005709  0.664267  3.179906 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) -0.04118    0.04866  -0.846    0.398    
#> D            0.98494    0.07025  14.020   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.9855 on 798 degrees of freedom
#> Multiple R-Squared:   0.7,	Adjusted R-squared: 0.6996 
#> Wald test: 196.6 on 1 and 798 DF,  p-value: < 2.2e-16 
#>