
Prediction method for the K-Conditional-Means estimator.

Usage

# S3 method for kcmeans
predict(object, newdata, clusters = FALSE, ...)

Arguments

object

An object of class kcmeans.

newdata

A (sparse) feature matrix where the first column corresponds to the categorical predictor.

clusters

A logical value indicating whether the estimated cluster assignments should be returned instead of the predicted values.

...

Currently unused.

Value

A numerical vector with predicted values (if clusters = FALSE) or predicted clusters (if clusters = TRUE).
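
To make the two return modes concrete, here is a minimal base-R sketch of a lookup-style prediction. It is a conceptual illustration under stated assumptions, not the package's implementation: `stats::kmeans` stands in for whatever clustering routine kcmeans uses internally, additional features beyond the categorical predictor are ignored, and `predict_sketch`, `cat_means`, and `cluster_means` are hypothetical names introduced only for this example.

```r
# Conceptual sketch only: NOT the kcmeans implementation.
set.seed(42)
n <- 800
Z <- sample(1:20, n, replace = TRUE)  # observed categorical predictor
y <- (Z %% 4) + rnorm(n)              # outcome driven by a 4-valued latent category

# Step 1: mean of y within each observed category of Z (named by category)
cat_means <- vapply(split(y, Z), mean, numeric(1))

# Step 2: group the categories into K clusters by their means
# (assumption: stats::kmeans is used here purely as a stand-in)
K <- 4
km <- kmeans(cat_means, centers = K, nstart = 25)

# Step 3: mean of y within each estimated cluster (named "1", ..., "K")
obs_cluster <- unname(km$cluster[as.character(Z)])
cluster_means <- vapply(split(y, obs_cluster), mean, numeric(1))

# Hypothetical predict-like helper: map each category to its cluster,
# then return either the cluster label or the cluster-level mean of y
predict_sketch <- function(newZ, clusters = FALSE) {
  cl <- unname(km$cluster[as.character(newZ)])
  if (clusters) cl else unname(cluster_means[as.character(cl)])
}

newZ <- c(1, 5, 8, 20)
predict_sketch(newZ)                   # numeric predictions
predict_sketch(newZ, clusters = TRUE)  # integer cluster labels
```

Cluster labels produced this way are arbitrary integers in 1:K, so they identify groups of categories rather than any ordering of them. Categories absent from the fitting data would yield NA in this sketch.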

References

Wiemann T (2023). "Optimal Categorical Instruments." https://arxiv.org/abs/2311.17021

Examples

# Simulate a simple dataset with n = 800 observations
X <- rnorm(800) # continuous predictor
Z <- sample(1:20, 800, replace = TRUE) # categorical predictor
Z0 <- Z %% 4 # lower-dimensional latent categorical variable
y <- Z0 + X + rnorm(800) # outcome
# Compute kcmeans with four support points
kcmeans_fit <- kcmeans(y, cbind(Z, X), K = 4)
# Calculate in-sample predictions
fitted_values <- predict(kcmeans_fit, cbind(Z, X))
# Print the number of observations in each estimated cluster
clusters <- predict(kcmeans_fit, cbind(Z, X), clusters = TRUE)
table(clusters)
#> clusters
#>   1   2   3   4 
#> 195 215 199 191