Constructing a new ML Wrapper
Source: vignettes/articles/example_keras_wrapper.Rmd
Introduction
Load the keras package.
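For completeness, the snippet below loads keras together with ddml; loading ddml here is an assumption, motivated by the ddml_plm calls and the BLP_1995 data used later in this article.

library(ddml)  # assumed: provides ddml_plm and the BLP_1995 data used below
library(keras)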
Construct the keras wrapper. The prediction method for keras already outputs a numeric vector, so no elaborate wrapper for the prediction method is needed: the predict.mdl_keras method below is a thin pass-through that strips the amended class and dispatches to keras's own predict method.
The first part of the wrapper constructs the neural network architecture. Below is a simple example using the relu activation function and optional ℓ1 regularization in the hidden and output layers. This part of the wrapper should be adjusted for more complicated architectures.
mdl_keras <- function(y, X,
                      units = 10, nhidden = 1, lambda1 = 0, lambda2 = 0,
                      optimizer_fun = "rmsprop",
                      loss = "mse",
                      epochs = 10,
                      batch_size = min(1000, length(y)),
                      validation_split = 0,
                      callbacks = NULL,
                      steps_per_epoch = NULL,
                      metrics = c("mae"),
                      verbose = 0) {
  # ============================================================================
  # ADJUST FOR DIFFERENT ARCHITECTURES =========================================
  # Construct neural network architecture
  nnet <- keras_model_sequential()
  for (k in 1:nhidden) {
    nnet <- nnet %>%
      layer_dense(units = units, use_bias = TRUE,
                  activation = "relu",
                  kernel_regularizer = regularizer_l1(l = lambda1))
  }#FOR
  nnet <- nnet %>%
    layer_dense(units = 1, use_bias = TRUE,
                kernel_regularizer = regularizer_l1(l = lambda2))
  # ============================================================================
  # ADJUST FOR DIFFERENT ARCHITECTURES =========================================
  # Compile model
  nnet %>% keras::compile(optimizer = optimizer_fun,
                          loss = loss,
                          metrics = metrics)
  # Fit neural net
  nnet %>% keras::fit(X, y,
                      epochs = epochs,
                      batch_size = batch_size,
                      validation_split = validation_split,
                      callbacks = callbacks,
                      steps_per_epoch = steps_per_epoch,
                      verbose = verbose)
  # Amend class
  class(nnet) <- c("mdl_keras", class(nnet))
  # Return fit
  return(nnet)
}#MDL_KERAS
predict.mdl_keras <- function(obj, newdata = NULL){
  # Strip the amended class so predict() dispatches to keras's own method
  class(obj) <- class(obj)[-1]
  # Predict on the new data and return a numeric vector
  as.numeric(predict(obj, newdata))
}#PREDICT.MDL_KERAS
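Before plugging the wrapper into ddml, it is useful to sanity-check it on simulated data. The snippet below is a minimal sketch; the simulated variables (X_sim, y_sim, fit_sim) are illustrative and not part of the package.

# Simulate a small regression problem
set.seed(123)
X_sim <- matrix(rnorm(200), 100, 2)
y_sim <- as.numeric(X_sim %*% c(1, -1) + rnorm(100))
# Fit the wrapper and check that predictions return as a numeric vector
fit_sim <- mdl_keras(y_sim, X_sim, epochs = 5)
head(predict(fit_sim, X_sim))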
Let’s test this on a simple empirical example. I consider the BLP_1995 data and, for simplicity, estimate a partially linear model (rather than a partially linear IV model).
y <- log(BLP_1995$share) - log(BLP_1995$outshr)
D <- BLP_1995$price
X <- cbind(BLP_1995$space, BLP_1995$hpwt)
Next, we need to specify the particular neural network learner we want to use. In addition to the architecture, it’s important to properly tune the optimization algorithm. callbacks_list below helps with specifying learning-rate adjustments and defines the early stopping rule.
callbacks_list <- list(callback_early_stopping(monitor = "val_loss",
                                               patience = 15,
                                               restore_best_weights = TRUE),
                       callback_reduce_lr_on_plateau(monitor = "val_loss",
                                                     factor = 1/10,
                                                     patience = 10,
                                                     verbose = FALSE),
                       callback_learning_rate_scheduler(
                         function(epoch, learning_rate){
                           if(epoch == 0) learning_rate <- 0.1
                           return(learning_rate)
                         }))
learners <- list(what = mdl_keras,
                 args = list(units = 10, nhidden = 1, lambda1 = 0, lambda2 = 0,
                             epochs = 50,
                             verbose = FALSE,
                             validation_split = 0.1,
                             callbacks = callbacks_list))
We can now run the ddml estimator. Note that if verbose is set to TRUE in the above list of arguments, live training output is automatically plotted. This is very helpful for assessing whether the chosen training parameters are sensible. Since it substantially slows down the training process, however, verbose should be set to FALSE (as done here) whenever you’re not planning to actively monitor the output.
plm_fit <- ddml_plm(y, D, X, learners = learners)
#> DDML estimation in progress.
#>
#> E[Y|X]: sample fold 1/2
#> E[Y|X]: sample fold 2/2 -- Done!
#>
#> E[D|X]: sample fold 1/2
#> E[D|X]: sample fold 2/2 -- Done!
#> DDML estimation completed.
plm_fit$coef
#>      D_r
#> 1.632659
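As an aside, inference on the coefficient can be conducted with the summary method for ddml_plm objects, which reports the point estimate alongside its standard error (output omitted here):

# Point estimate with standard error and t-statistic
summary(plm_fit)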
ddml also supports combining several base learners via (short-)stacking. Below, we specify two neural network architectures of different depth and width:
learners <- list(list(fun = mdl_keras,
                      args = list(units = 10, nhidden = 1, lambda1 = 0, lambda2 = 0,
                                  epochs = 50,
                                  verbose = FALSE,
                                  validation_split = 0.1,
                                  callbacks = callbacks_list)),
                 list(fun = mdl_keras,
                      args = list(units = 5, nhidden = 2, lambda1 = 0, lambda2 = 0,
                                  epochs = 50,
                                  verbose = FALSE,
                                  validation_split = 0.1,
                                  callbacks = callbacks_list)))
plm_fit <- ddml_plm(y, D, X, learners = learners, shortstack = TRUE)
#> DDML estimation in progress.
#>
#> E[Y|X]: sample fold 1/2
#> E[Y|X]: sample fold 2/2 -- Done!
#>
#> E[D|X]: sample fold 1/2
#> E[D|X]: sample fold 2/2 -- Done!
#> DDML estimation completed.
plm_fit$coef
#>       D_r
#> 0.8318073