
ddml is an implementation of double/debiased machine learning estimators as proposed by Chernozhukov et al. (2018). The key feature of ddml is the straightforward estimation of nuisance parameters via (short-)stacking (Wolpert, 1992), which combines multiple machine learners to increase robustness to the underlying data generating process. See also Ahrens et al. (2024) for a detailed illustration of the practical benefits of combining DDML with (short-)stacking.

ddml is the sister R package to our Stata package, mirroring its key features while also leveraging R to simplify estimation with user-provided machine learners and/or sparse matrices. See also Ahrens et al. (2023) for additional discussion of the supported causal models and the benefits of (short-)stacking.

Installation

Install the latest development version from GitHub (requires the devtools package):

if (!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("thomaswiemann/ddml", dependencies = TRUE)

Install the latest public release from CRAN:
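install.packages("ddml")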

Example: LATE Estimation based on (Short-)Stacking

To illustrate ddml on a simple example, consider the included random subsample of 5,000 observations from the data of Angrist & Evans (1998). The data contain information on mothers' labor supply, their children, and demographic characteristics. See ?AE98 for details.

# Load ddml and set seed
library(ddml)
set.seed(75523)

# Construct variables from the included Angrist & Evans (1998) data
y = AE98[, "worked"]
D = AE98[, "morekids"]
Z = AE98[, "samesex"]
X = AE98[, c("age","agefst","black","hisp","othrace","educ")]

ddml_late estimates the local average treatment effect (LATE) using double/debiased machine learning (see ?ddml_late). Since the statistical properties of machine learners depend heavily on the underlying (unknown!) structure of the data, adaptively combining multiple machine learners can increase robustness. In the snippet below, ddml_late estimates the LATE with short-stacking based on three base learners:

# Estimate the local average treatment effect using short-stacking with three
#     base learners: linear regression (ols), lasso (mdl_glmnet), and
#     gradient-boosted trees (mdl_xgboost).
late_fit_short <- ddml_late(y, D, Z, X,
                            learners = list(list(fun = ols),
                                            list(fun = mdl_glmnet),
                                            list(fun = mdl_xgboost,
                                                 args = list(nrounds = 100,
                                                             max_depth = 1))),
                            ensemble_type = 'nnls1',
                            shortstack = TRUE,
                            sample_folds = 10,
                            silent = TRUE)
summary(late_fit_short)
#> LATE estimation results: 
#>  
#>         Estimate Std. Error   t value  Pr(>|t|)
#> nnls1 -0.2105019   0.195529 -1.076576 0.2816698
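To gauge how short-stacking combines the three base learners, it can be informative to inspect the estimated ensemble weights. A minimal sketch, assuming the fitted object stores the per-nuisance-parameter weights in a weights field (see ?ddml_late for the authoritative return structure):

# Inspect the estimated short-stacking weights assigned to each base learner
#     for each nuisance parameter. (The `weights` field is an assumption here;
#     see ?ddml_late.)
late_fit_short$weights

A weight close to one on a single learner indicates that short-stacking effectively selects that learner for the corresponding nuisance parameter, while more even weights indicate a genuine combination of learners.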

Learn More about ddml

Check out our articles to learn more.

For additional applied examples, see our case studies.

Other Double/Debiased Machine Learning Packages

ddml is built to easily (and quickly) estimate common causal parameters with multiple machine learners. With its support for short-stacking, sparse matrices, and an easy-to-learn syntax, we hope ddml is a useful complement to DoubleML, an expansive R and Python package that supports many advanced features such as multiway clustering and stacking.

References

Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2023). “ddml: Double/debiased machine learning in Stata.” https://arxiv.org/abs/2301.09397

Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2024). “Model averaging and double machine learning.” https://arxiv.org/abs/2401.01645

Angrist J, Evans W (1998). “Children and Their Parents’ Labor Supply: Evidence from Exogenous Variation in Family Size.” American Economic Review, 88(3), 450-477.

Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C B, Newey W, Robins J (2018). “Double/debiased machine learning for treatment and structural parameters.” The Econometrics Journal, 21(1), C1-C68.

Wolpert D H (1992). “Stacked generalization.” Neural Networks, 5(2), 241-259.