Perform telescope matching to estimate the controlled direct effect of a binary treatment net of the effect of binary mediators

Usage

telescope_match(
  formula,
  data,
  caliper = NULL,
  L = 5,
  verbose = TRUE,
  subset,
  contrasts = NULL,
  separate_bc = TRUE,
  ...
)

Arguments

formula

A formula object that specifies the covariates and treatment variables (or mediators) in causal ordering from oldest to newest with each group separated by |. See below for more details.

data

A dataframe containing variables referenced by formula.

caliper

A scalar denoting the caliper to be used in matching in the treatment stage (calipers cannot be used for matching on the mediator). Observations outside of the caliper are dropped. Calipers are specified in standard deviations of the covariates. NULL by default (no caliper).

L

Number of matches to use for each unit. Must be a numeric vector of length 1 or 2. If length 1, L sets the number of matches used in both the first stage (matching on mediator) and the second stage (matching on treatment). If length 2, the first element sets the number of matches used in the first stage (matching on mediator) and the second element sets the number of matches used in the second stage (matching on treatment). Default is 5.

verbose

logical indicating whether to display progress information. Default is TRUE.

subset

A vector of logicals indicating which rows of data to keep.

contrasts

a list to be passed to the contrasts.arg argument of model.matrix() when generating the data matrix.

separate_bc

logical indicating whether or not bias correction regressions should be run separately within levels of the treatment and mediator. Defaults to TRUE. If TRUE, any interactions between treatment/mediator and covariates in the specification should be omitted.

...

Additional arguments to be passed, via lm(), to the low-level regression fitting functions.
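
The length-1 versus length-2 rule for L can be sketched in a few lines of base R. The helper name expand_L is hypothetical and is not part of the package; it only illustrates the rule as documented above, under the assumption that a single value is simply reused for both stages.

```r
# Hypothetical helper (not a package function): expand L to one value per stage
expand_L <- function(L) {
  # L must be numeric and of length 1 or 2, as documented
  stopifnot(is.numeric(L), length(L) %in% c(1, 2))
  # A single value is reused for both stages
  L <- rep_len(L, 2)
  c(mediator_stage = L[1], treatment_stage = L[2])
}

L_same <- expand_L(5)        # same ratio in both stages
L_diff <- expand_L(c(3, 5))  # 3 matches on the mediator, 5 on the treatment
```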

Value

Returns an object of class tmatch. Contains the following components:

  • call: the matched call.

  • formula: formula used to fit the model.

  • m_out: list of matching solutions at each time point. Each member of the list has a `matches` list giving the units matched to that unit, a `donors` list with the units to which the unit is matched, and a `tr` vector which is just the treatment vector being matched.

  • K: data.frame indicating how many times a unit has been used as a match, directly in each period and indirectly across periods.

  • L: vector of matching ratios used in each period.

  • r_out: nested list of regression imputations used in the bias correction. The first level of the list varies across different controlled direct effects (different sequences of future treatments/mediators). Each of these is a list of time periods and each of these time periods is a list of `yhat_r_0` and `yhat_r_1` that give the regression predictions for the potential outcomes at that time point when the treatment at that time point is 0 or 1, respectively, along with `n_coefs` giving the number of coefficients estimated in those models.

  • tau: vector of bias-corrected estimates of average controlled direct effects for different vectors of future treatments/mediators.

  • tau_raw: vector of standard matching estimates of average controlled direct effects for different vectors of future treatments/mediators without using bias correction.

  • tau_se: vector of estimated standard errors for the average controlled direct effects estimates for different vectors of future treatments/mediators.

  • tau_i: matrix of individual units' contributions to the ACDE estimates (units on rows, different ACDEs on columns). Used for weighted bootstrap.

  • included: logical vector indicating if each row of data was included in estimating tau.

  • effects: data frame where each row describes one of the ACDEs in tau. The active column describes which variable's direct effect is being assessed and the rest of the columns describe the fixed values of the future treatments/mediators for that ACDE.

  • a_names: character vector with the names of the treatment/mediator variables used in estimation.

  • caliper: caliper (if any) used in matching to drop distant observations.

Details

The telescope_match function implements the two-stage "telescope matching" procedure developed by Blackwell and Strezhnev (2021).

The procedure first estimates a demediated outcome using a combination of matching and a regression bias-correction. The data.frame passed to data should be in the wide format so that each row corresponds to a single unit and treatments and covariates from different time periods appear as different columns. The formula argument specifies both the causal ordering of the variables and the regression specifications for the bias correction. It should be of the form Y ~ X1 | A1 | X2 | A2, where Y is the outcome, X1 is a formula of baseline covariates, A1 is a single variable name indicating the binary treatment in the first period, X2 is a formula of covariates in period 2, and A2 is a single variable name indicating treatment in period 2 (which is also sometimes called the mediator). Note that it is possible to add more covariate/treatment pairs for additional time periods.
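
To make the block structure of the formula concrete, the sketch below splits a formula of this form into its covariate blocks and treatment names using base R only. The variable names (y, x1, a1, and so on) are hypothetical, and this is not how the package itself parses the formula; it is only meant to show which piece plays which role.

```r
# Hypothetical variable names; the formula is only parsed, never evaluated
f <- y ~ x1 + x2 | a1 | z1 + z2 | a2

# Deparse the right-hand side and split it on "|" to recover the blocks
rhs <- paste(deparse(f[[3]]), collapse = " ")
blocks <- trimws(strsplit(rhs, "|", fixed = TRUE)[[1]])

# Odd-numbered blocks are covariate specifications (X1, X2, ...),
# even-numbered blocks are single treatment/mediator names (A1, A2, ...)
covariate_blocks <- blocks[c(TRUE, FALSE)]
treatment_vars <- blocks[c(FALSE, TRUE)]
```

Here covariate_blocks holds the period-specific covariate specifications in causal order and treatment_vars holds the treatment followed by the mediator.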

Under the default separate_bc == TRUE, the function will match for each treatment/mediator based on the covariates up to that point within levels of past treatments (so for A2 this matching finds units with similar values of X1 and X2 and the same value of A1). Once this matching is complete, the function moves backward through treatments and imputes potential outcomes using matches and bias-correction regressions, which regress the current imputed potential outcome on the past covariates, within levels of the treatment history up to the current period. The functional form comes from the specification in formula. Controlled direct effects of A1 are estimated for every possible combination of future treatments.

When separate_bc is FALSE, the bias correction regressions are not broken out by the treatments/mediators and those variables are simply included as separate regressors as specified in formula. In this setting, interactions between the treatment/mediator and covariates can be added on a selective basis to the covariate block (X1 or X2 and so on) specifications.
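
The difference between the two settings of separate_bc can be illustrated with a simplified, single-period sketch on simulated data. This is an assumption-laden toy, not the package's internal code: it regresses an observed outcome rather than an imputed potential outcome, and the names d, x, a, and y are hypothetical.

```r
set.seed(2)
n <- 200
# Simulated data: one covariate x, one binary treatment a, outcome y
d <- data.frame(x = rnorm(n), a = rbinom(n, 1, 0.5))
d$y <- 1 + 0.5 * d$x + d$a + 0.3 * d$a * d$x + rnorm(n)

# Analogue of separate_bc = TRUE: one regression per treatment level,
# so no treatment-by-covariate interactions are needed in the specification
fits_sep <- lapply(split(d, d$a), function(s) lm(y ~ x, data = s))
yhat_sep_0 <- predict(fits_sep[["0"]], newdata = d)
yhat_sep_1 <- predict(fits_sep[["1"]], newdata = d)

# Analogue of separate_bc = FALSE: one pooled regression with the treatment
# entered as a regressor; interactions would have to be added explicitly
fit_pool <- lm(y ~ x + a, data = d)
yhat_pool_0 <- predict(fit_pool, newdata = transform(d, a = 0))
yhat_pool_1 <- predict(fit_pool, newdata = transform(d, a = 1))
```

In the pooled fit without interactions, the predicted difference between treatment levels is the same for every unit, whereas the separate fits let the slope on x differ by treatment level.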

Matching is performed using the Match() routine from the Matching package. By default, matching is L-to-1 nearest neighbor with replacement using Mahalanobis distance.
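
As a rough illustration of L-to-1 nearest-neighbor matching with replacement on Mahalanobis distance, the sketch below uses only base R and simulated data. It does not call Match() and is not the package's implementation; it matches in one direction only (treated units to controls), and all object names are hypothetical.

```r
set.seed(1)
n <- 20
# Simulated covariates and a binary treatment indicator
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
tr <- rep(c(1, 0), each = n / 2)
L <- 3

S <- cov(X)                 # covariance used in the Mahalanobis metric
treated <- which(tr == 1)
controls <- which(tr == 0)

# For each treated unit, keep the L closest controls. The same control can
# be selected for several treated units, which is matching with replacement.
matches <- lapply(treated, function(i) {
  d2 <- mahalanobis(X[controls, , drop = FALSE], center = X[i, ], cov = S)
  controls[order(d2)[seq_len(L)]]
})

# Number of times each control unit is used as a match
times_used <- table(factor(unlist(matches), levels = controls))
```

mahalanobis() returns squared distances, which preserves the nearest-neighbor ordering.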

See the references below for more details.

References

Blackwell, Matthew, and Strezhnev, Anton (2020) "Telescope Matching: Reducing Model Dependence in the Estimation of Direct Effects." Journal of the Royal Statistical Society (Series A). doi:10.1111/rssa.12759

Examples

data(jobcorps)

## Subset to female respondents
jobcorps_female <- subset(jobcorps, female == 1)

## Telescope matching formula - First stage (X and Z)
tm_form <- exhealth30 ~ schobef + trainyrbef + jobeverbef |
  treat | emplq4 + emplq4full | work2year2q


### Estimate ACDE for women holding employment at 0
tm_out <-  telescope_match(
  tm_form,
  data = jobcorps_female,
  L = 3,
  verbose = TRUE
)
#> Beginning matching...
#> Matching work2year2q...
#> Matching treat...
#> Beginning bias correction...