Covariate Selection for Generalizing Experimental Results
Researchers are often interested in generalizing the average treatment effect (ATE) estimated in a randomized experiment to non-experimental target populations. They can estimate the population ATE without bias if they adjust for a set of variables affecting both selection into the experiment and treatment heterogeneity. Although this separating set has a simple mathematical representation, it is often unclear how to select it in applied contexts. In this paper, we propose a data-driven method to estimate a separating set. Our approach has two advantages. First, our algorithm relies only on the experimental data. As long as researchers can collect a rich set of covariates on experimental samples, the proposed method can indicate which variables they should adjust for. Second, we can incorporate researcher-specific data constraints. When researchers know that certain variables are unmeasurable in the target population, our method can select a separating set subject to such constraints, if one is feasible. We validate our proposed method using simulations, including naturalistic simulations based on real-world data.
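To make the role of a separating set concrete, the following is a minimal sketch and not the selection algorithm proposed in the paper. It assumes a single binary covariate `x` that drives both selection into the experiment and the size of the treatment effect, and it assumes the population share of `x` is known. All names, parameter values, and the data-generating process are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_experiment(n=20000, p_x_sample=0.8, seed=0):
    """Simulate an experiment whose sample over-represents x = 1 (assumed setup)."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, p_x_sample, size=n)   # covariate, skewed by selection
    t = rng.binomial(1, 0.5, size=n)          # randomized treatment
    tau = 1.0 + 2.0 * x                       # effect differs by x (heterogeneity)
    y = 0.5 * x + tau * t + rng.normal(0.0, 1.0, size=n)
    return x, t, y

def difference_in_means(t, y):
    """Unadjusted ATE estimate: treated mean minus control mean."""
    return y[t == 1].mean() - y[t == 0].mean()

def post_stratified_ate(x, t, y, pop_share):
    """Reweight stratum-specific effects by the target population shares of x."""
    estimate = 0.0
    for level, share in pop_share.items():
        in_stratum = x == level
        estimate += share * difference_in_means(t[in_stratum], y[in_stratum])
    return estimate

x, t, y = simulate_experiment()

# Assumed target population: x = 1 for half of the units, so the
# population ATE is 1 + 2 * 0.5 = 2.0 by construction, while the
# experimental sample ATE is 1 + 2 * 0.8 = 2.6.
naive = difference_in_means(t, y)
adjusted = post_stratified_ate(x, t, y, {0: 0.5, 1: 0.5})

print(f"unadjusted estimate: {naive:.2f}")
print(f"adjusted estimate:   {adjusted:.2f}")
```

In this toy setup the unadjusted estimate tracks the sample ATE, while reweighting by the population shares of `x` moves it toward the population ATE. Here `x` is known in advance to be the variable that matters; the method described above addresses the harder problem of deciding which covariates to adjust for when that is not known.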
Co-Sponsored with The Center for Social Statistics