Publications


This page provides information about publications related to the generalizability of causal research. For non-technical overviews, researchers may want to first read one or both of the following publications:

Click here for a full annotated bibliography of the articles in each category. Click here for a description of how we identified relevant articles.

All publications have been categorized based on their content, and some articles fall into two or more categories. See below for a description of each category and two or three example articles from it.

Overviews or Conceptual Frameworks - These papers develop conceptual frameworks for generalizability, use those frameworks to derive measures of generalizability, and/or summarize a range of methods for improving a study’s generalizability. Some examples include:

Empirical Evidence on Generalizability - These papers assess the generalizability of one or more studies by comparing the study sample with some population, either on their characteristics or on their average impacts. Papers that focus on sample characteristics include:

Papers that examine impacts directly include:

Sample Selection Methods - These papers develop or test the performance of sampling methods (e.g., random sampling, balanced sampling) for obtaining representative samples of the population for impact studies. Examples include:

Selection Modelling Methods - These papers develop or test the performance of methods (e.g., propensity score methods) for modelling selection into the study sample and (often) reweighting a study sample to resemble the study’s population. Examples include:

Outcome Modelling Methods - These papers develop or test the performance of regression methods to model outcomes and use the model to predict the population average treatment effect. Examples include:

Doubly Robust Methods - These papers develop or test the performance of methods that combine selection modelling and outcome modelling to make estimates of the population effect more robust to violations of model assumptions. Examples include:

Sensitivity Analysis Methods - These papers develop or test the performance of methods for testing the assumptions needed for generalizability, assessing the sensitivity of the findings to those assumptions, or estimating bounds for the population average effect based on weaker assumptions. Examples include:

Transparency in Reporting - These papers recommend information that impact studies should report or assess the extent to which studies report information needed to assess their generalizability. Examples include:
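
Separately from the reading lists, the short sketch below may help readers who are new to these categories see how selection modelling and outcome modelling differ in practice. It is not taken from any of the listed papers. The data, the stratum names, and the function names are all hypothetical, and a single binary covariate stands in for the richer propensity score and regression models those papers use. Doubly robust estimators, which combine the two approaches, are not shown.

```python
from collections import defaultdict

# Hypothetical trial records: (stratum, treated, outcome).
# "urban" sites make up 80% of the sample but only 50% of the population.
sample = (
    [("urban", 1, 12.0)] * 4 + [("urban", 0, 10.0)] * 4
    + [("rural", 1, 16.0)] * 1 + [("rural", 0, 10.0)] * 1
)
population_share = {"urban": 0.5, "rural": 0.5}  # assumed known


def mean(values):
    return sum(values) / len(values)


def sample_ate(records):
    """Unadjusted difference in mean outcomes (sample average effect)."""
    treated = [y for _, t, y in records if t == 1]
    control = [y for _, t, y in records if t == 0]
    return mean(treated) - mean(control)


def weighted_ate(records, pop_share):
    """Selection modelling: reweight units so the sample resembles the
    population. With one discrete covariate, the weight is simply
    population share / sample share for the unit's stratum."""
    n = len(records)
    counts = defaultdict(int)
    for s, _, _ in records:
        counts[s] += 1
    weight = {s: pop_share[s] / (counts[s] / n) for s in counts}

    def weighted_mean(arm):
        num = sum(weight[s] * y for s, t, y in records if t == arm)
        den = sum(weight[s] for s, t, _ in records if t == arm)
        return num / den

    return weighted_mean(1) - weighted_mean(0)


def outcome_model_ate(records, pop_share):
    """Outcome modelling: estimate the effect within each stratum (a
    saturated model), then average the predictions over the population."""
    by_stratum = defaultdict(list)
    for record in records:
        by_stratum[record[0]].append(record)
    return sum(pop_share[s] * sample_ate(rs) for s, rs in by_stratum.items())


print(f"{sample_ate(sample):.1f}")                           # 2.8
print(f"{weighted_ate(sample, population_share):.1f}")       # 4.0
print(f"{outcome_model_ate(sample, population_share):.1f}")  # 4.0
```

In this toy example the two adjusted estimates agree because one binary covariate fully determines both selection and effect heterogeneity. With several covariates or continuous ones, the two approaches rely on different modelling assumptions and can give different answers.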

Disclaimer: This list of publications did not result from a systematic review. We compiled papers that we already knew of, searched several databases to identify additional papers, and screened those papers for relevance. See here for more details on how we assembled these publications and classified them based on what they offer.

Have questions about the literature? Or suggestions for papers we should add? Please contact Rob Olsen (robolsen@gwu.edu) or Elizabeth Stuart (estuart@jhu.edu).