Selection Modelling Methods: Applications


Bonander, C., Nilsson, A., Bergström, G. M. L., Björk, J., & Strömberg, U. (2021). Correcting for selective participation in cohort studies using auxiliary register data without identification of non-participants. Scandinavian Journal of Public Health, 49(4), 449–456.

  • Aims: Selective participation may hamper the validity of population-based cohort studies. The resulting bias can be alleviated by linking auxiliary register data to both the participants and the non-participants of the study, estimating propensity scores for participation and correcting for participation based on these. However, registry holders may not be allowed to disclose sensitive data on (invited) non-participants. Our aim is to provide guidance on how adequate bias correction can be achieved by using auxiliary register data but without disclosing information that could be linked to the subset of non-participants.
  • Methods: We show how existing methods can be used to estimate generalisation weights under various data disclosure scenarios where invited non-participants are indistinguishable from uninvited ones. We also demonstrate how the methods can be implemented using Nordic register data.
  • Results: Inverse-probability-of-sampling weights estimated within a random sample of the target population in which the non-respondents are disclosed are equivalent in expectation to analogous weights in a scenario where the non-participants and uninvited individuals from the population are indistinguishable. To minimise the risk of disclosure when the entire population is invited to participate, investigators should instead consider inverse-odds-of-sampling weights, a method that has previously been suggested for transporting study results to external populations.
  • Conclusions: Generalisation weights can be estimated from auxiliary register data without disclosing information on invited non-participants.
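
The two weighting schemes named in this abstract differ only in how an estimated participation probability is turned into a weight. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the probabilities `p` are assumed to have been estimated beforehand (for example by logistic regression on register covariates).

```python
import numpy as np

def inverse_probability_weights(p):
    """Weight 1/p for each participant.

    p: estimated probability of participating, given covariates.
    """
    p = np.asarray(p, dtype=float)
    if np.any((p <= 0) | (p > 1)):
        raise ValueError("probabilities must lie in (0, 1]")
    return 1.0 / p

def inverse_odds_weights(p):
    """Weight (1 - p)/p for each participant.

    p: estimated probability of belonging to the study sample rather than
    the comparison sample, given covariates.
    """
    p = np.asarray(p, dtype=float)
    if np.any((p <= 0) | (p >= 1)):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return (1.0 - p) / p

# Illustrative values only (not taken from the paper).
p_hat = [0.5, 0.25, 0.1]
print(inverse_probability_weights(p_hat))  # 1/p for each participant
print(inverse_odds_weights(p_hat))         # (1 - p)/p for each participant
```

In both schemes, participants who resemble under-represented groups (small `p`) receive larger weights; the inverse-odds form is always exactly one unit smaller than the inverse-probability form for the same `p`.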

Bonander, C., Nilsson, A., Björk, J., Bergström, G. M. L., & Strömberg, U. (2019). Participation weighting based on sociodemographic register data improved external validity in a population-based cohort study. Journal of Clinical Epidemiology, 108, 54–63.

  • OBJECTIVE: To investigate whether inverse probability of participation weighting (IPPW) using register data on sociodemographic and disease history variables can improve external validity in a cohort study with selective participation.
  • STUDY DESIGN AND SETTING: We fitted various IPPW models by logistic regression using register data for the participants (n = 1,111) and nonparticipants (n = 1,132) of a Swedish cohort study. For each of six diagnostic groups, we then estimated (1) weighted disease prevalence proportions and (2) weighted cross-sectional associations (odds ratios) between sociodemographic variables and disease prevalence. Using register data on the remaining individuals of the entire study population of men and women aged 50-64 years (n = 22,259), we addressed how the choice of variables used for IPPW influenced estimation errors.
  • RESULTS: Disease prevalence proportions were generally underestimated in the absence of IPPW but became markedly closer to population values after IPPW using sociodemographic variables. We found limited evidence of selective participation bias in association estimates, but IPPW improved external validity when bias was present.
  • CONCLUSIONS: IPPW using sociodemographic register data can improve the external validity of disease prevalence estimates in cohort studies with selective participation. The performance of IPPW for association estimates merits further investigations in longitudinal settings and larger cohorts.
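
A weighted prevalence of the kind described in this abstract can be sketched as follows. This is a simplified illustration, not the study's code: the function name `ippw_prevalence` and the toy data are hypothetical, and participation probabilities are estimated as stratum-specific participation rates (what a saturated logistic model on a single categorical variable would return) rather than by the multi-variable logistic regressions the authors fitted.

```python
import numpy as np

def ippw_prevalence(stratum, participated, disease):
    """Inverse-probability-of-participation-weighted disease prevalence.

    stratum:      covariate stratum of every invited person
    participated: 1/True if the person took part, else 0/False
    disease:      disease indicator (only used for participants)
    """
    stratum = np.asarray(stratum)
    participated = np.asarray(participated, dtype=bool)
    disease = np.asarray(disease, dtype=float)

    # Estimated participation probability = participation rate in the stratum.
    p = np.empty(len(stratum))
    for s in np.unique(stratum):
        mask = stratum == s
        p[mask] = participated[mask].mean()

    w = 1.0 / p[participated]   # weights for participants only
    y = disease[participated]
    return np.sum(w * y) / np.sum(w)

# Toy data (hypothetical): stratum 0 participates at 25%, stratum 1 at 75%.
stratum      = [0, 0, 0, 0, 1, 1, 1, 1]
participated = [1, 0, 0, 0, 1, 1, 1, 0]
disease      = [1, np.nan, np.nan, np.nan, 0, 0, 1, np.nan]

unweighted = np.nanmean(np.where(np.array(participated) == 1, disease, np.nan))
weighted = ippw_prevalence(stratum, participated, disease)
print(unweighted, weighted)
```

In this toy example the low-participation stratum has the higher disease prevalence, so the unweighted estimate among participants understates the weighted one, which mirrors the direction of bias the abstract reports.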

Chung, J. W., Bilimoria, K. Y., Stulberg, J. J., Quinn, C. M., & Hedges, L. V. (2018). Estimation of Population Average Treatment Effects in the FIRST Trial: Application of a Propensity Score‐Based Stratification Approach. Health Services Research, 53(4), 2567–2590.

  • Objective/Study Question: To estimate and compare sample average treatment effects (SATE) and population average treatment effects (PATE) of a resident duty hour policy change on patient and resident outcomes using data from the Flexibility in Duty Hour Requirements for Surgical Trainees Trial (“FIRST Trial”).
  • Data Sources/Study Setting: Secondary data from the National Surgical Quality Improvement Program and the FIRST Trial (2014–2015).
  • Study Design: The FIRST Trial was a cluster-randomized pragmatic noninferiority trial designed to evaluate the effects of a resident work hour policy change to permit greater flexibility in scheduling on patient and resident outcomes. We estimated hierarchical logistic regression models to estimate the SATE of a policy change on outcomes within an intent-to-treat framework. Propensity score-based poststratification was used to estimate PATE.
  • Data Collection/Extraction Methods: This study was a secondary analysis of previously collected data.
  • Principal Findings: Although SATE estimates suggested noninferiority of outcomes under the flexible duty hour policy versus the standard policy, noninferiority of the policy change was inconclusive based on PATE estimates due to imprecision.
  • Conclusions: Propensity score-based poststratification can be a valuable tool to address trial generalizability but may yield imprecise estimates of PATE when sparse strata exist.
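
The poststratification idea in this abstract can be sketched in a few lines. The version below is a simplified illustration under stated assumptions, not the authors' analysis: the function name and toy data are hypothetical, propensity scores are taken as already estimated, strata are cut at quantiles of the population's scores, and the stratum effect is a plain difference in means rather than a hierarchical logistic regression.

```python
import numpy as np

def poststratified_pate(ps_trial, treat, y, ps_pop, n_strata=5):
    """Population average treatment effect by propensity-score poststratification.

    ps_trial: propensity scores of trial units
    treat:    treatment indicator (1/0) of trial units
    y:        outcomes of trial units
    ps_pop:   propensity scores of target-population units
    """
    ps_trial, treat, y, ps_pop = (
        np.asarray(a, dtype=float) for a in (ps_trial, treat, y, ps_pop)
    )
    # Interior cut points from population quantiles of the propensity score.
    edges = np.quantile(ps_pop, np.linspace(0, 1, n_strata + 1))[1:-1]
    s_trial = np.searchsorted(edges, ps_trial, side="right")
    s_pop = np.searchsorted(edges, ps_pop, side="right")

    pate = 0.0
    for k in range(n_strata):
        treated = (s_trial == k) & (treat == 1)
        control = (s_trial == k) & (treat == 0)
        if not treated.any() or not control.any():
            # Sparse strata are the source of imprecision noted in the abstract.
            raise ValueError(f"stratum {k} lacks treated or control trial units")
        effect_k = y[treated].mean() - y[control].mean()
        pate += np.mean(s_pop == k) * effect_k   # weight by population share
    return pate

# Toy data (hypothetical).
ps_pop   = [0.1, 0.2, 0.3, 0.8]
ps_trial = [0.1, 0.15, 0.6, 0.7, 0.9, 0.95]
treat    = [1, 0, 1, 0, 1, 0]
y        = [3, 1, 5, 4, 6, 5]
print(poststratified_pate(ps_trial, treat, y, ps_pop, n_strata=2))
```

Raising an error on an empty cell is one possible design choice; in practice analysts may instead merge adjacent strata, at the cost of coarser adjustment.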

Inoue, K., Hsu, W., Arah, O. A., Prosper, A. E., Aberle, D. R., & Bui, A. A. T. (2021). Generalizability and Transportability of the National Lung Screening Trial Data: Extending Trial Results to Different Populations. Cancer Epidemiology, Biomarkers & Prevention, 30(12), 2227–2234.

  • BACKGROUND: Randomized controlled trials (RCT) play a central role in evidence-based healthcare. However, the clinical and policy implications of implementing RCTs in clinical practice are difficult to predict as the studied population is often different from the target population where results are being applied. This study illustrates the concepts of generalizability and transportability, demonstrating their utility in interpreting results from the National Lung Screening Trial (NLST).
  • METHODS: Using inverse-odds weighting, we demonstrate how generalizability and transportability techniques can be used to extrapolate treatment effect from (i) a subset of NLST to the entire NLST population and from (ii) the entire NLST to different target populations.
  • RESULTS: Our generalizability analysis revealed that lung cancer mortality reduction by LDCT screening across the entire NLST [16% (95% confidence interval [CI]: 4-24)] could have been estimated using a smaller subset of NLST participants. Using transportability analysis, we showed that populations with a higher prevalence of females and current smokers had a greater reduction in lung cancer mortality with LDCT screening [e.g., 27% (95% CI, 11-37) for the population with 80% females and 80% current smokers] than those with lower prevalence of females and current smokers.
  • CONCLUSIONS: This article illustrates how generalizability and transportability methods extend estimation of RCTs’ utility beyond trial participants, to external populations of interest, including those that more closely mirror real-world populations.
  • IMPACT: Generalizability and transportability approaches can be used to quantify treatment effects for populations of interest, which may be used to design future trials or adjust lung cancer screening eligibility criteria.

Jones, G. T., Jones, E. A., Beasley, M. J., Macfarlane, G. J., & MUSICIAN study team. (2017). Investigating generalizability of results from a randomized controlled trial of the management of chronic widespread pain: The MUSICIAN study. PAIN, 158(1), 96–102.

  • The generalisability of randomised controlled trials will be compromised if markers of treatment outcome also affect trial recruitment. In a large trial of chronic widespread pain, we aimed to determine the extent to which randomised participants represented eligible patients, and whether factors predicting randomisation also influenced trial outcome.
  • Adults from 8 UK general practices were surveyed to determine eligibility for a trial of 2 interventions (exercise and cognitive behavioural therapy [CBT]). Amongst those eligible, logistic regression identified factors associated with reaching the randomisation step in the recruitment process. The main trial analysis was recomputed, weighting for the inverse of the likelihood of reaching the randomisation stage, and the numbers needed to treat were calculated for each treatment. Eight hundred eighty-four persons were identified as eligible for the trial, of whom 442 (50%) were randomised.
  • Several factors were associated with the likelihood of reaching the randomisation stage: higher body mass index (odds ratio: 1.99; 0.85-4.61); more severe/disabling pain (1.90; 1.21-2.97); having a treatment preference (2.11; 1.48-3.00); and expressing positivity about interventions offered (exercise: 2.66; 1.95-3.62; CBT: 3.20; 2.15-4.76).
  • Adjusting for this selection bias decreased the treatment effect associated with exercise and CBT but increased that observed for combined therapy. All were associated with changes in numbers needed to treat. This has important implications for the design and interpretation of pain trials generally.

Susukida, R., Crum, R. M., Ebnesajjad, C., Stuart, E. A., & Mojtabai, R. (2017). Generalizability of findings from randomized controlled trials: Application to the National Institute of Drug Abuse Clinical Trials Network. Addiction, 112(7), 1210–1219.

  • Aims: To compare randomized controlled trial (RCT) sample treatment effects with the population effects of substance use disorder (SUD) treatment.
  • Design: Statistical weighting was used to re-compute the effects from 10 RCTs such that the participants in the trials had characteristics that resembled those of patients in the target populations.
  • Settings: Multi-site RCTs and usual SUD treatment settings in the United States.
  • Participants: A total of 3592 patients in 10 RCTs and 1 602 226 patients from usual SUD treatment settings between 2001 and 2009.
  • Measurements: Three outcomes of SUD treatment were examined: retention, urine toxicology and abstinence. We weighted the RCT sample treatment effects using propensity scores representing the conditional probability of participating in RCTs.
  • Findings: Weighting the samples changed the significance of estimated sample treatment effects. Most commonly, positive effects of trials became statistically non-significant after weighting (three trials for retention and urine toxicology and one trial for abstinence); also, non-significant effects became significantly positive (one trial for abstinence) and significantly negative effects became non-significant (two trials for abstinence). There was suggestive evidence of treatment effect heterogeneity in subgroups that are under- or over-represented in the trials, some of which were consistent with the differences in average treatment effects between weighted and unweighted results.
  • Conclusions: The findings of randomized controlled trials (RCTs) for substance use disorder treatment do not appear to be directly generalizable to target populations when the RCT samples do not reflect adequately the target populations and there is treatment effect heterogeneity across patient subgroups.

Susukida, R., Crum, R. M., Stuart, E. A., Ebnesajjad, C., & Mojtabai, R. (2016). Assessing sample representativeness in randomized controlled trials: Application to the National Institute of Drug Abuse Clinical Trials Network. Addiction, 111(7), 1226–1234.

  • Aims: To compare the characteristics of individuals participating in randomized controlled trials (RCTs) of treatments of substance use disorder (SUD) with individuals receiving treatment in usual care settings, and to provide a summary quantitative measure of differences between characteristics of these two groups of individuals using propensity score methods.
  • Design: Analyses using data from RCT samples from the National Institute of Drug Abuse Clinical Trials Network (CTN) and target populations of patients drawn from the Treatment Episodes Data Set-Admissions (TEDS-A).
  • Settings: Multiple clinical trial sites and nation-wide usual SUD treatment settings in the United States.
  • Participants: A total of 3592 individuals from 10 CTN samples and 1 602 226 individuals selected from TEDS-A between 2001 and 2009.
  • Measurements: The propensity scores for enrolling in the RCTs were computed based on the following nine observable characteristics: sex, race/ethnicity, age, education, employment status, marital status, admission to treatment through criminal justice, intravenous drug use and the number of prior treatments.
  • Findings: The proportion of those with ≥ 12 years of education and the proportion of those who had full-time jobs were significantly higher among RCT samples than among target populations (in seven and nine trials, respectively, at P < 0.001). The pooled difference in the mean propensity scores between the RCTs and the target population was 1.54 standard deviations and was statistically significant at P < 0.001.
  • Conclusions: In the United States, individuals recruited into randomized controlled trials of substance use disorder treatments appear to be very different from individuals receiving treatment in usual care settings. Notably, RCT participants tend to have more years of education and a greater likelihood of full-time work compared with people receiving care in usual care settings.
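
The summary measure reported in the findings, a difference in mean propensity scores expressed in standard deviations, can be computed along the following lines. This is a hedged sketch: the function name is hypothetical, the scores are assumed to be already estimated, and standardising by the pooled standard deviation of the two groups is an assumption made here for illustration; the abstract does not state which standard deviation the authors used.

```python
import numpy as np

def standardized_ps_difference(ps_trial, ps_pop):
    """Difference in mean propensity scores (trial minus population),
    divided by the pooled standard deviation of the two groups.

    The choice of pooled SD is an illustrative assumption.
    """
    ps_trial = np.asarray(ps_trial, dtype=float)
    ps_pop = np.asarray(ps_pop, dtype=float)
    diff = ps_trial.mean() - ps_pop.mean()
    pooled_sd = np.sqrt((ps_trial.var(ddof=1) + ps_pop.var(ddof=1)) / 2.0)
    return diff / pooled_sd

# Toy scores (hypothetical): trial participants have higher scores on average.
print(standardized_ps_difference([0.6, 0.8], [0.2, 0.4]))
```

A value near zero would indicate that trial participants and the target population look alike on the modelled covariates; larger absolute values indicate poorer representativeness.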

Susukida, R., Crum, R. M., Stuart, E. A., & Mojtabai, R. (2018). Generalizability of the findings from a randomized controlled trial of a web-based substance use disorder intervention. American Journal on Addictions, 27(3), 231–237.

  • Background and Objectives: There is growing concern regarding the generalizability of findings from randomized controlled trials (RCTs) of interventions for substance use disorders (SUDs). This study used a selection model approach to assess and improve the generalizability of an evaluation for a web-based SUD intervention by making the trial sample resemble the target population.
  • Methods: The sample of the web-based SUD intervention (Therapeutic Education System vs. Treatment-as-usual; n = 507) was compared with the target population of SUD treatment-seeking individuals from the Treatment Episodes Data Set-Admissions (TEDS-A). Using weights based on the probabilities of RCT participation, we computed weighted treatment effects on retention and abstinence.
  • Results: Substantial differences between the RCT sample and the target population were demonstrated by a significant difference in the mean propensity scores (1.62 standard deviations, p < .001). The population effect on abstinence (12 weeks and 6 months) was not statistically significant after weighting the data with the generalizability weight.
  • Discussions and Conclusions: Generalizability of the findings from the RCT could be limited when the RCT sample does not represent the target population well.
  • Scientific Significance: Application of generalizability weights can be a potentially useful tool to improve generalizability of RCT findings.

Takeshima, H. (2018). Mechanize or exit farming? Multiple‐treatment‐effects model and external validity of adoption impacts of mechanization among Nepalese smallholders. Review of Development Economics, 22(4), 1620–1641.

  • The future of smallholders in developing countries is becoming increasingly uncertain in the face of rising farm wages. The custom‐hiring of tractors, in which tractor owners provide non‐owner farmers with land preparation and transport services for fees, has spread among smallholders in Asia, including Nepal. However, estimating the adoption impacts of agricultural mechanization by smallholders is complex as we must also take into account smallholders’ options to exit farming.
  • We investigate this issue by applying multinomial logit inverse‐probability weighting and sample selection panel data methods to data on smallholders in lowland Nepal. Our results are generally consistent with the hypothesis that smallholders who are likely to benefit more from adopting tractors are also more likely to exit farming. Where smallholders are less likely to exit farming, the use of tractors through custom‐hiring may help smallholders on average to earn greater total and agricultural incomes. However, where they are more likely to exit farming, the ability of custom‐hired tractors to sustain smallholder farming systems may become weaker.
  • The results also offer insights into how the external validity of technology adoption impact evaluation may be affected in some settings.

Wagner, T. H., Holman, W., Lee, K., Sethi, G., Ananth, L., Thai, H., & Goldman, S. (2011). The generalizability of participants in Veterans Affairs Cooperative Studies Program 474, a multi-site randomized cardiac bypass surgery trial. Contemporary Clinical Trials, 32(2), 260–266.

  • Objective: The Department of Veterans Affairs (VA) Cooperative Studies Program (CSP) initiated a multi-site randomized trial (CSP 474) to compare graft patency between radial artery and saphenous vein grafts in coronary artery bypass surgery (CABG). In this paper, we describe the study and compare participants’ baseline characteristics to those of non-participants who received CABG surgery in the VA.
  • Method: We identified our participants in the VA administrative databases along with all other CABG patients who did not have a concomitant valve procedure between FY2003 and FY2008. We extracted demographic, clinical, and organizational information at the time of the surgery from the databases. We conducted multiple logistic regression to determine characteristics associated with participation at three levels: between participants and non-participants within participating sites, between participating sites and non-participating sites, and between participants and all non-participants.
  • Results: Enrollment ended in early 2008. Participants were similar to non-participants across many parameters. Likewise, participating sites were also quite similar to non-participating sites, although participating sites had a higher volume of CABG surgery and a lower percentage of CABG patients with a prior inpatient mental health admission than non-participating sites. After controlling for site differences, CSP 474 participants were younger and had fewer co-morbid conditions than non-participants.
  • Conclusions: Participants were significantly younger than non-participants. Participants also had lower rates of some cardiac-related illnesses, including congestive heart failure, peripheral vascular disease, and cerebrovascular disease, than non-participants.