Background: Randomized clinical trials (RCTs) are designed to produce evidence in selected populations. Assessing their effects in the real world is essential to changing medical practice; however, key populations have historically been underrepresented in RCTs. We define an approach to simulate RCT-based effects in real-world settings using RCT digital twins that reflect the covariate patterns in an electronic health record (EHR).
Methods: We developed a Generative Adversarial Network (GAN) model, TwinRCT-GAN, which generates a digital twin of an RCT (TwinRCT) conditioned on covariate distributions from an EHR cohort. We improved upon a traditional tabular conditional GAN, CTGAN, with a loss function adapted for data distributions and by conditioning on multiple discrete and continuous covariates simultaneously. We assessed the similarity between a Heart Failure with preserved Ejection Fraction (HFpEF) RCT (TOPCAT), a Yale HFpEF EHR cohort, and TwinRCT. We also evaluated cardiovascular event-free survival stratified by spironolactone (treatment) use.
Results: By applying TwinRCT-GAN to 3445 TOPCAT participants and conditioning on 3445 Yale EHR HFpEF patients, we generated TwinRCT datasets ranging from 1141 to 3445 patients in size, depending on covariate conditioning and model parameters. TwinRCT randomly allocated patients to spironolactone (S) and placebo (P) arms as in an RCT, was similar to the RCT by a multidimensional distance metric, and had balanced covariates (median absolute standardized mean difference (MASMD) 0.017, IQR 0.0034-0.030). The 5 EHR-conditioned covariates in TwinRCT were closer to the EHR compared with the RCT (MASMD 0.008 vs 0.63, IQR 0.005-0.018 vs 0.59-1.11). TwinRCT reproduced the overall effect size seen in TOPCAT (5-year cardiovascular composite outcome odds ratio (95% confidence interval) of 0.89 (0.75-1.06) in the RCT vs 0.85 (0.69-1.04) in TwinRCT).
Conclusions: TwinRCT-GAN simulates RCT-derived effects in real-world patients by translating these effects to the covariate distributions of EHR patients. This key methodological advance may enable the direct translation of RCT-derived effects into real-world patient populations, as well as causal inference in real-world settings.
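The covariate-similarity summary reported in the Results, the median absolute standardized mean difference (MASMD), can be illustrated with a minimal sketch. The abstract does not state which standardized mean difference formula was used, so the pooled standard deviation below (square root of the average of the two sample variances) is an assumption, and the function names `abs_smd` and `masmd` and the toy data are hypothetical and are not taken from the TwinRCT-GAN code.

```python
import math
import statistics


def abs_smd(a, b):
    """Absolute standardized mean difference for one covariate.

    Assumption: the denominator is the pooled SD, sqrt((var_a + var_b) / 2).
    """
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2.0)
    diff = abs(statistics.fmean(a) - statistics.fmean(b))
    if pooled_sd == 0.0:
        # Both samples are constant: identical means give 0, otherwise undefined.
        return 0.0 if diff == 0.0 else math.inf
    return diff / pooled_sd


def masmd(cohort_a, cohort_b):
    """Median of the per-covariate absolute SMDs between two cohorts.

    Each cohort is a dict mapping covariate name -> list of numeric values
    (binary covariates coded as 0/1). Returns (median, per-covariate values).
    """
    smds = {name: abs_smd(cohort_a[name], cohort_b[name]) for name in cohort_a}
    return statistics.median(smds.values()), smds


# Toy data only; not values from TOPCAT, the Yale EHR cohort, or TwinRCT.
toy_a = {"age": [1, 2, 3], "sbp": [1, 2, 3], "diabetes": [0, 0, 1, 1]}
toy_b = {"age": [2, 3, 4], "sbp": [1.5, 2.5, 3.5], "diabetes": [0, 0, 1, 1]}
median_smd, per_covariate = masmd(toy_a, toy_b)
```

Smaller values indicate closer agreement between the two cohorts on the covariates compared; the interquartile range of the per-covariate values can be reported alongside the median.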
Competing Interest Statement: The authors are co-inventors of a provisional patent related to the current work (63/606,203). EKO is a co-inventor of U.S. Patent Applications 63/508,315 and 63/177,117, a co-founder of Evidence2Health (with RK), and has previously served as a consultant to Caristo Diagnostics Ltd (outside the present work). RK is an Associate Editor of JAMA. He receives support from the Doris Duke Charitable Foundation (under award 2022060). He also receives research support, through Yale, from Bristol-Myers Squibb, Novo Nordisk, and BridgeBio. He is a co-inventor of U.S. Provisional Patent Applications 63/177,117, 63/428,569, 63/346,610, 63/484,426, 63/508,315, and 63/606,203 and is a co-founder of Ensight-AI and Evidence2Health, health platforms to improve cardiovascular diagnosis and evidence-based cardiovascular care.
Funding Statement: This study was funded by K23HL153775, 5T32HL155000-03, and 1F32HL170592-01.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The IRB of Yale University waived ethical approval and granted a waiver of consent for this work because it was a retrospective study with minimal risk to subjects.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes