- Carlotta Perego, MDa,b,
- Marco Sbolli, MDa,b,
- Claudia Specchia, PhDc,
- Mona Fiuzat, PharmDd,
- Zachary R. McCaw, PhDe,
- Marco Metra, MDb,
- Chiara Oriecuia, MScc,
- Giulia Peveri, MScf,
- Lee-Jen Wei, PhDg,
- Christopher M. O’Connor, MDa,d and
- Mitchell A. Psotka, MD, PhDa,∗ (@mpsotka)
- aDepartment of Heart Failure and Transplant, Inova Heart and Vascular Institute, Falls Church, Virginia
- bDepartment of Medical and Surgical Specialties, Radiological Sciences, and Public Health, University of Brescia, Brescia, Italy
- cDepartment of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- dDepartment of Medicine, Duke University, Durham, North Carolina
- eGoogle LLC, Mountain View, California
- fDepartment of Clinical Science and Community Health, University of Milan, Milan, Italy
- gDepartment of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- ∗Address for correspondence: Dr. Mitchell A. Psotka, Inova Heart and Vascular Institute, Heart Failure and Transplant, 3300 Gallows Road, Falls Church, Virginia 22042.
Objectives The study sought to demonstrate the statistical and utilitarian properties of restricted mean survival time (RMST) and restricted mean time lost (RMTL) for assessing treatments for heart failure (HF) with reduced ejection fraction.
Background Although the hazard ratio (HR) is the most commonly used measure to quantify treatment effects in HF clinical trials, HRs may be difficult to interpret and require the proportional hazards assumption to be valid. RMST and RMTL are intuitive summaries of groupwise survival that measure treatment effects without model assumptions.
Methods Patient time-to-event data were reconstructed from published landmark HF clinical trial Kaplan-Meier curves. We estimated RMST differences (ΔRMSTs) and RMTL ratios between treatment groups for primary and secondary outcomes, and compared test statistics and effect sizes with proportional hazards models. We fit Weibull estimations to extrapolate trial data to 5 years of treatment.
Results Using RMSTs and RMTLs yielded similar statistical conclusions as HR analysis for a compendium of 16 HF clinical trials including 48,581 patients. RMTL ratios approximated HRs for each trial, but ΔRMSTs provided absolute effect sizes unavailable with HRs. For instance, spironolactone added 2.2 months of life over 34 months of treatment, and dapagliflozin added 0.3 months of life over 24 months of treatment. When normalized to 5-years follow-up with Weibull estimation, spironolactone and dapagliflozin added 6.0 months and 1.8 months of life for patients, respectively.
Conclusions Statistically, RMST and RMTL perform similarly to proportional hazards modeling but may help patients by providing clinically relevant intuitive estimates of treatment effects without prohibitive assumptions.
Cox proportional hazards modeling is the most common method used to describe and compare treatment effects in heart failure (HF) clinical trials (1,2). However, the hazard ratio (HR) derived from proportional hazards modeling is not suitable in all circumstances, such as when hazards are not proportional during the entire study period, and its utility may be limited for noninferiority analyses (1–4). Additionally, because HRs measure relative and not absolute treatment effects, they may be difficult to meaningfully communicate to patients and nontrialists, particularly in terms of the magnitude of the treatment effect and what a patient can reasonably expect as the outcome of treatment over time (5–9).
Restricted mean survival time (RMST) is an alternative measure that may overcome some of the limitations of proportional hazards modeling. RMST is the average time free from an event up until a milestone time point, a numeric expression of the area under the Kaplan-Meier survival curve (1,2,4,10). The comparison of RMSTs between treatment groups, the RMST difference (ΔRMST), is the mean absolute difference of event-free time associated with a therapy. This difference in survival time measures treatment effects without imposing model assumptions and may be more intuitive to clinical communities (4,11). Additionally, the RMST in the control or placebo group provides a reference value for the population to which the ΔRMST can be anchored to help determine whether a benefit is clinically meaningful, in contrast to the relative treatment effects provided by HRs. Restricted mean time lost (RMTL) is defined as the area above the Kaplan-Meier survival curve (10,12,13) and represents the survival time lost up to a specific time point. The ratio of RMTLs between treatment groups can approximate the HR without requiring the proportional hazards assumption (14).
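As a compact restatement of these definitions, the four quantities can be written as follows, with T∗ denoting the milestone (truncation) time and S(t) the Kaplan-Meier survival function of one arm. The subscripts trt and ctl are shorthand introduced here for the treatment and control arms, and the direction of the difference and of the ratio is assumed to follow the convention given in the Methods.

```latex
\begin{align*}
\mathrm{RMST}(T^{*}) &= \int_{0}^{T^{*}} S(t)\,dt \\
\mathrm{RMTL}(T^{*}) &= T^{*} - \mathrm{RMST}(T^{*}) \\
\Delta\mathrm{RMST} &= \mathrm{RMST}_{\mathrm{trt}}(T^{*}) - \mathrm{RMST}_{\mathrm{ctl}}(T^{*}) \\
\text{RMTL ratio} &= \frac{\mathrm{RMTL}_{\mathrm{trt}}(T^{*})}{\mathrm{RMTL}_{\mathrm{ctl}}(T^{*})}
\end{align*}
```

Because RMST and RMTL sum to T∗, both are bounded by the follow-up window, which is what allows the control-arm RMST to serve as a reference value for judging the size of a ΔRMST.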
Given the potential benefits, we sought to investigate whether analyses using RMSTs and RMTL ratios would provide statistically similar results to traditional proportional hazards models, and whether they might be used to better inform patients and clinicians of clinical trial findings. We calculated RMST outputs for published guideline-recommended treatments for HF and created a resource clinicians and HF stakeholders can use to communicate clinical trial results more effectively with patients and the public.
Clinical trial selection
Landmark randomized clinical trials for treatment of HF were identified from the most recent American Heart Association/American College of Cardiology and European Society of Cardiology Guidelines and from the recent medical literature (15,16). Clinical trials were eligible for inclusion if they reported a statistically significant superiority of the therapy of interest over placebo or an active comparator. At least 1 representative clinical trial was selected from each major class of HF therapeutics. Because no data were collected from human participants and all data were publicly available, the study was exempt from Institutional Review Board approval.
For each clinical trial, we collected the compared treatments, the sample sizes, and the published analyses of the primary and secondary outcomes including the unadjusted HRs, their 95% confidence intervals (CIs) estimated by Cox proportional hazards modeling, and the results of statistical significance testing, whenever available. The outcomes of interest included all-cause mortality, death from cardiovascular causes, and hospitalization for HF, as well as composites of these and nonfatal myocardial infarction or nonfatal stroke.
We used the published Kaplan-Meier survival curves for the primary and secondary endpoints from each clinical trial to recreate their time-to-event data. First, we digitized the Kaplan-Meier survival curves using a semi-automatic freely available tool, WebPlotDigitizer (17). Next, an approximation of the original patient time-to-event data was derived from digitized curves using the iterative algorithm described in Guyot et al. (18).
From the approximated individual patient data, we calculated the RMST, separately for the experimental and control groups, and RMTL at the truncation time point (T∗). The RMST is the area under the Kaplan-Meier curve until time T∗, the mean event-free time among patients in a treatment arm until time T∗. The RMTL is the area above the Kaplan-Meier curve, the mean post-event time across the observation period. For each trial, we identified the latest observed event or censoring time in either the experimental or control arm and defined T∗ as the minimum of the 2 values. The ΔRMST for each trial was calculated by subtracting the RMST for the control arm from the RMST for the intervention arm. The ΔRMST is the difference in mean event-free time between the patients in each treatment arm during the observed time period T∗. A ΔRMST >0 favors the experimental treatment. The ratio of RMTLs between control and experimental treatments for each trial was then calculated by dividing the RMTL of the intervention arm by the RMTL of the control arm. An RMTL ratio <1 indicates superiority of the experimental treatment. The 95% CIs for ΔRMSTs and RMTL ratios, along with p values for the null hypothesis, were calculated based on the methods described in Uno (13).
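The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the survRM2 implementation used in the study: the function names (`kaplan_meier`, `rmst`, `compare_arms`) and the toy data are hypothetical, and the CI and p value calculations from Uno (13) are omitted.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = event, 0 = censored)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    curve, surv = [], 1.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - n_events / at_risk
        curve.append((t, surv))
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += (t - prev_t) * prev_s  # rectangle up to the next drop
        prev_t, prev_s = t, s
    return area + (tau - prev_t) * prev_s  # last rectangle, up to tau


def compare_arms(times_trt, events_trt, times_ctl, events_ctl):
    """Delta RMST (treatment minus control) and RMTL ratio (treatment over control)."""
    # T* = the smaller of the two arms' latest observed event or censoring times.
    tau = min(max(times_trt), max(times_ctl))
    rmst_trt = rmst(times_trt, events_trt, tau)
    rmst_ctl = rmst(times_ctl, events_ctl, tau)
    return {
        "tau": tau,
        "delta_rmst": rmst_trt - rmst_ctl,
        "rmtl_ratio": (tau - rmst_trt) / (tau - rmst_ctl),
    }


# Toy data (hypothetical, 4 patients per arm; time in arbitrary units).
result = compare_arms([2, 4, 4, 4], [1, 0, 0, 0], [1, 2, 3, 4], [1, 1, 1, 1])
print(result)
```

In this toy example the treatment arm has an RMST of 3.5 and the control arm 2.5 at T∗ = 4, so the ΔRMST is 1.0 time units and the RMTL ratio is about 0.33, both favoring the treatment arm.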
To estimate the effect of treatment on overall survival up to 5 years, parametric Weibull models were fit to each trial’s data. An estimate of the survival curve S(t) for each treatment arm was obtained, and the RMSTs at t = 60 months were calculated as $\mathrm{RMST}(60) = \int_{0}^{60} S(t)\,dt$. The 60-month ΔRMST between intervention and control treatments was then calculated. Percentile-based bootstrap 95% CIs of the ΔRMST were calculated using 5,000 bootstrap replicates.
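The extrapolation step can be illustrated by integrating a Weibull survival curve out to 60 months. The sketch below assumes the parameterization S(t) = exp(−(t/scale)^shape) and uses made-up shape and scale values rather than any trial's fitted estimates. The model fitting (done in the study with SAS LIFEREG and the R temporal package) and the bootstrap CIs are not reproduced, and the function names are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Weibull survival S(t) = exp(-(t / scale) ** shape); an assumed parameterization."""
    return math.exp(-((t / scale) ** shape))


def rmst_weibull(shape, scale, horizon=60.0, steps=6000):
    """Trapezoidal approximation of the integral of S(t) from 0 to horizon (months)."""
    width = horizon / steps
    total = 0.5 * (weibull_survival(0.0, shape, scale)
                   + weibull_survival(horizon, shape, scale))
    for i in range(1, steps):
        total += weibull_survival(i * width, shape, scale)
    return total * width


def delta_rmst_weibull(params_trt, params_ctl, horizon=60.0):
    """Projected RMST difference (treatment minus control) at the horizon."""
    return (rmst_weibull(*params_trt, horizon)
            - rmst_weibull(*params_ctl, horizon))


# Hypothetical (shape, scale) pairs with scale in months; not estimates from any trial.
treatment = (1.0, 120.0)
control = (1.0, 100.0)
print(round(delta_rmst_weibull(treatment, control), 2))
```

With the shape fixed at 1 the Weibull reduces to an exponential, for which the integral has the closed form scale × (1 − exp(−horizon/scale)). That gives a convenient check on the numerical integration before applying it to fitted parameters.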
Statistical analyses were performed with package survRM2 (13) and package temporal (19) in the R software version 3.6.2 (R Foundation for Statistical Computing, Vienna, Austria) and with LIFETEST and LIFEREG procedures in the SAS software version 9.4 (SAS Institute, Cary, North Carolina).
Characteristics of selected randomized controlled trials
We selected 16 randomized clinical trials of HF therapies, 12 investigating medical therapies and 4 investigating devices, conducted between April 1985 and July 2019. The analysis included data from a total of 48,581 patients. The characteristics of the selected trials are summarized in Supplemental Table 1. Data on all-cause mortality were available from 12 of 16 trials, 10 from pharmaceutical trials and 2 from device trials. Isolated all-cause mortality was not available for the CHARM- (Candesartan Cilexetil in Heart Failure Assessment of Reduction in Mortality and Morbidity-) Added (20), COPERNICUS (Carvedilol Prospective Randomized Cumulative Survival) (21), MADIT-CRT (Multicenter Automatic Defibrillator Implantation With Cardiac Resynchronization Therapy) (22), and COMPANION (Comparison of Medical Therapy, Pacing and Defibrillation in Heart Failure) (23) trials. Composites of cardiovascular death or hospitalization for HF were available for 6 trials.
All-cause mortality for pharmaceutical trials
Table 1 shows the treatment effects for all-cause mortality from 10 trials evaluating drug therapies expressed as HRs, ΔRMSTs, and RMTL ratios. In these trials, statistical testing by Cox proportional hazards modeling, ΔRMST, and RMTL ratios yielded similar results supporting the efficacy of the interventions. Additionally, estimates of the magnitude of the treatment effects were similar between RMTL ratios and HRs. In all cases, the ΔRMST provided an additional measure of mean treatment efficacy that can be described as the months of life gained with the treatment over the course of trial follow-up.
For instance, in the DAPA-HF (Study to Evaluate the Effect of Dapagliflozin on the Incidence of Worsening Heart Failure or Cardiovascular Death in Patients With Chronic Heart Failure) trial, treatment with dapagliflozin for 24 months extended each patient’s life by 0.3 (95% CI: 0.04 to 0.6) months on average compared with placebo, raising mean survival time from 22.0 (95% CI: 21.7 to 22.2) months to 22.3 (95% CI: 22.1 to 22.5) months. The RMTL ratio of 0.80 (95% CI: 0.70 to 0.98) was similar to the published HR of 0.83 (95% CI: 0.71 to 0.97). In the RALES (Randomized Aldactone Evaluation Study) trial, treatment with spironolactone for 34 months extended each patient’s life by 2.2 (95% CI: 1.1 to 3.4) months on average compared with placebo, raising mean survival time from 23.9 (95% CI: 23.1 to 24.8) months to 26.2 (95% CI: 25.4 to 27.0) months. The published HR for spironolactone in the RALES trial was 0.70 (95% CI: 0.60 to 0.82; p = 0.001), and the RMTL ratio was 0.80 (95% CI: 0.70 to 0.90; p < 0.001).
Composites, hospitalizations, and additional secondary outcomes
Table 2 shows the treatment effects for outcomes other than all-cause mortality from 9 trials evaluating drug therapies, expressed as HRs, ΔRMSTs, and RMTL ratios. The statistical significance testing results were broadly similar between the proportional hazards models, ΔRMSTs, and RMTL ratios, and the RMTL ratios generally mirrored the HRs in magnitude of effect. In all cases, the ΔRMST provided an additional measure of average treatment efficacy that can be described as the months of additional life without an event in patients receiving treatment compared with placebo, over the course of trial follow-up. For instance, in the VICTORIA (A Study of Vericiguat in Participants With Heart Failure With Reduced Ejection Fraction) trial, treatment with vericiguat for 32 months extended the time before patients either died of a cardiovascular cause or were admitted to the hospital for HF by 0.9 (95% CI: 0.1 to 1.6) months on average. The published HR for this endpoint was 0.90 (95% CI: 0.82 to 0.98; p = 0.02), and the RMTL ratio was 0.9 (95% CI: 0.9 to 0.99; p = 0.027).
In all device trials, the results of statistical significance testing between HRs, ΔRMSTs, and RMTL ratios agreed, and the treatment effect magnitudes were similar between HRs and RMTL ratios (Table 3). In each case, the ΔRMST provided an additional measure of average treatment efficacy that can be described as the months of additional life without an event in patients receiving the device compared with the control group, over the course of trial follow-up. For example, in the MADIT II (Multicenter Automatic Defibrillator Implantation Trial II) trial, use of an implantable cardioverter-defibrillator extended each patient’s life by 2.9 (95% CI: 0.8 to 5.0) months on average compared with those without the device, over 4 years of trial follow-up.
There is a clinical need to estimate the treatment effect when a therapy is given for a time period longer than the relatively short duration studied as part of a clinical trial, and this can be done with RMST analyses. The Central Illustration (and Supplemental Figure 1) graphically illustrates the Weibull projections from the primary data for selected trials for the endpoint of all-cause mortality. Table 4 presents the estimated treatment effects by ΔRMSTs at 5 years created by Weibull model fitting for all-cause mortality in 12 studies. These are expressed as the number of months gained due to the treatment over 5 years. All the trials project longer survival with treatment at 5 years, with various degrees of statistical certainty after accounting for the extrapolation. For example, according to the Weibull models, mean survival would be increased by treatment with enalapril by 8.3 months (95% CI: 1.5 to 15.8 months; p = 0.02) over 5 years compared with placebo based on CONSENSUS trial data (24). Treatment with spironolactone for 5 years would increase mean survival by 6.2 months (95% CI: 3.6 to 8.7 months; p < 0.001) based on the RALES trial (25). Using these 5-year estimations, bisoprolol (CIBIS-II [Cardiac Insufficiency Bisoprolol Study II] trial) (26) and metoprolol (MERIT-HF [Metoprolol Controlled-Release Randomised Intervention Trial in Heart Failure] trial) would result in 4.6 months (95% CI: 2.5 to 6.6 months; p < 0.001) and 3.6 months (95% CI: 1.4 to 5.7 months; p = 0.001) additional survival over 5 years, respectively.
We evaluated the statistical and clinical properties of RMST-based statistics applied to clinical trial data for treatments of HF with reduced ejection fraction. We found that RMST and RMTL analyses provided similar statistical results as proportional hazards models, while ΔRMSTs communicated absolute treatment benefit in a more intuitive manner alongside baseline risk. We provide a clinically useful compendium of ΔRMSTs for commonly cited HF clinical trials for use in discussing the benefits of these medications. Although many patients have difficulty absorbing the information from entire survival curves, patients may better comprehend more compact data synopses (8,9). Thus, RMST summary estimates may be best utilized as part of a multipronged approach alongside clinical trial Kaplan-Meier plots to best communicate clinical evidence to patients and engage in shared decision-making. In addition, we demonstrated that ΔRMSTs can be used to estimate the likely treatment effect carried forward over time, giving patients and clinicians greater understanding of the expected longer-term benefits and potentially facilitating comparisons of therapies on a standard time scale.
We compared our RMST analyses directly with statistical assessments and HRs from proportional hazards modeling, a technique that is widely used in cardiovascular clinical trials but whose results are neither an immediately intuitive nor a universally appropriate measure of the treatment effect. The HR requires hazards to be proportional during the entire study period, an assumption that may be invalid in many cases, such as with early or delayed treatment effects, diminishing effects, or crossing survival curves (2,10). RMST, defined as the average event-free time up to a specific time point, is an alternative measure to estimate the treatment benefit and assess statistical significance (4). RMST is a model-free measure that does not require assumptions about hazards (27,28). The ΔRMST expresses the treatment effect as an absolute survival-time difference, which is more easily interpretable by clinicians and patients (4,29). Although RMST analyses have rarely been used to assess cardiovascular clinical trials, they are increasingly used for noncardiovascular trials (1,28). Our results suggest that RMST analyses can be routinely applied to HF clinical trials, and that they may add value to traditional proportional hazards analyses for all cardiovascular trials.
The ΔRMST between the treatment and control groups describes the treatment effect as the number of months free from an event over the study period. Our RMST reanalysis of all selected HF clinical trials quantified the treatment effect in addition to supporting the conclusions presented in the original papers. We found that RMST did not alter the statistical conclusions when compared with HR. Our results are supported by evaluations of RMST for other cardiovascular clinical trials; RMST analyses produced similar statistical assessment of the treatment effect for CHARM- (Candesartan Cilexetil in Heart Failure Assessment of Reduction in Mortality and Morbidity-) Overall and ISAR-TEST 4 (3 Limus Agent Eluting Stents With Different Polymer Coating) trials (1). Additionally, RMST analyses of 54 oncological clinical trials also demonstrated agreement between RMST and HR in both the direction of the treatment effect and the statistical conclusions (11). Although the RMTL ratio gives a result similar to the HR, it has the statistical advantage of not requiring proportionality of hazards. Our RMTL ratios were consistent with HRs, which is in line with previous noncardiovascular studies of RMTL ratios (10,30).
There may be multiple reasons why RMST analyses have remained infrequently used for cardiovascular clinical trials. One may be that ΔRMSTs appear to show a relatively small effect size of months of life gained with years of therapy (Table 1). Nonetheless, these results are far greater in magnitude than the mortality benefits demonstrated for medications used to treat diabetes mellitus analyzed using RMST methods (31). Years of treatment with antihyperglycemic medications are associated with only days of survival gained, compared with months for HF medications. Another concern has been the use of a specified truncation time point for RMST analyses; however, it has been shown that RMST assessments can use all data from a trial up to the largest follow-up time (32). Additional limitations of RMST analyses are shared equally by proportional hazards models, including dependence on the baseline event rate and risk in the trial population, and the need to assess the RMST in the context of the length and shape of the survival curves (31).
The small size of the calculated ΔRMSTs is in part due to the relatively short follow-up time in clinical trials compared with the long-term use of HF treatments in real-world practice. Owing to cost and complexity, and the desire to speed evidence generation for patients with HF, extending clinical trial follow-up time is typically infeasible (33). Our data corroborate the clinical observation and common-sense belief that HF medications and devices continue to be effective for longer than studied in clinical trials (34). As an alternative to extending clinical trial duration, we show that RMST estimates of treatment effects can be extrapolated to 5 years to approximate this clinically useful information. Although a popular alternative method for estimating the long-term benefit of HF medical therapies assumes that the treatment effect does not depend on the duration of therapeutic exposure, our method does not (34). For clinical trials that enrolled similar patient populations but with different treatment times, this estimation method can also be used to compare the treatment effects between those therapies.
These data are not meant to suggest that the benefits of HF medical and device therapies are small. On the contrary, we have shown that the longer patients use medications and devices to treat HF, the more time can be added to their overall survival and survival without morbid events. This is a crucial message for clinicians and patients engaged in shared decision making. In addition, patients often have difficulty understanding data from clinical trials; in 1 analysis, approximately 50% of untrained and trained laypeople were unable to appropriately interpret and compare survival curves (8,9). The intuitive results of RMST calculations thus fill a clinical need to help patients comprehend trial data so they may better participate in their clinical care.
Because we did not have access to original time-to-event data, we reconstructed data from the published Kaplan-Meier curves through the Guyot algorithm (18). This method has been used previously in published reports, but the reconstructed data may differ slightly from the primary data, so our calculations could differ modestly from those that the primary data would yield (11,18,29). Nonetheless, many of these trials’ data are proprietary and would have been unavailable otherwise. We recommend prospective RMST analysis with patient-level data for future HF clinical trials to add to the catalog of RMST results we have created. Similar to comparisons using proportional hazards models, the use of different follow-up times and distinct population attributes makes the effect sizes of the selected trials difficult to compare with each other. Although we acknowledge that no statistical estimation of therapeutic efficacy can extrapolate with certainty beyond the clinical trial time frame, we believe that some estimates of treatment effect can be clinically useful, particularly as many clinicians mentally approximate these values when deciding on treatment regimens for their patients. We believe that the Weibull model is a reasonable projection of the 5-year ΔRMST, though it cannot always be used to directly compare treatment effects between trials because of differences in the populations under study.
Among landmark HF clinical trials, RMST and RMTL analyses of multiple endpoints convey treatment effect estimates consistent with HR in a more patient-oriented manner. RMST-based results could be used to better communicate treatment effects to patients in order to assist patient-preference discussions and shared decision making, and these data should help clinicians to become facile in their use and interpretation.
TRANSLATIONAL OUTLOOK 1: RMST analyses provide similar statistical robustness as proportional hazards models for trial outcomes but do not require the same limiting statistical assumptions, and they should be prospectively utilized for clinical trial assessments.
TRANSLATIONAL OUTLOOK 2: RMST analyses provide both a relative and an absolute assessment of the effect size of pharmaceutical and device interventions for the treatment of HF, which can be incorporated into the clinical literature for clinicians and patients.
COMPETENCY IN PATIENT CARE: Absolute treatment effects and comparisons from RMST analyses allow clinicians to communicate the effects of medications for HF in an intuitive manner with patients.
Dr. O’Connor has received research support from Roche Diagnostics and Merck; has served as a consultant for Merck, Bristol-Myers Squibb, Windtree, and Neurotronik; and is a co-owner of Biscardia. Dr. Psotka has received consulting fees from Amgen, Cytokinetics, and Windtree; and research funding from the U.S. Food and Drug Administration. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose. Barry Greenberg, MD, was Guest Editor on this paper.
The authors attest they are in compliance with human studies committees and animal welfare regulations of the authors’ institutions and Food and Drug Administration guidelines, including patient consent where appropriate. For more information, visit the JACC: Heart Failure author instructions page.
- Abbreviations and Acronyms
- ΔRMST = restricted mean survival time difference
- CI = confidence interval
- HF = heart failure
- HR = hazard ratio
- RMST = restricted mean survival time
- RMTL = restricted mean time lost
- T∗ = truncation time point
- Received April 30, 2020.
- Revision received June 24, 2020.
- Accepted July 27, 2020.
- 2020 The Authors
- Gregson J., Sharples L., Stone G.W., Burman C.F., Öhrn F., Pocock S.
- Kim D.H., Uno H., Wei L.J.
- Belot A., Ndiaye A., Luque-Fernandez M.A., et al.
- Trinquart L., Bill-Axelson A., Rider J.R.
- Weir I.R., Marshall G.D., Schneider J.I., et al.
- Uno H., Claggett B., Tian L., et al.
- Trinquart L., Jacot J., Conner S.C., Porcher R.
- Cronin A., Tian L., Uno H.
- Uno H.
- Uno H., Claggett B., Tian L., et al.
- Yancy C.W., Jessup M., Bozkurt B., et al.
- McCaw Z.R. Temporal. Available at: https://github.com/zrmacc/Temporal. Accessed April 29, 2020.
- Packer M., Fowler M.B., Roecker E.B., et al.
- Abulizi X., Ribaudo H.J., Flandre P.
- Liang F., Zhang S., Wang Q., Li W.
- Kloecker D.E., Davies M.J., Khunti K., Zaccardi F.
- Tian L., Jin H., Uno H., et al.
- O’Connor C.M., Psotka M.A., Fiuzat M., et al.
- Vaduganathan M., Claggett B.L., Jhund P.S., et al.
- Armstrong P.W., Pieske B., Anstrom K.J., et al.