- Received January 25, 2016
- Revision received March 14, 2016
- Accepted April 27, 2016
- Published online October 1, 2016.
- Mitchell A. Psotka, MD, PhDa,
- Robyn von Maltzahn, PhDb,
- Milena Anatchkova, PhDc,
- Irene Agodoa, MDd,
- Dina Chau, PharmDd,
- Fady I. Malik, MD, PhDe,
- Donald L. Patrick, MPH, PhDf,
- John A. Spertus, MPH, MDg,
- Ingela Wiklund, PhDb and
- John R. Teerlink, MDa,∗
- aSchool of Medicine, University of California San Francisco and Section of Cardiology, San Francisco Veterans Affairs Medical Center, San Francisco, California
- bEvidera, London, United Kingdom
- cEvidera, Bethesda, Maryland
- dAmgen Inc., Thousand Oaks, California
- eCytokinetics, Inc., South San Francisco, California
- fDepartment of Health Services, University of Washington, Seattle, Washington
- gSaint Luke’s Mid America Heart Institute and Department of Biomedical and Health Informatics, University of Missouri, Kansas City, Missouri
- ∗Reprint requests and correspondence:
Dr. John R. Teerlink, San Francisco VA Medical Center, Cardiology, 111C; Building 203, Room 2A-49, 4150 Clement Street, San Francisco, California 94121-1545.
Objectives The study sought to review the characteristics of existing patient-reported outcome (PRO) instruments used with chronic heart failure (HF) patients and evaluate their potential to support an approved U.S. Food and Drug Administration (FDA) product label claim.
Background PROs, including symptoms and their associated functional limitations, contribute substantially to HF patient morbidity. PRO measurements capture the patient perspective and can be systematically assessed with structured questionnaires; however, rigorous recommendations have been set by the FDA regarding the acceptability of PRO measures as a basis for product label claims.
Methods Extensive searches of databases and specialty guidelines identified PRO instruments used in patients with chronic HF. Information on critical properties recommended by the FDA guidance was systematically extracted and used to evaluate the selected PRO instruments.
Results Nineteen PRO instruments used with chronic HF patients were identified. The Kansas City Cardiomyopathy Questionnaire and Minnesota Living with Heart Failure Questionnaire were the most extensively evaluated and validated in studies of this population. However, judged by criteria listed in the FDA PRO guidance, no existing PRO measure met all of the criteria to support a product label claim in the United States.
Conclusions Currently available chronic HF PRO measures do not fulfill all the recommendations provided in the FDA PRO guidance and therefore may not support an FDA-approved product label claim. Future investigations are merited to develop a PRO measure for use in patients with chronic HF in accordance with the FDA guidance.
- chronic heart failure
- clinical management
- European Medicines Agency
- Food and Drug Administration
- patient-reported outcomes
- regulatory approval
The primary goals of heart failure (HF) treatment are to prevent hospitalization and maximize survival as well as health status: patients’ symptoms, function, and well-being, sometimes referred to as health-related quality of life (HRQoL) (1). Evaluating these patient-reported outcomes (PROs) requires collecting data directly from patients themselves and is incorporated into the most recent HF performance measures and scientific statements (1–5). Ascertaining HF patient health status is complementary to other endpoints in clinical management and research trials and has been encouraged by regulatory agencies when evaluating novel treatments (6–8).
A PRO is defined as “any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else” (8). Patients are best able to judge the severity of their symptoms, the impact of those symptoms on their physical, social, and psychological function, and how those symptoms and limitations impair their HRQoL (1). PROs can be quantified through the use of structured questionnaires termed PRO instruments, and these tools should be appropriately designed and demonstrate adequate psychometric properties so that they provide an accurate representation of a patient’s health status.
Three primary uses of chronic HF PRO instruments include: 1) in clinical research, to evaluate treatment benefit; 2) for regulatory approval, to support a product label claim; and 3) in clinical practice, to monitor disease progression and assist in treatment decisions. Partially validated PRO measures may be suitable for some clinical practice or research settings, whereas regulatory approval requires fulfillment of rigorous recommendations (5–8). PRO instruments can be generic and widely applied or used to compare subjects across multiple medical conditions (9). However, disease-specific PRO measures are more sensitive to clinical change and more interpretable because they quantify symptoms specific to the disease of interest (10–12).
Because many PRO tools exist for use in patients with chronic HF, and their use is increasing, there is a need to delineate the properties of available instruments so providers and researchers can select those appropriately aligned with their needs (13–16). We therefore performed a targeted evaluation of available PRO measures used for the assessment of patients with chronic HF. We used the comprehensive recommendations from the U.S. Food and Drug Administration (FDA) guidance described in Table 1 to compare the PRO instruments, catalogued the incorporated symptoms and symptom impacts, and described the measurement properties that guide their use for regulatory approval, research, and clinical practice. Because the FDA guidance views PRO instruments as a mechanism to quantify treatment benefits in medical product clinical trials, we focused on more sensitive and interpretable HF-specific PRO measures rather than generic instruments (8).
Instrument search strategy
EMBASE and MEDLINE were searched iteratively for the period between January 2006 (the publication year of the draft FDA guidance) and May 2015 (Figure 1), using the search terms listed in the Online Appendix, to identify PRO instruments administered to patients with chronic HF and their psychometric properties. Societal conferences and guidelines listed in the Online Appendix were also investigated. Manual searches were completed in the PROQoLID database and clinicaltrials.gov, and Google was used to locate specific missing information.
Identified titles and abstracts were screened according to predetermined eligibility criteria by at least 2 independent reviewers (Online Appendix). Eligible articles reported instruments that were administered to patients with acute or chronic HF and included information on symptoms, symptom impacts, or therapeutic effect on symptoms. Articles were excluded that reported studies in other populations, did not mention HF symptoms or symptom-impacts, reported no new data, were nonempirical, failed to specify methodology, or were not published in English.
Using the FDA guidance as an evaluative framework, we extracted relevant development and validation information from eligible articles (8). Captured data for each PRO instrument included the published reference(s), study design, details of the subject population, symptoms and impacts measured, domains covered, administration format, recall period, and scoring methods. Validation variables included the methods used in instrument development such as qualitative interviews, literature reviews and stakeholder input for content validity. Captured psychometric properties included reliability, internal consistency, construct validity, responsiveness, responder definitions, and the minimal clinically important difference. Fulfillment by each PRO instrument of recommendations made in the FDA guidance was agreed on by at least 2 independent reviewers, with disagreements settled by the author group.
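Internal consistency, one of the psychometric properties captured above, is commonly summarized with Cronbach's alpha. The following is a minimal illustrative sketch, not part of the review's methods: the function name `cronbach_alpha` and the item scores are hypothetical and serve only to show how the statistic is computed from item-level responses.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items table of scores.

    item_scores: list of respondents, each a list of k item scores.
    """
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([resp[i] for resp in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(resp) for resp in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 3 respondents answering a 2-item scale.
example = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items vary together; the threshold regarded as adequate depends on the intended use of the instrument.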
Nineteen PRO instruments used with chronic HF patients were identified by screening 2,552 articles and 2,334 conference abstracts (Figure 1). All PRO measures identified in conference abstracts and guidelines were also found by literature review. Every PRO tool was either initiated or completely developed prior to the release of the draft or finalized FDA guidance in 2006 and 2009, respectively (8).
Most of the selected PRO measures were developed with patients documented to have chronic HF (13 tools), whereas fewer were generated using patients with coronary artery disease (4 tools) or any cardiovascular disease (2 tools) who may also have had HF (Table 2). Most were validated in HF patients with reduced left ventricular ejection fraction; however, the Kansas City Cardiomyopathy Questionnaire (KCCQ) was also tested in patients with preserved left ventricular ejection fraction and advanced HF (17,18). The recall periods for identified PROs ranged from 1 day to 12 weeks. Six measures had ambiguous or undefined recall periods, and only 2, the European Heart Failure Self-care Behavior Scale (EHFScBS) and the Shortness of Breath in Heart Failure (SOB-HF), utilized the FDA-preferred short (1 day) recall period for symptoms and a recall period of not more than 1 week for impacts (19,20). All 19 PRO tools were administered on paper; 13 of the tools were self-administered and 5 could also be given by an interviewer. Estimated completion times for each PRO instrument ranged between 5 and 15 min. Administration times on the high end of the spectrum were found for the Chronic Heart Failure Questionnaire (CHQ), the only included PRO that required an interviewer, and those tools containing large numbers of response items such as the Chronic Heart Failure Assessment Tool (CHAT) (21,22).
The most commonly incorporated symptoms were shortness of breath, swelling, fatigue, and chest pain (Table 2). Less frequent, but used in more than 1 instrument, were nausea, lightheadedness, concentration or memory deficits, and weakness. Recurrent proximal symptom-impacts included physical activity and patient self-efficacy, and some measures focused on symptom-impacts rather than symptoms. The distal impact on HRQoL was distinctly measured in 4 assessments, the Cardiac Health Profile (CHP), KCCQ, Quality of Life Index (QLI), and Quality of Life Questionnaire for Severe Heart Failure (QLQ-SHF). The most common quantification system (14 measures) was a Likert scale, rarely combined with a visual analog scale. The Left Ventricular Dysfunction Questionnaire (LVD-36) used dichotomous true/false elements, the Daily Activity Questionnaire in Heart Failure (DAQIHF) captured activity duration, and the San Diego Heart Failure Questionnaire (SDHF) used multiple response and true/false questions. Scoring methods were heterogeneous.
At least 1 of the major components of FDA-defined content validity (literature review, expert interviews, or patient interviews) was not reported for 9 of the 19 PRO measures; although 8 purported to have performed all these components, they lacked documentation, including the demographics and number of subjects included (Table 3). Among the PRO instruments reviewed, only 2, the EHFScBS and SOB-HF, had content validity likely acceptable to the FDA. In contrast, measurement properties such as reliability and validity were more commonly reported. Test–retest reliability was reported for 11 measures, assessed between 1 and 26 weeks after initial testing. Internal consistency was reported for 16 instruments. Some evidence of convergent validity was reported for all but 3 instruments, the Heart Failure Needs Assessment Questionnaire (HFNAQ), SOB-HF, and QLI, and was most often demonstrated by comparison to generic PRO measures. The most widely validated were the KCCQ and the Minnesota Living with Heart Failure Questionnaire (MLHFQ), which also exhibited independent predictive validity for hospitalization and death (23,24). Adequate responsiveness to a clinically significant change was recorded for 13 of the tools; however, only 6 disclosed an anchor-based minimal clinically important difference, and only the KCCQ and MLHFQ delineated the anchored responder definition preferred by the FDA. The KCCQ detected clinical worsening and improvement equally well, whereas other PRO instruments were better at detecting improvement than worsening (25). No identified PRO measure fulfilled all the FDA guidance criteria (Figure 2).
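To make the terms "anchor-based minimal clinically important difference" and "responder definition" concrete, the sketch below shows one common way such a threshold is derived: the mean score change among patients who rate themselves minimally improved on an external anchor question. It is a simplified illustration under stated assumptions. The function names, the anchor wording, and all scores are hypothetical, and it does not reproduce the method used for any instrument reviewed here.

```python
from statistics import mean


def anchor_based_mcid(score_changes, anchor_ratings, minimal_label="a little better"):
    """Mean score change among patients whose anchor rating is minimal improvement."""
    selected = [
        change
        for change, rating in zip(score_changes, anchor_ratings)
        if rating == minimal_label
    ]
    if not selected:
        raise ValueError("no patients with the minimal-improvement anchor rating")
    return mean(selected)


def is_responder(score_change, threshold):
    """A patient is a responder if the score change meets or exceeds the threshold."""
    return score_change >= threshold


# Hypothetical data: change in instrument score and global rating of change.
changes = [5, 7, 0, 12, -3]
anchors = ["a little better", "a little better", "no change",
           "much better", "a little worse"]

mcid = anchor_based_mcid(changes, anchors)
responders = [is_responder(c, mcid) for c in changes]
```

A responder definition of this kind lets a trial report the proportion of patients with a meaningful individual change rather than only a difference in group means.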
Although many PRO instruments exist for use in patients with chronic HF and despite their potential utility for clinical care, all those investigated lacked 1 or more pieces of psychometric evidence, test characteristics, or sufficient documentation of content validity suggested by the FDA (16). Thus, researchers interested in a product label claim approval are advised that current PRO measures may not satisfy FDA requirements, even though they might have clinically significant associations or be acceptable for other regulatory bodies. These conclusions are based on extensive literature from the field of psychometrics and qualitative instrument development (1,7,8,26). Common deficiencies included incomplete or inadequately documented content validity, longer recall periods than desired by the FDA, and lack of responder definitions (8).
There are many PRO guidelines published by national and international organizations, including the European Medicines Agency (EMA) and the International Society for Pharmacoeconomics and Outcomes Research; however, the FDA guidance is the most rigorous in terms of patient involvement to establish content validity and interpretability using responder definitions (5–8). The draft FDA guidance was released in 2006 and finalized in 2009, and from 2006 to 2010 sponsors of 45% of products applied for labels with data from PRO endpoints (27). Of new products approved by both the EMA and FDA from 2006 to 2010, PRO label claims were approved for 47% and 19%, respectively (28). All PRO claims approved by the FDA were also granted by the EMA; however, tolvaptan was the only HF medication to have an application submitted during this period, and it received a PRO label claim only from the EMA. The FDA appeared to prefer symptom and impact-related claims, whereas the EMA endorsed claims on the basis of HRQoL and global assessments (28). These differences suggest that sponsors seeking international label claims should either utilize tools with both lower and higher order concepts or follow the FDA guidance with primarily symptom and impact assessment.
Understanding the inadequacies of the PRO instruments described here may help generate novel measures suitable for the FDA. Such measures could be useful for clinical practitioners, researchers, and for regulatory approval. To address the listed deficiencies, a PRO tool should include appropriate responder definitions as well as incorporate and document the components required for content validity and saturation. Based on the current analysis, concept saturation would likely include shortness of breath, swelling, and fatigue (Table 2), items also identified in standalone qualitative interviews of 63 chronic HF patients, along with the key impacts of physical and emotional function (29). Because of the presence of comorbid causes of pain such as ischemia, it remains unclear whether chest pain is an essential manifestation of HF (Table 2). The FDA prefers short recall periods because of concern for recall bias, as retrospective evaluation can underestimate prior health status and correlates best with current health status assessment (8,26,30). However, the construct validity of HF PRO tools with varied recall periods argues that this element of the FDA guidance may benefit from revision (Tables 2 and 3). Moreover, a recent comparison of the Seattle Angina Questionnaire with daily diaries suggested high agreement between a 4-week recall period and the daily records of patients’ angina (31).
Nevertheless, the strengths of currently available PRO instruments should not be lost to the decrees of the regulatory review process. There are few reasonably validated and feasible HF PRO measures, best represented by the KCCQ and MLHFQ, and nonregulatory use of these instruments should continue in clinical and research settings (3,22). The KCCQ and MLHFQ have demonstrated excellent construct validity, responsiveness to change, strong correlations with chronic HF symptoms, and after multivariable adjustment have reliable independent predictive power for traditional outcomes, including costs of care (10,23,32–34) (Table 3). The KCCQ and MLHFQ can measure efficacy of an intervention and these associations suggest they capture and quantify aspects of the disease process not otherwise accessed by clinical evaluation. Additionally, prediction of hospital readmission may rely as much or more on PRO-captured elements as on traditional clinical variables (3,10,35). Clinicians and researchers can employ these HF PRO measures to assist clinical decision making and risk prediction, determine efficacy, refine outcome probabilities, facilitate shared decision-making, identify deficits in self-care behavior, assess health disparities, and meaningfully assist care providers and patients to understand and improve their experience (36,37).
Validated and dependable instruments should replace the methods commonly used in clinical practice and research that poorly appraise HF patients’ health status (13,15). For example, perhaps because almost one-half of patients do not recognize dyspnea as a symptom of HF, PRO elements are incompletely reported without a formal PRO instrument (38). In contrast to evaluation by a PRO instrument, physician-determined New York Heart Association functional classification is inconsistent and incongruent with patient self-assessment, and though most cardiologists routinely query how far patients say they can walk, responses do not correlate with cardiopulmonary performance (39–41). Finally, only 16% of modern cardiovascular clinical trials utilized a PRO measure, 11% of which were ad hoc and lacked evidence of validity (15). An individual or group seeking to employ a PRO in practice or investigation can survey the compiled tables and determine which measure will best assess the desired concept in the subject or population of interest.
Still, despite their expansive evaluation as descriptive devices in multiple diverse settings, the role of PRO measurement as a clinical management tool remains undefined (16). Given that PRO instruments can distinguish patients at elevated risk for traditional outcomes, it is logical that they could be used to identify those patients who merit increased attention, more intensive therapy, or other change in management as part of a clinical trial. Such a trial might establish another way these PRO measures fit into the range of clinical practice and demonstrate the clinical utility of a low-cost intervention.
The present evaluation complements other assessments of the chronic HF PRO literature. A structured evaluation of HF PROs for the measurement of population-level health status identified the KCCQ and MLHFQ as the strongest HF-specific PRO instruments (42). A systematic review and meta-analysis of HF PRO measures between 1996 and 2006 included only the CHQ, KCCQ, LVD-36, MLHFQ, and QLQ-SHF, and did not evaluate suitability for regulatory approval (43). In that analysis, the MLHFQ was the most validated option and the KCCQ had yet to be extensively investigated. Meta-analyses demonstrated that interventions with small expected effects, such as exercise programs, produced small score changes, and interventions expected to produce large effects, such as medications, produced large score changes. An updated qualitative appraisal of the HF PRO literature was limited to only the CHAT, CHP, CHQ, KCCQ, LVD-36, MLHFQ, and QLQ-SHF (22). When a standardized tool for evaluating PRO instruments was applied, the most highly rated were the KCCQ followed by the MLHFQ and CHQ. The CHQ was felt to be conceptually superior, the LVD-36 the most reliable, and the KCCQ the most widely validated and interpretable with low administrator burden. Finally, a recent systematic assessment of HF-specific PRO instruments for implementation into clinical care concluded that the KCCQ and MLHFQ were the best suited because of their favorable psychometric properties, ease of use, and prognostic value (16).
Although this was an extensive review, it may have failed to identify evidence in alternative databases or languages, PRO instruments published prior to the selected time frame, or attributes of PRO measures not published in manuscripts or conference abstracts. Nevertheless, given the multiple organized search strategies, it is unlikely that well-described HF-specific PRO tools were missed. We purposefully did not include generic PRO instruments because of their decreased sensitivity to clinical change. Because of space limitations, some PRO measures with extensive publication histories, such as the KCCQ and MLHFQ, were not comprehensively explicated here. Aspects of these PRO instruments, such as the degree of translation into non-English languages, were not systematically assessed; however, many have been successfully translated and validated in multiple languages, including the MLHFQ, KCCQ, and MacNew (43–45). Finally, other similar searches in multiple languages did not identify additional PRO measures (43).
No disease-specific PRO instrument utilized with chronic HF patients appears to currently satisfy all the FDA guidance, and none of those identified may merit an approved label claim. The FDA criteria and the findings here should guide development of future PRO measures that would facilitate comparison of results across clinical trials and practices, help evaluate treatment benefits, stratify patients, and support product label claims. Additionally, the clinical utility of PRO measurement as a management tool should be assessed in a clinical trial.
COMPETENCY IN MEDICAL KNOWLEDGE: Standardized PRO tools allow quantification of the chronic HF patient experience and multiple validated options exist for this purpose. Choice of specific PRO measure depends on the desired use.
TRANSLATIONAL OUTLOOK 1: Future studies are required to develop a PRO tool for use in chronic HF acceptable to the FDA and will require focus on the details of content validity and recall periods.
TRANSLATIONAL OUTLOOK 2: To further develop the field, a clinical trial using a PRO measure as the intervention is suggested to demonstrate utility as a therapeutic intervention.
The authors thank Julia R. Gage, PhD (on behalf of Amgen Inc.), for her assistance in preparation of this manuscript.
For expanded Methods and Reference sections, please see the online version of this article.
This study was sponsored by Amgen Inc. Dr. von Maltzahn is a former employee and Drs. Anatchkova and Wiklund are current employees of Evidera, which received funding from Amgen Inc. to conduct this study. Dr. Agodoa is a former employee and shareholder of Amgen, Inc. Dr. Chau is a current employee and shareholder of Amgen Inc. Dr. Malik is an employee and shareholder of Cytokinetics, Inc., which has collaborated with Amgen Inc. in the development of omecamtiv mecarbil. Dr. Patrick is a consultant to Amgen Inc. Dr. Spertus served as a consultant to Amgen Inc. for this study; is a consultant to Janssen, Amgen Inc., Novartis Pharmaceuticals Corporation, Bayer Healthcare, and Regeneron Pharmaceuticals, Inc.; has received grant support from Gilead Sciences, Inc., Eli Lilly and Co., the National Institutes of Health, American Heart Association, the Patient-Centered Outcomes Research Institute, and the Aetna Foundation; has equity interest in Health Outcomes Sciences; and owns the copyright to the Seattle Angina Questionnaire, Peripheral Artery Questionnaire, and Kansas City Cardiomyopathy Questionnaire. Dr. Teerlink has received grant/research support and is a consultant to Amgen Inc., Bayer, Cytokinetics, Inc., Novartis Pharmaceuticals Corp., Relypsa, and Trevena Inc. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.
- Abbreviations and Acronyms
- EMA = European Medicines Agency
- FDA = U.S. Food and Drug Administration
- HF = heart failure
- HRQoL = health-related quality of life
- KCCQ = Kansas City Cardiomyopathy Questionnaire
- MLHFQ = Minnesota Living with Heart Failure Questionnaire
- PRO = patient-reported outcome
- 2016 American College of Cardiology Foundation
- Rumsfeld J.S.,
- Alexander K.P.,
- Goff D.C. Jr.,
- et al.
- Bonow R.O.,
- Ganiats T.G.,
- Beam C.T.,
- et al.
- European Medicines Agency. Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products. 2005. Available at: http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003637.pdf. Accessed June 2, 2016.
- Rothman M.,
- Burke L.,
- Erickson P.,
- Leidy N.K.,
- Patrick D.L.,
- Petrie C.D.
- U.S. Department of Health and Human Services. Guidance for Industry: Patient Reported Outcome Measures: Use in Medical Product Development to Support Label Claims. 2009. Available at: http://www.fda.gov/downloads/drugs/guidances/ucm193282.pdf. Accessed June 2, 2016.
- Sanders C.,
- Egger M.,
- Donovan J.,
- Tallon D.,
- Frankel S.
- Rahimi K.,
- Malhotra A.,
- Banning A.P.,
- Jenkinson C.
- Kelkar A.A.,
- Spertus J.,
- Pang P.,
- et al.
- Joseph S.M.,
- Novak E.,
- Arnold S.V.,
- et al.
- Heidenreich P.A.,
- Spertus J.A.,
- Jones P.G.,
- et al.
- DeMuro C.,
- Clark M.,
- Mordin M.,
- Fehnel S.,
- Copley-Merriman C.,
- Gnanasakthy A.
- Gwaltney C.J.,
- Slagle A.F.,
- Martin M.,
- Ariely R.,
- Brede Y.
- Arnold S.V.,
- Kosiborod M.,
- Li Y.,
- et al.
- Chan P.S.,
- Soto G.,
- Jones P.G.,
- et al.
- Ouwerkerk W.,
- Voors A.A.,
- Zwinderman A.H.
- Kosiborod M.,
- Soto G.E.,
- Jones P.G.,
- et al.
- Raphael C.,
- Briscoe C.,
- Davies J.,
- et al.
- Mackintosh A.,
- Gibbons E.,
- Fitzpatrick R.