Discussion
A PRO measure should fulfil accepted criteria for reliability, validity and responsiveness to change. Its threshold for meaningful change is useful for interpreting the impact of interventions.17 The SGRQ has been used previously to assess HRQL in patients with IPF, but this is the largest longitudinal dataset available that has been used to test its reliability, validity and meaningful change in this patient population.18
We analysed the psychometric properties of the SGRQ using pooled data from 1061 patients with IPF treated with nintedanib or placebo in the INPULSIS trials. Internal consistency was high for the SGRQ total and domain scores, confirming homogeneity of items within domains and for the entire instrument. The lower internal consistency of the symptoms domain is consistent with previous studies5 19 20 and may be due to certain items within the symptoms domain (eg, those related to sputum and wheezing) lacking clinical relevance for many patients with IPF. In patients whose disease was clinically stable over 6 weeks, SGRQ total and domain scores all showed acceptable stability. In patients whose disease was clinically stable by all criterion variables over 52 weeks, the SGRQ total score had acceptable stability, but the stability of the domain scores varied, likely due to measurement error, which attenuated validity coefficients. The low ICC for the EQ-5D VAS is likely due to the generic nature of the instrument, which by definition is intended to capture all aspects of health, not only those related to IPF.
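For readers less familiar with the statistic, internal consistency of a multi-item domain is commonly summarised with Cronbach's alpha. The following is a minimal sketch of that calculation, assuming item-level scores are available as one list per respondent; the function name and the example data are illustrative only and are not taken from the INPULSIS analyses, whose exact computation is not described in this section.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Variance of each individual item across respondents
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the summed score across respondents
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Synthetic responses (3 respondents, 2 items), not trial data
responses = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(responses)
```

Items that do not vary together with the rest of a domain (as may be the case for sputum and wheezing items in IPF) add item variance without adding much to the variance of the summed score, which lowers alpha.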
The content validity of the CASA-Q cough domains and UCSD-SOBQ has been demonstrated in patients with IPF, with patients reporting that the items in these questionnaires were relevant to their symptoms.21 In our data, moderate to strong cross-sectional correlations were observed between SGRQ total and domain scores and the UCSD-SOBQ and CASA-Q cough domains, supporting the SGRQ as an instrument capable of measuring aspects of disease relevant to patients with IPF. Longitudinal correlations, based on changes in these PROs over 52 weeks, were moderate and in the expected directions, thus further supporting validity. These data are consistent with recent findings from the Australian IPF registry22 and the German INSIGHTS-IPF registry,23 which demonstrated that dyspnoea and cough were strongly correlated with SGRQ total and domain scores at baseline.
SGRQ total and domain scores clearly distinguished between patients in the highest and lowest quartiles of FVC % predicted at baseline, and between patients who were and were not using supplemental oxygen at baseline. Correlations between SGRQ scores and FVC % predicted were generally weak. This is consistent with the findings of previous studies in patients with IPF, including patient registries and the Phase II TOMORROW trial of nintedanib, in which correlations between the SGRQ total and domain scores and FVC % predicted at baseline ranged from −0.11 to −0.15.5 22 23 Such weak correlations align with previously reported findings that factors other than lung function, such as patients’ general health, comorbidities, mood disturbance, energy level and independence, may have an important impact on HRQL in patients with IPF.22–26 However, despite these weak correlations, our analyses showed that SGRQ total, activity and impact scores were responsive to change in patients with >10% improvement or deterioration in FVC % predicted. This finding is consistent with pooled data from patients with IPF treated with bosentan or placebo in a Phase III study, in which changes in SGRQ scores were able to distinguish patients whose disease status declined, improved or remained stable based on changes in FVC % predicted, DLco % predicted and dyspnoea over a 6-month period.20 In our analyses, SGRQ scores were also responsive to change in patients who reported improvement or deterioration based on the PGI-C.
A change in SGRQ total score of 4 points is generally accepted as clinically meaningful in patients with COPD.3 27 Collectively, our analyses suggest that a change in SGRQ total score of approximately 4 to 11 points over 52 weeks may be clinically meaningful in patients with IPF, but it is not possible to draw a firm conclusion given the ambiguity of the ROC analyses. These preliminary results are, however, similar to findings from the placebo-controlled Phase III study of bosentan, which suggested that a change in SGRQ total score of 7 points over 6 months was meaningful.20 We recognise that all the anchors we used have limitations. The UCSD-SOBQ was strongly correlated with the SGRQ, but data on what constitutes a meaningful change in the UCSD-SOBQ in patients with IPF are scarce. In a study conducted in 164 patients with various chronic lung diseases, a change in UCSD-SOBQ of 5 points was a reasonable estimate of clinically meaningful change.28 In another analysis of data from 180 patients with IPF and advanced lung function impairment enrolled in a placebo-controlled trial of sildenafil, results suggested that a point estimate of 8 (range 5–11) represented a meaningful change in the UCSD-SOBQ.29 We recognise the limitation of estimates based on the PGI-C, given the potential issues with recollection of change over a long period of time. Finally, although change in FVC is a well-established endpoint in trials of IPF,30 31 the weak correlations with the SGRQ in our analysis suggest that it is not an ideal anchor for estimating meaningful change in the SGRQ.
Our finding that changes in FVC >5% predicted were associated with clinically meaningful changes in SGRQ scores is consistent with data showing a correlation between FVC changes of similar magnitude and outcomes such as hospitalisations and mortality,12 32–35 but we acknowledge that a pooled analysis of data from the TOMORROW, INPULSIS, CAPACITY and ASCEND trials did not find a statistically significant increase in the risk of death in patients who had a decline in FVC of ≥5% to <10% predicted compared with those who had a decline in FVC of <5% predicted.30 The present analysis supports using a change of 4–5 points as a starting point for responder threshold analyses of the SGRQ total score in patients with IPF. However, given the range of estimates in this analysis and an indication of greater thresholds in the Phase III study of bosentan,20 we recommend conducting further sensitivity analyses.
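To illustrate how an ROC-based threshold of this kind can be derived, the sketch below selects the cut-off in change score that maximises Youden's J (sensitivity + specificity − 1) against a binary anchor. This is one common convention, assumed here for illustration; this section does not state which criterion was applied in the ROC analyses. The function name and the data are hypothetical and are not taken from the trials.

```python
def youden_threshold(changes, deteriorated):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    changes: change in score per patient (higher = worse, as for the SGRQ).
    deteriorated: anchor classification per patient (True = deteriorated).
    A patient is classified as deteriorated when change >= cutoff.
    Hypothetical helper for illustration only.
    """
    pos = [c for c, d in zip(changes, deteriorated) if d]
    neg = [c for c, d in zip(changes, deteriorated) if not d]
    if not pos or not neg:
        raise ValueError("both anchor groups must be non-empty")
    best = None
    # Try every observed change score as a candidate cut-off
    for cutoff in sorted(set(changes)):
        sens = sum(c >= cutoff for c in pos) / len(pos)
        spec = sum(c < cutoff for c in neg) / len(neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]


# Synthetic change scores and anchor labels, not trial data
changes = [0, 1, 2, 3, 8, 9, 10, 12]
deteriorated = [False, False, False, False, True, True, True, True]
cutoff, sens, spec = youden_threshold(changes, deteriorated)
```

When the two anchor groups overlap substantially, several cut-offs yield similar values of J, which is one way in which an ROC analysis can fail to identify a single clear threshold.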
This study had several strengths. First, with the exception of analyses of responder thresholds and stability/reliability analyses conducted at week 6 and using change in FVC ≤2% predicted as a criterion, all analyses were prespecified. Second, data were from two large prospective controlled trials. Third, the 52-week treatment duration provided a long follow-up period over which to assess the stability of scores. The main limitation of our analyses was the timing of the visits at which the PROs were assessed, which restricted certain validity tests. In particular, the PGI-C was only measured at week 52, and patients may have found it challenging to recall their health status over such a prolonged period. Another limitation was the exclusion of patients with severely impaired lung physiology (FVC <50% predicted).
At the start of recruitment into the INPULSIS trials in 2011, the SGRQ was considered the most suitable tool for use in this patient population. Subsequent data from clinical trials and patient registries support its utility in patients with IPF.5 22 23 Additional tools have been developed specifically for use in patients with ILD (eg, the King’s Brief Interstitial Lung Disease (K-BILD) questionnaire36) or IPF (eg, A Tool to Assess Quality of Life in IPF (ATAQ-IPF(-cA)) questionnaire37 38 and Living with IPF (L-IPF) questionnaire39). The SGRQ has also been adapted for use in patients with IPF (the SGRQ-I),19 but analyses are needed to determine its performance characteristics. It is encouraging that these and other instruments are in development.40 Future studies in patients with IPF should focus on evaluating these questionnaires and their measurement properties. Given its brevity, the K-BILD may be a particularly suitable instrument for use in clinical trials, where the need to collect comprehensive information must be balanced against the burden of completing questionnaires. In such studies, the SGRQ may serve as a useful anchor and platform for development of more specific tools to collect patient-reported information.
In conclusion, the psychometric properties of the SGRQ observed in the INPULSIS trials support the use of this instrument as a measure of HRQL in patients with IPF. Additional research is needed to test the threshold estimates for deterioration and improvement in SGRQ total score over 52 weeks and over shorter intervals, including qualitative interviews with patients and clinicians regarding what they consider a meaningful outcome of treatment.