Discussion
Our analyses suggest that in patients with progressive fibrosing ILDs other than IPF, the Dyspnoea and Cough domain scores from the L-PF questionnaire Symptoms module are responsive to changes in disease severity and in patients’ perceptions of their physical health and quality of life. We observed significant differences in changes in L-PF Dyspnoea and Cough scores between subjects who had a large deterioration in FVC % predicted versus those with stable FVC % predicted, and between subjects who experienced deterioration versus improvement in global assessment anchors.
There is no consensus on the best approach to estimating meaningful change thresholds for patient-reported outcomes.13 18 Food and Drug Administration guidance recommends that anchor-based approaches incorporate ‘patient ratings’ of change12; however, such transition items, which require patients to assess their current state, recall their prior state and mentally subtract the difference (eg, “Is your shortness of breath a lot better/the same/a lot worse?”), are fraught with problems. Ideally, the correlation between the transition item and baseline score is equal and opposite to the correlation between the transition item and the score at follow-up, but with recall periods of longer than 4 weeks, transition ratings tend to be (inappropriately) highly correlated with the patient’s current state.19 The two patient response anchors we used alleviated this potential for bias by asking patients to rate their state at baseline and at week 52; we then performed the subtraction to yield the transition item.
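The property described above can be illustrated with a minimal sketch. It assumes hypothetical global ratings on an arbitrary 1 to 5 scale (the values and function names are illustrative, not trial data or the trial's code) and derives the transition item by subtracting the baseline rating from the follow-up rating. When the baseline and follow-up ratings have equal variance, the derived transition item correlates equally and oppositely with the two.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def derived_transition(baseline, follow_up):
    """Transition item computed by the investigator: follow-up minus baseline."""
    return [f - b for b, f in zip(baseline, follow_up)]


# Hypothetical ratings for five subjects (illustrative only).
baseline = [1, 2, 3, 4, 5]
follow_up = [2, 1, 4, 3, 5]

transition = derived_transition(baseline, follow_up)
r_baseline = pearson(transition, baseline)
r_follow_up = pearson(transition, follow_up)

print(transition)               # per-subject derived transition ratings
print(r_baseline, r_follow_up)  # equal in magnitude, opposite in sign here
```

A recalled transition item that tracked only the current state would instead show a strong correlation with the follow-up rating and little or none with the baseline rating.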
For many transition items, stability and degree of change are arbitrarily defined by the investigator. Some investigators may consider ‘somewhat worse/better’ to represent a minimal change, while others may consider ‘a bit worse/better’ or ‘minimally worse/better’ to be a minimal change. How patients interpret such descriptors, and how investigators categorise anchors, can affect estimates of meaningful change thresholds. For example, when using a 15-point quality of life transition item with ratings ranging from −7 to +7, ratings of −1 to +1 have been considered to represent no change and ratings of −3, −2, +2 or +3 to represent minimally important changes,20 21 but meaningful change estimates may have been different if stability had been defined as a rating of 0 and minimally important changes as ratings of −2, −1, +1 or +2. For our global rating anchors, we considered a change of 0 to represent stability and changes of −2, −1, +1 or +2 to represent minimal/moderate change. Some patients with transition scores of 0 may have changed minimally and some with transition scores of 1 or 2 may have been stable. We attempted to account for this inherent uncertainty by using a half-way point approach rather than simply subtracting mean scores between groups of interest.
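The contrast between the two approaches can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the trial: it assumes that the half-way point is the midpoint between the mean score change in the stable group (anchor change of 0) and the mean score change in the group with minimal/moderate deterioration (anchor change of −1 or −2), and that a negative anchor change denotes deterioration. The function names and data are hypothetical.

```python
from statistics import mean


def group_means(score_changes, anchor_changes, stable=(0,), minimal=(-1, -2)):
    """Mean score change in the stable group and in the minimally changed group."""
    stable_scores = [s for s, a in zip(score_changes, anchor_changes) if a in stable]
    minimal_scores = [s for s, a in zip(score_changes, anchor_changes) if a in minimal]
    return mean(stable_scores), mean(minimal_scores)


def halfway_threshold(score_changes, anchor_changes):
    """Assumed definition: midpoint between the two group means."""
    stable_mean, minimal_mean = group_means(score_changes, anchor_changes)
    return (stable_mean + minimal_mean) / 2


def mean_difference(score_changes, anchor_changes):
    """The simpler alternative: subtract the stable mean from the minimal-change mean."""
    stable_mean, minimal_mean = group_means(score_changes, anchor_changes)
    return minimal_mean - stable_mean


# Hypothetical data: anchor change and change in a domain score per subject.
anchor_changes = [0, 0, 0, -1, -1, -2]
score_changes = [1.0, 3.0, 2.0, 6.0, 8.0, 10.0]

print(halfway_threshold(score_changes, anchor_changes))  # midpoint of 2.0 and 8.0
print(mean_difference(score_changes, anchor_changes))    # 8.0 minus 2.0
```

Under this assumed definition, the midpoint serves as a boundary between the two groups on the score-change scale, which allows for the possibility that some subjects classified as stable changed slightly and vice versa.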
As patients with progressive ILDs are unlikely to experience improvement in disease status, in the ROC analyses, we identified a change threshold between worsening and stability/improvement. This approach aligns with the clinical behaviour of progressive ILDs and with current therapeutic approaches, which slow rather than reverse disease progression.
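A threshold of this kind can be sketched as below. The selection criterion is an assumption: this section does not state how the cut-off was chosen on the ROC curve, so the sketch uses Youden's J (sensitivity + specificity − 1), one common choice. It also assumes that a larger score change indicates worsening. The data and the function name are hypothetical.

```python
def roc_threshold(score_changes, worsened):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A subject is classified as worsened when the score change is >= cutoff.
    `worsened` is the dichotomised anchor: True for worsening, False for
    stability or improvement.
    """
    n_pos = sum(worsened)
    n_neg = len(worsened) - n_pos
    best = None
    for cutoff in sorted(set(score_changes)):
        true_pos = sum(1 for s, w in zip(score_changes, worsened) if w and s >= cutoff)
        true_neg = sum(1 for s, w in zip(score_changes, worsened) if not w and s < cutoff)
        sensitivity = true_pos / n_pos
        specificity = true_neg / n_neg
        j = sensitivity + specificity - 1
        # Keep the first cut-off that attains the highest J.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical score changes and dichotomised anchor for eight subjects.
score_changes = [-2.0, 0.0, 1.0, 3.0, 4.0, 6.0, 7.0, 9.0]
worsened = [False, False, False, False, True, False, True, True]

cutoff, sensitivity, specificity = roc_threshold(score_changes, worsened)
print(cutoff, sensitivity, specificity)
```

Grouping stable and improved subjects into a single reference category yields one threshold for deterioration rather than separate thresholds for worsening and improvement.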
Change in FVC is used as a primary end point in clinical trials to assess the efficacy of treatments for ILDs.14 22–25 A decline in FVC is associated with mortality.2 26–28 While there is no established definition of ILD progression, absolute declines of >5% or >10% in FVC % predicted are widely regarded as indicating progression,26 28–30 although smaller declines may also be relevant. Scores from patient-reported outcomes that assess symptoms or HRQL typically correlate weakly with FVC in patients with ILDs,31–33 suggesting that these measures yield information distinct from that provided by physiological measures of ILD severity. Thus, although FVC is commonly used as an anchor in validation studies, it may not be a suitable anchor in all circumstances.
Strengths of our analyses include the use of a large and heterogeneous population of subjects with progressive fibrosing ILDs. The use of triangulation that incorporated both anchor-based and distribution-based approaches aligns with accepted methodology, including from regulatory bodies, but we acknowledge that distribution-based methods may overestimate meaningful change thresholds.34 Limitations include that the trial was not designed to evaluate the measurement properties of patient-reported outcomes, so additional metrics that could have been used as anchors were not included. For example, another cough-specific patient-reported outcome would have been a more appropriate anchor for the Cough domain. The content validity of the L-PF questionnaire has not been demonstrated for all the languages and cultures that participated in the trial. Whether our findings are applicable to patients with fibrosing ILDs beyond those who met the inclusion criteria for the INBUILD trial is unknown.
In conclusion, our analyses support the responsiveness of the Dyspnoea and Cough domains of the L-PF questionnaire Symptoms module as measures of symptom severity in patients with progressive fibrosing ILDs. Estimates of meaningful change thresholds in these scores may be of value in interpreting the effects of interventions in these patients. Additional analyses are encouraged to confirm or refine these findings.