Discussion
In this study, we examined items and domains of ATAQ-IPF for their performance between participants from the UK and USA, deleted poorly performing items and retained the remainder to generate a cross-Atlantic version of ATAQ-IPF, the ATAQ-IPF-cA. ATAQ-IPF-cA is a shorter questionnaire with very good measurement properties, including invariance across countries.
Items within each domain of ATAQ-IPF-cA had good fit to the Rasch model, verifying that the cluster of items composing a given domain indeed taps a single construct. The IC reliability of each ATAQ-IPF-cA domain was at least as good as that of the corresponding domain in the original ATAQ-IPF, and in this two-country sample the domains exhibited minimal floor or ceiling effects. Further, as we had hoped, the 43 retained items span a broad range of severities across the HRQL scale. This means the ATAQ-IPF-cA should capture baseline HRQL and changes in HRQL in patients with IPF who have very poor or very good HRQL (and all levels in between).
It is well known that questionnaire items do not always function equally in different groups, such as respondents from different countries. When items function differently across groups, the item set for all groups analysed together will fail to meet the criteria of the Rasch model, or individual items will demonstrate significant DIF. Rasch analysis therefore provides an excellent method to test items and identify those that either require modification to be retained or should be deleted to improve instrument performance. By using Rasch methodology, we were able to select items that generate the most precise measurement of HRQL (among the pool of items on ATAQ-IPF) and that meet a fundamental assumption of the Rasch model: that each item contributes reliably to the measurement of the single underlying construct, regardless of country.
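To make the idea of DIF concrete, the following minimal sketch uses the dichotomous Rasch model, in which the probability of endorsing an item depends only on the difference between the respondent's level on the construct and the item's difficulty. This is a simplification for illustration only: it is not the model specification or software used in this study, and the person location, the item difficulties and the group labels attached to them are hypothetical values, not study results.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Dichotomous Rasch model: probability of endorsing an item.

    theta      -- respondent location on the latent construct (logits)
    difficulty -- item location on the same scale (logits)
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Two respondents with the SAME level on the construct (hypothetical value).
theta = 0.5

# Hypothetical item difficulties estimated separately in each country.
# Under invariance these would be equal; a gap between them is uniform DIF.
item_difficulty = {"UK": -0.2, "USA": 0.6}

p_by_group = {
    group: rasch_probability(theta, b) for group, b in item_difficulty.items()
}

# Difference in endorsement probability at equal theta: non-zero means the
# item response depends on country as well as on the construct.
dif_gap = p_by_group["UK"] - p_by_group["USA"]

for group, p in p_by_group.items():
    print(f"{group}: P(endorse) = {p:.3f}")
print(f"Gap attributable to DIF = {dif_gap:.3f}")
```

In this sketch the two respondents have identical locations on the construct, yet their endorsement probabilities differ because the item sits at a different difficulty in each group. An item behaving this way would be a candidate for modification or deletion.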
To a certain degree, the IC of an item set (eg, the items composing a domain) depends on the number of items: a greater number of items will tend to inflate the IC coefficient. We were prepared to observe drops in the IC coefficients of domains as items were removed, but compared with the α coefficients previously reported for ATAQ-IPF, those for ATAQ-IPF-cA were as high or higher. The construct validity of ATAQ-IPF-cA was supported by the numerous significant correlations (for both USA and UK subgroups) between domain scores and scores from other PROs that measure dyspnoea, the main driver of HRQL in patients with IPF. For D-12 scores, we observed the strongest correlations with ATAQ-IPF-cA total and SOB domain scores in both the UK and USA subgroups. This is not surprising since the D-12 has previously demonstrated excellent measurement properties in IPF.13,16
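The dependence of IC on item number can be illustrated with the standardized form of Cronbach's α, which is a function of the number of items and the mean inter-item correlation only. The sketch below is illustrative and assumes this standardized formula; the item counts and correlations are hypothetical and are not taken from ATAQ-IPF or ATAQ-IPF-cA.

```python
def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Standardized Cronbach's alpha.

    n_items -- number of items in the set (at least 2)
    mean_r  -- mean inter-item correlation
    """
    if n_items < 2:
        raise ValueError("alpha requires at least two items")
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# With the mean inter-item correlation held fixed (hypothetical value),
# alpha rises as items are added and falls as items are removed.
for k in (4, 8, 16):
    print(f"{k:>2} items, mean r = 0.30: alpha = {standardized_alpha(k, 0.30):.3f}")

# A shorter set can still match a longer one if the removed items were the
# ones correlating weakly with the rest, so that mean_r rises (hypothetical).
longer_set = standardized_alpha(16, 0.30)
shorter_set = standardized_alpha(8, 0.50)
print(f"16 items at r = 0.30: {longer_set:.3f}; 8 items at r = 0.50: {shorter_set:.3f}")
```

Under this formula, an α that stays as high or higher after items are removed is consistent with the retained items being more strongly intercorrelated on average than the original set.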
We removed three domains altogether: Sexuality, Relationships and Finances. The Sexuality domain was removed because of missing responses and a particularly high number of ‘not able to answer’ responses. Chronic illness can have profound negative effects on relationships and sexual satisfaction of both patients and partners.17 The average age of our population was 70 years and not all participants were in a relationship with a significant other. These factors may have contributed to the response patterns observed in this study. Likewise, chronic illness can impact relationships between patients and their friends, and most assuredly, loved ones in the same household.18 The results of our analyses suggest that more work is needed to develop a tool that can precisely assess that impact among patients with IPF in different countries. Given the differences in the provision of healthcare and related finances between the UK and USA, it is not surprising to find differences in participant responses to items in the Finances domain. We found that participants from the UK were more likely to respond positively to finance-related items despite receiving free healthcare through the UK National Health Service. However, because responses to these items were not invariant between the two countries, this domain was deleted.
The results of this study, while demonstrating cross-cultural validity of the ATAQ-IPF-cA, highlight that questionnaires intended for international use are preferably developed in the target countries from the outset. This would enable the early detection of items with significant DIF and allow developers to adopt an iterative process of checking for DIF and scale content during initial development as opposed to post hoc. However, such an approach would require significant resources, which are not always available during the embryonic stages of instrument development.
We found no other studies that used DIF to examine cross-cultural aspects of HRQL outcomes in IPF. As such, it is not known whether the illness experience of patients with IPF differs between the USA and UK. We observed DIF in 11 ATAQ-IPF items, so it is plausible that previous international studies examining HRQL in IPF unwittingly used instruments containing items that violate the requirement of unidimensionality.19 Responses to a scale's items should depend only on the level of HRQL impairment and not on external factors such as cultural background or healthcare provision.
This study has limitations. Because pulmonary physiology data were not available for the UK subgroup, we were unable to examine correlations between pulmonary physiology values and ATAQ-IPF-cA scores in that subgroup. We were able to run these analyses in the USA subgroup, and as hypothesised, there were moderate correlations between pulmonary physiology values and the majority of ATAQ-IPF-cA domain scores. Participants were recruited from specialty clinics, so the results here may not be applicable to the more general population with IPF in either country. Given the lack of longitudinal data, we are unable to comment on the performance of the retained items over time, including their responsiveness to change. Although there were differences between groups in baseline characteristics, a basic tenet of Rasch analysis is that items meeting Rasch model requirements contribute reliably to the measurement of the one underlying construct (here, HRQL) in all respondents, regardless of underlying differences in health status or other variables.
In conclusion, we used a systematic, statistically based method to revise the original ATAQ-IPF and develop a version that is relevant to both USA and UK patient populations. The reliability and validity of the ATAQ-IPF-cA are acceptable and comparable to those of the original instrument. Prospective studies will determine whether the ATAQ-IPF-cA is responsive to underlying change in patients with IPF.