Methods
We generated a study protocol using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols (PRISMA-P) guidelines and registered it on Open Science Framework on 17 April 2023: https://osf.io/49n68/. We subsequently prepared this manuscript using 2020 PRISMA guidance.8 9
Search strategy
We developed a comprehensive search strategy with the help of an experienced medical librarian. We searched Embase, Medline, Cochrane CENTRAL and Web of Science for eligible trials and cohorts from inception to 17 April 2023. We also reviewed previous systematic reviews addressing the topic to ensure no studies were missed.1 7 We did not use any language restrictions and included only primary source clinical trial data. We reviewed secondary analyses and post-hoc analyses for subgroup data, as required. Online supplemental eTable 1 presents the search strategy.
Eligibility criteria
We included randomised controlled trials (RCTs) that randomised adult patients (≥18 years old) with a diagnosis of fILD of any aetiology, except sarcoidosis, using any previous diagnostic criteria, to treatment with corticosteroids versus standard care, placebo or no treatment. We also included retrospective, prospective and mixed cohort studies. We extracted in-study subgroup data if studies reported on both non-fibrotic and fILD patients. If there were no in-study subgroup data, we included a study only if at least 80% of its patients had fILD. We excluded studies investigating the effectiveness of corticosteroids for acute exacerbations as well as trials predominantly investigating non-fILD.
Our two outcomes of interest were change in forced vital capacity (FVC) (%-predicted or mL) and all-cause mortality. For both outcomes, we collected data at the longest follow-up or at the time point closest to 52 weeks.
Study selection and data extraction
We used COVIDENCE to screen eligible trials.10 Pairs of reviewers, following training and calibration exercises to ensure sufficient agreement, worked independently and in duplicate to screen titles and abstracts of search records and subsequently the full texts of records that were determined potentially eligible at the title and abstract screening stage. Reviewers resolved discrepancies by discussion or, when necessary, by third party adjudication. Similarly, the reviewers worked independently and in duplicate to extract data from eligible trials, and resolved discrepancies by discussion or, when necessary, by third party adjudication.
We collected data on trial characteristics (author, year published, trial registration, country of enrolment), patient characteristics (age, sex, ethnicity, comorbidities, C reactive protein, white cell count, proportion of patients on home oxygen, with previous exacerbations and aetiology of their fibrotic disease), intervention characteristics (type of corticosteroid, dose, duration and baseline treatments) and outcomes of interest.
For dichotomous outcomes, we extracted the number of participants analysed and the number of events in each arm. For cohort studies, we collected ORs, HRs or relative risks (RRs) with event rates. For continuous outcomes, we collected data on mean difference (MD) and SD. When studies reported measures of variability other than SD, we converted them to SD using methods proposed by Hozo et al.11
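The conversions to SD can be illustrated with a minimal sketch. The range-based rule below follows the approximations commonly attributed to Hozo et al. The SE and 95% CI conversions are standard large-sample formulas that we assume here for illustration. The function names are hypothetical and this is not the extraction code used in the review.

```python
import math


def sd_from_se(se, n):
    """SD from a standard error of the mean: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def sd_from_ci95(lower, upper, n):
    """SD from a 95% CI of a mean (assumes a normal approximation)."""
    return math.sqrt(n) * (upper - lower) / (2 * 1.96)


def sd_from_range(minimum, median, maximum, n):
    """Approximate SD from median and range (Hozo-style rules, assumed).

    n <= 15      : variance formula using the median and range
    15 < n <= 70 : range / 4
    n > 70       : range / 6
    """
    value_range = maximum - minimum
    if n <= 15:
        variance = ((minimum - 2 * median + maximum) ** 2 / 4
                    + value_range ** 2) / 12
        return math.sqrt(variance)
    if n <= 70:
        return value_range / 4
    return value_range / 6
```

For example, under these assumed rules a study of 30 patients reporting a range of 0 to 100 would be assigned an SD of 25.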
Risk of bias
We planned to assess the risk of bias for individual RCTs using the Cochrane tool (RoB 2.0).12 13 For cohort studies, we used ROBINS-I to assess the risk of bias.14 For the mortality outcome, a study had to adjust for at least age, sex, smoking status, cointerventions with antifibrotics and other immunomodulators, pretreatment FVC (either in mL or as %-predicted) and duration of disease to be rated at low risk of bias for confounding. For change in FVC (%), a study had to adjust for age, sex, baseline lung function and cointerventions.
Statistical methods
For all outcomes, we performed a random-effects meta-analysis with the restricted maximum likelihood heterogeneity estimator. We summarised the effects of interventions using RRs for dichotomous outcomes and MDs for continuous outcomes, both with associated 95% CIs. For FVC, because studies reported change in either mL or % predicted, we analysed the predominant measure (% predicted) and used standardised MDs as a sensitivity analysis. To facilitate interpretation of dichotomous outcomes, we calculated absolute risk differences per 1000 patients and corresponding 95% CIs. For studies that reported only HRs, we converted them to RRs using the available total numbers and event rates. In the absence of these data, we used methods by Short et al to convert HRs to RRs using established baseline risks.15
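The arithmetic behind these summary measures can be sketched as follows. This is an illustration under stated assumptions, not the analysis code (the analyses were run in STATA). The function names are hypothetical. The HR-to-RR formula is one commonly used conversion based on an assumed baseline risk and may not match the cited method exactly.

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c):
    """RR with a 95% CI computed on the log scale (no zero-cell correction)."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
    return rr, lower, upper


def risk_difference_per_1000(rr, baseline_risk):
    """Absolute risk difference per 1000 patients for a given baseline risk.

    Applying the same function to the RR's CI limits gives the CI.
    """
    return 1000 * baseline_risk * (rr - 1)


def hr_to_rr(hr, baseline_risk):
    """Convert an HR to an RR given the control-group (baseline) risk.

    Assumed formula: RR = (1 - exp(HR * ln(1 - r))) / r
    """
    return (1 - math.exp(hr * math.log(1 - baseline_risk))) / baseline_risk
```

With an assumed baseline mortality risk of 20%, an RR of 0.5 corresponds to 100 fewer deaths per 1000 patients, and an HR of 2.0 converts to an RR of about 1.8.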
We planned to also perform a dose–response analysis for mortality using methods proposed by Orsini and Longnecker; however, there were insufficient data to perform these analyses (see protocol).
We assessed heterogeneity by inspection of forest plots, the I² statistic and the χ² test. We considered heterogeneity ranging from 0% to 40% as potentially unimportant, 30% to 60% as moderate heterogeneity, 50% to 90% as substantial heterogeneity and 75% to 100% as critical heterogeneity.16 For outcomes with 10 or more studies, we assessed for publication bias or small study effects using both visual inspection of funnel plots and Egger's test.17
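As an illustration of how the I² statistic relates to the χ² (Cochran's Q) test and to random-effects pooling, consider the minimal sketch below. It uses the DerSimonian-Laird moment estimator for the between-study variance because it has a closed form. This is a deliberate simplification and not the restricted maximum likelihood estimator used in our analyses. The function name and the returned keys are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird tau^2).

    effects   : per-study estimates (eg, MD or log RR)
    variances : per-study sampling variances
    """
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Between-study variance, truncated at zero.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

The resulting I² value is then read against the overlapping ranges given above. Because the ranges overlap, a value such as 55% calls for judgement informed by the forest plot, not a mechanical label.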
We performed all analyses using STATA V.18.
A priori subgroup analysis
We planned subgroup analyses for the following moderators: IPF versus non-IPF fILD, high-dose versus low-dose steroid, where high dose was defined as a methylprednisolone-equivalent of >1 mg/kg, and steroid molecule (ie, prednisone vs dexamethasone). We hypothesised that there would be no difference between these subgroups. For randomised trials, we assessed the credibility of statistically significant subgroup effects using the Instrument for assessing the Credibility of Effect Modification Analyses (ICEMAN) tool.18 For cohort studies, we applied similar principles in lieu of a validated tool.
Certainty of the evidence
For all outcomes, reviewers, working independently and in duplicate, assessed the certainty of the evidence using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach.19 20 We judged the certainty for each outcome as high, moderate, low or very low, based on considerations of risk of bias, inconsistency, indirectness, imprecision and publication bias.
To make judgements regarding imprecision, we used a minimally contextualised approach, which considers only whether CIs include a minimally important effect and does not consider the magnitude of plausible effects captured by the CIs.21 For mortality, we used a minimal clinically important difference (MCID) based on consensus of the authors and considered any difference important. For change in FVC, we used the MCID of 5% provided by the updated ATS/ERS/JRS/ALAT (American Thoracic Society/European Respiratory Society/Japanese Respiratory Society/Asociación Latinoamericana de Tórax (Latin American Thoracic Association)) Clinical Practice Guideline.22
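The logic of this imprecision judgement can be sketched in a few lines. This is a simplified illustration, not a substitute for the reviewers' GRADE judgements. The function name is hypothetical. We assume that effects are expressed as differences (eg, MD in FVC % predicted, or risk difference for mortality) and that the MCID applies symmetrically to benefit and harm.

```python
def imprecision_concern(ci_lower, ci_upper, mcid):
    """Return True when the 95% CI crosses an MCID threshold.

    A CI that spans +mcid or -mcid contains both important and
    unimportant effects, which suggests rating down for imprecision.
    Use mcid = 0 when any difference is considered important.
    """
    crosses_benefit = ci_lower < mcid < ci_upper
    crosses_harm = ci_lower < -mcid < ci_upper
    return crosses_benefit or crosses_harm
```

For example, with an MCID of 5 for FVC, a CI of 2 to 8 would raise a concern, whereas CIs of 6 to 10 or of -2 to 3 would not. With an MCID of 0 for mortality, any CI that includes no difference would raise a concern.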
We described our results using guidance from the GRADE Working Group, based on the certainty of evidence and the magnitude of the effect (eg, corticosteroids reduce mortality (high certainty), corticosteroids probably reduce mortality (moderate certainty), corticosteroids may reduce mortality (low certainty) and the effect of corticosteroids on mortality is very uncertain (very low certainty)).20