Critical Care

Development and validation of a predictive model for pulmonary infection risk in patients with traumatic brain injury in the ICU: a retrospective cohort study based on MIMIC-IV

Abstract

Objective To develop a nomogram for predicting the occurrence of secondary pulmonary infection in critically ill patients with traumatic brain injury (TBI) during their stay in the intensive care unit, in order to optimise personalised treatment and support the development of effective, evidence-based prevention and intervention strategies.

Data source This study used patient data from the publicly available MIMIC-IV (Medical Information Mart for Intensive Care IV) database.

Design A population-based retrospective cohort study.

Methods In this retrospective cohort study, 1780 patients with TBI were included and randomly divided into a training set (n=1246) and a development set (n=534). The impact of pulmonary infection on survival was analysed using Kaplan-Meier curves. A univariate logistic regression model was built in the training set to identify potential factors for pulmonary infection, and independent risk factors were determined in a multivariate logistic regression model, which was used to build the nomogram. Nomogram performance was assessed with receiver operating characteristic (ROC) curves, calibration curves and the Hosmer-Lemeshow test, and clinical value was assessed by decision curve analysis (DCA).

Results This study included a total of 1780 patients with TBI, of whom 186 (approximately 10%) developed secondary pulmonary infection; 21 of these patients (11.3%) died during hospitalisation. Among the 1594 patients who did not develop pulmonary infection, 85 (5.3%) died. The survival curves indicated a significant survival disadvantage for patients with TBI and pulmonary infection at 7 and 14 days after intensive care unit admission (p<0.001). Both univariate and multivariate logistic regression analyses showed that race other than white or black, respiratory rate, temperature, mechanical ventilation, antibiotics and congestive heart failure were independent risk factors for pulmonary infection in patients with TBI (OR>1, p<0.05). Based on these factors, along with Glasgow Coma Scale and international normalised ratio variables, a nomogram was constructed to predict the risk of pulmonary infection in patients with TBI, with an area under the ROC curve of 0.800 in the training set and 0.768 in the development set. The calibration curve demonstrated good calibration and consistency with actual observations, while DCA indicated the practical utility of the predictive model in clinical practice.

Conclusion This study established a predictive model for pulmonary infection in patients with TBI, which may help clinicians identify high-risk patients early and prevent the occurrence of pulmonary infection.

What do we already know about this topic?

  • Among 1780 patients with traumatic brain injury (TBI), approximately 10% (186 cases) developed secondary pulmonary infections.

  • Pulmonary infections in patients with TBI are associated with longer intensive care unit (ICU) length of stay and higher mortality rates.

  • Patients with pulmonary infections have higher heart rates, respiratory rates, temperatures, blood glucose and are more prone to acute kidney injury, congestive heart failure (CHF), chronic obstructive pulmonary disease and renal disease.

What does this study add?

  • This study highlights that pulmonary infections significantly worsen the survival rates of patients with TBI, especially within the first 14 days of ICU admission.

  • The research identifies independent risk factors for pulmonary infections in patients with TBI, such as race, respiratory rate, temperature, mechanical ventilation, antibiotic use and CHF.

  • A nomogram model was developed and validated to predict the risk of pulmonary infections in patients with TBI, demonstrating good discriminative and calibration capabilities.

How will this study impact research, practice, or policy?

  • The findings underscore the importance of early identification and management of pulmonary infections to improve outcomes in patients with TBI.

  • The nomogram model can be used in clinical practice to assess the risk of pulmonary infections and guide physicians in optimising treatment plans.

  • Research findings may prompt policy-makers to focus on high-risk factors for pulmonary infections in patients with TBI, driving implementation of relevant prevention and treatment measures.

Introduction

A penetrating head injury or an impact, blow or shake to the head can result in traumatic brain injury (TBI).1 An estimated 64–74 million individuals worldwide experience TBI each year, making it one of the top causes of death and disability, according to research from the Centers for Disease Control and Prevention.2 TBI imposes a heavy personal and societal burden because of its high prevalence, long-term consequences, workforce loss, pressure on healthcare systems and effects on families, social participation and health inequality.3 In the population of patients with TBI, approximately 10%–15% of individuals require specialised nursing care, primarily due to severe TBI, which has been identified as one of the major causes of neurological dysfunction.4 5 Critically ill patients with TBI may experience severe systemic complications, including autonomic dysfunction, arrhythmias and pulmonary infections.6

After severe TBI, mortality is caused not only by direct brain damage but also by pulmonary oedema and bacterial infections.7 Specifically, TBI increases the risk of nosocomial pneumonia secondary to neuronal deficits, including altered mental status, difficulty swallowing, impaired cough reflex and inability to clear secretions.8 Additionally, several prospective and retrospective studies have investigated the prognosis of patients with TBI, finding that the incidence of pneumonia after TBI is approximately 35%–50%.9–11 Furthermore, according to research reports, for each additional day of mechanical ventilation after TBI, the risk of pneumonia increases by 7%.10 Severe TBI is a known independent risk factor for developing systemic inflammatory response syndrome, with reported mechanisms including cytokine release, high-mobility group box 1 protein release and lymphatic system activation.8 These studies indicate an increased risk of pulmonary infections following TBI. Given these pathological associations, early prediction, diagnosis and intervention for secondary pulmonary infections in patients with TBI can help reduce the risk of such infections and potentially improve targeted care for patients with TBI, thereby enhancing patient survival and quality of life.

However, research on predicting the risk of pulmonary infection in patients with TBI remains scarce, and no systematic study has addressed this prediction specifically among critically ill patients with TBI in the intensive care unit (ICU). Nomograms are graphical tools based on statistical prediction models used to calculate the probability of specific endpoints (such as disease progression or death) for individual patients.12 13 They are not only effective risk stratification tools commonly used in clinical practice but also integral components of modern medical decision-making. In this study, we aimed to establish a nomogram based on the Medical Information Mart for Intensive Care IV (MIMIC-IV) database that integrates multiple independent risk factors to better predict the risk of pulmonary infection in patients with TBI in the ICU. This will aid in further individualising patient treatment and support the development of effective, evidence-based prevention and intervention strategies.

Methods

Medical Information Mart for Intensive Care IV

This retrospective observational study used comprehensive data from MIMIC-IV, collected between 2008 and 2019 at Beth Israel Deaconess Medical Center’s (BIDMC) ICU. The MIMIC-IV database contains multidimensional clinical information of admitted ICU patients, including physiological parameters, laboratory tests, medical interventions, medication records and so on (https://physionet.org/content/mimiciv/2.2/). Data collected by BIDMC were deidentified, transformed and made available to researchers who completed human research training and signed data use agreements. The Institutional Review Board at BIDMC granted a waiver for informed consent and approved the sharing of research resources.14 Accessing this database involved completing required courses and application processes, passing specified exams and obtaining appropriate data access permissions.

Patient and public involvement

The MIMIC-IV data used in this retrospective analysis are accurate medical data that can be accessed for free. All personal information in the database has been deidentified, replaced with random codes instead of patient identifiers, ensuring anonymity. As such, publicly available databases do not require patient-informed consent or ethical approval.

Patient selection

Patients with intracranial injuries were identified in the MIMIC-IV (version 2.2) database using International Classification of Diseases (ICD) codes (ICD-9: 85; ICD-10: S06).14 In total, 3942 patients with intracranial injury admitted to the ICU were screened. Patients who met any of the following criteria were excluded: (1) age <18 or >90 years; (2) not the first admission to the ICU; (3) ICU stay of less than 1 day; and (4) death within 3 days after ICU admission. The remaining 1780 eligible patients were included in this study and randomly assigned to a training set and a development set in a 7:3 ratio. The patient selection process is shown in figure 1.
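
As an illustration only, a 7:3 random allocation of 1780 patients can be sketched in Python as follows. The original analysis was performed in R; the function name, the seed value and the placeholder identifiers below are assumptions and do not reproduce the authors' actual allocation.

```python
import random


def split_train_dev(ids, train_fraction=0.7, seed=2024):
    """Shuffle identifiers and split them into training and development sets.

    The seed is an arbitrary, hypothetical value chosen for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 1780 eligible patients.
patient_ids = list(range(1780))
train_ids, dev_ids = split_train_dev(patient_ids)
print(len(train_ids), len(dev_ids))
```

With 1780 patients, a 0.7 training fraction gives the set sizes reported in this study (1246 and 534).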

Figure 1

Flow chart of selection. ICU, intensive care unit; MIMIC-IV, Medical Information Mart for Intensive Care IV.

Data collection

We extracted demographic information, vital signs, severity scores, laboratory indicators, treatment information and comorbidities as variables from the MIMIC-IV database. Demographic information included age, gender, race and marital status. Vital signs included heart rate (HR), mean systolic blood pressure, breath rate and temperature. Severity scores included the Sequential Organ Failure Assessment score and the Glasgow Coma Scale (GCS). Laboratory indicators included saturation of peripheral oxygen (SpO2), blood glucose concentration, anion gap, haematocrit, chloride concentration, bicarbonate concentration, haemoglobin, platelet count, potassium concentration, partial thromboplastin time, international normalised ratio (INR), blood urea nitrogen, prothrombin time, sodium concentration, white blood cell count, red blood cell distribution width, mean erythrocyte haemoglobin, mean corpuscular haemoglobin concentration, mean corpuscular volume and creatinine level. Treatment information included whether antibiotics were used and whether vasopressors (norepinephrine, epinephrine, phenylephrine, dopamine and vasopressin) were used within 24 hours after admission to the ICU and continued for longer than 48 hours,15 whether the patient received mechanical ventilation and whether the patient received renal replacement therapy. Comorbidities included congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), acute kidney injury (AKI; defined according to the Kidney Disease: Improving Global Outcomes (KDIGO) guidelines16), kidney disease and liver disease. Vital signs and laboratory indicators were collected within 24 hours of admission to the ICU. For indicators with multiple measurements, the value corresponding to the most severe condition was recorded.

Clinical outcomes

The primary clinical outcomes we observed in our study were ICU length of stay (LOS) and ICU mortality rate.

Pulmonary infection

Patients with pulmonary infection were identified in the MIMIC-IV database using ICD codes: ICD-10: J12-18 and ICD-9: 480–486.

Statistical analysis

Continuous variables were expressed as median and IQR, and categorical data as percentages (%). Differences in continuous variables were tested using the Wilcoxon-Mann-Whitney test, and categorical variables were compared using the χ2 test. A two-sided p<0.05 was considered statistically significant. The samples were randomly allocated to a training set and a development set in a 7:3 ratio. A univariate logistic regression model was built in the training set to identify potential factors associated with pulmonary infection. A multivariate logistic regression model was then built using stepwise regression based on the Akaike information criterion to select independent predictive factors. Variables with p<0.1 were included in the final model. Kaplan-Meier curves were used to show the trend in survival probability over time. Landmark analysis was employed to assess survival differences between groups before and after different time points. The nomogram was constructed from the multivariate logistic regression results and validated in the training set and the development set. Receiver operating characteristic (ROC) curves and calibration curves were plotted using 1000 resamples to evaluate the predictive performance of the model. The Hosmer-Lemeshow test was used to assess the goodness of fit of the model, and decision curve analysis (DCA) was used to evaluate the clinical value of the nomogram. Statistical analysis was performed in R (V.4.2.3). The R packages used included mice,17 tableone,18 jskm (https://cran.r-project.org/web/packages/jskm/vignettes/jskm.html), glm,19 regplot (https://cran.r-project.org/web/packages/regplot/index.html), rms (https://cran.r-project.org/web/packages/rms/index.html), Hmisc (https://cran.r-project.org/web/packages/Hmisc/), pROC (https://cran.r-project.org/web/packages/pROC/index.html) and rmda (https://cran.r-project.org/web/packages/rmda/index.html).
In this study, vital sign and biochemical variables with missing values in more than 20% of the total sample were excluded. The remaining missing values were imputed using the Random Forest method in the ‘mice’ package. The ‘regplot’ package was used to plot the nomogram and output the risk scores of the predictive factors.
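
The 20% missingness rule can be sketched as below. This is a minimal Python illustration rather than the authors' R code; the function name, the toy records and the variable name `toy_marker` are hypothetical, and the subsequent Random Forest imputation step is not reproduced here.

```python
def drop_sparse_variables(records, threshold=0.20):
    """Return the variable names whose missing fraction does not exceed `threshold`.

    `records` is a list of dicts with identical keys; None marks a missing value.
    """
    n = len(records)
    kept = []
    for name in records[0].keys():
        missing = sum(1 for record in records if record[name] is None)
        if missing / n <= threshold:  # exclude only when more than 20% is missing
            kept.append(name)
    return kept


# Toy data: 'temperature' is missing in 1 of 5 rows (20%), 'toy_marker' in 2 of 5 (40%).
toy_records = [
    {"temperature": 37.1, "toy_marker": 1.2},
    {"temperature": None, "toy_marker": None},
    {"temperature": 38.0, "toy_marker": None},
    {"temperature": 36.8, "toy_marker": 0.9},
    {"temperature": 37.5, "toy_marker": 1.1},
]
kept_variables = drop_sparse_variables(toy_records)
print(kept_variables)
```

In this toy example only `temperature` is retained, because exactly 20% missingness does not exceed the threshold.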

Results

Baseline characteristics

Among 1780 patients with TBI, approximately 10% (n=186) had secondary pulmonary infections. The median age of the total sample was 68.0 years (IQR: 52.0–80.0), and most patients were white (65.8%) and male (63.8%). Differences in baseline characteristics between the pulmonary infection group and the non-pulmonary infection group are shown in table 1. The ICU LOS and mortality rate in the pulmonary infection group were 6.88 days (IQR: 2.91–12.11) and 11.3%, respectively, significantly higher than those in the non-pulmonary infection group, which were 2.65 days (IQR: 1.65–4.89) and 5.3% (p<0.05). HR, breath rate, temperature and blood glucose were higher in the pulmonary infection group, and these patients were more prone to AKI, CHF, COPD and kidney disease (p<0.05).

Table 1

Characteristics of included patients stratified by pulmonary infection

The impact of pulmonary infection on patient survival

The Kaplan-Meier curve illustrated that pulmonary infection had an adverse effect on the survival of patients with brain injury (log-rank p<0.05) (figure 2A). Then, the 7-day and 14-day time points after admission to ICU were selected for analysis. Patients with pulmonary infection showed significant survival disadvantage at 7 and 14 days (log-rank p<0.001, figure 2B,C). No significant difference was seen in survival curves between groups after 21 days (log-rank p>0.05) (figure 2D). In summary, early occurrence of pulmonary infection in patients with brain injury was severely detrimental to their survival.
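
For readers unfamiliar with the method, the Kaplan-Meier product-limit estimate and the landmark restriction can be sketched as follows. This is a simplified Python illustration on toy data; the authors used the R package jskm, and the function names and example values below are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` holds 1 for death and 0 for censoring.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def landmark_subset(times, events, landmark):
    """Keep patients still under observation after `landmark`, re-zeroing time."""
    kept = [(t - landmark, e) for t, e in zip(times, events) if t > landmark]
    return [t for t, _ in kept], [e for _, e in kept]


# Toy follow-up times (days) and event indicators; not study data.
toy_times = [2, 3, 3, 5, 8, 8, 10]
toy_events = [1, 1, 0, 1, 0, 1, 0]
full_curve = kaplan_meier(toy_times, toy_events)
late_curve = kaplan_meier(*landmark_subset(toy_times, toy_events, landmark=3))
print(full_curve)
print(late_curve)
```

A landmark analysis compares groups separately before and after the chosen time point; the sketch shows only the estimation step for a single group.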

Figure 2

Kaplan-Meier curves of survival probability before and after the landmark time grouped by pulmonary infection. (A) Kaplan-Meier curves before the landmark time. (B–D) Kaplan-Meier curves after the landmark time set as 7, 14 and 21 days, respectively. The upper and lower log-rank p values refer to the Kaplan-Meier curves before and after the landmark time, respectively.

Logistic regression variable screening results and nomogram establishment

Risk factors for pulmonary infection were examined in the training set and the development set, and the two sets showed no significant differences (p>0.05, online supplemental table 1). The results of the univariate logistic regression analysis in the training set and the potential risk factors for pulmonary infection in the multivariate logistic model are listed in table 2. Multivariate logistic analysis revealed that other races (OR: 1.783, 95% CI 1.172 to 2.702, p=0.007), breath rate (OR: 1.100, 95% CI 1.037 to 1.166, p=0.001), temperature (OR: 2.107, 95% CI 1.419 to 3.173, p=0.002), mechanical ventilation (OR: 2.802, 95% CI 1.814 to 4.323, p<0.001), antibiotics (OR: 2.797, 95% CI 1.707 to 4.747, p<0.001) and CHF (OR: 2.895, 95% CI 1.686 to 4.908, p<0.001) were independent risk factors for pulmonary infection in patients with TBI. Since INR is an important indicator of coagulation system activity and the GCS score reflects consciousness status, both were also selected as predictors for the nomogram. In summary, the nomogram was built on race, breath rate, temperature, GCS, INR, mechanical ventilation, antibiotics, CHF and kidney disease to predict the risk of pulmonary infection in patients with TBI (figure 3).
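
To illustrate how adjusted odds ratios translate into risk, the sketch below multiplies baseline odds by the ORs reported above for mechanical ventilation and CHF. It assumes a logistic model with all other covariates held fixed. The 5% baseline risk is a purely hypothetical value chosen for illustration, not an estimate from this study, and the function name is an assumption.

```python
def risk_from_odds_ratios(baseline_risk, odds_ratios):
    """Apply odds ratios multiplicatively to baseline odds and return a probability."""
    odds = baseline_risk / (1 - baseline_risk)
    for odds_ratio in odds_ratios:
        odds *= odds_ratio
    return odds / (1 + odds)


# Reported adjusted ORs from the multivariate model.
OR_MECHANICAL_VENTILATION = 2.802
OR_CHF = 2.895

# Hypothetical baseline risk for a patient with neither factor (assumption).
HYPOTHETICAL_BASELINE_RISK = 0.05

risk_both = risk_from_odds_ratios(
    HYPOTHETICAL_BASELINE_RISK, [OR_MECHANICAL_VENTILATION, OR_CHF]
)
print(round(risk_both, 3))
```

A nomogram performs the same calculation graphically: each predictor contributes points proportional to its log-odds contribution, and the total points map to a predicted probability.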

Figure 3

Nomogram for predicting probability of pulmonary infection in participants. Cyan box sizes indicate relative proportion differences among subgroups, while the grey density plot displays total points distribution. GCS, Glasgow Coma Scale; INR, international normalised ratio.

Table 2

Univariate and multivariate logistic regression analyses in the training set

Validation of the nomogram model

The area under the ROC curve (AUC) of the nomogram was 0.800 (95% CI 0.761 to 0.840) in the training set (figure 4A) and 0.768 (95% CI 0.705 to 0.832) in the development set (figure 4B), indicating favourable discrimination. The calibration C index of the nomogram was 0.784. The Hosmer-Lemeshow test yielded p values of 0.318 and 0.826 in the training set and development set, respectively, indicating no significant lack of fit. The calibration curves showed that the predictions of the nomogram in the training set (figure 5A) and development set (figure 5B) were consistent with the observed outcomes. DCA showed that, in both the training set (figure 6A) and the development set (figure 6B), an intervention strategy guided by the nomogram yielded higher clinical utility.
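
The AUC can be read as the probability that a randomly chosen patient with pulmonary infection receives a higher predicted risk than a randomly chosen patient without it. A minimal Python sketch of this pairwise definition is given below; the authors computed the AUC with the R package pROC, and the function name and toy predicted risks here are assumptions.

```python
def pairwise_auc(scores_infected, scores_uninfected):
    """AUC as the fraction of (infected, uninfected) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for s_pos in scores_infected:
        for s_neg in scores_uninfected:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(scores_infected) * len(scores_uninfected))


# Toy predicted risks; not study data.
toy_infected = [0.8, 0.6, 0.4]
toy_uninfected = [0.5, 0.3, 0.2, 0.1]
toy_auc = pairwise_auc(toy_infected, toy_uninfected)
print(round(toy_auc, 3))
```

In the toy data, 11 of the 12 possible pairs are ranked correctly, so the AUC is 11/12.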

Figure 4

The receiver operating characteristic (ROC) curve of the nomogram for both training (A) and development (B) sets, with consistent variable entries. AUC, area under the ROC curve.

Figure 5

Calibration curves for nomograms in the training set (A) and development set (B). The diagonal line represents perfect prediction by an ideal model. The red and green lines correspond to the initial cohort and bias corrected by bootstrapping (B=1000 repetitions), respectively.

Figure 6

Decision curve analysis (DCA) curves for the nomogram in both the training set (A) and development set (B). The horizontal line represents the assumption that no participant develops pulmonary infection, corresponding to no intervention and a net benefit of 0. The grey oblique line represents the assumption that all participants develop pulmonary infection, corresponding to universal intervention. The red solid line corresponds to the pulmonary infection risk nomogram.
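
The quantity plotted in a decision curve is the net benefit at a given threshold probability, which is the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold. The sketch below shows this standard formula in Python on toy data. The authors used the R package rmda; the function names, the toy values and the convention of flagging patients whose predicted risk is at or above the threshold are assumptions.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening on every patient (the oblique reference line)."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes (1 = pulmonary infection) and predicted risks; not study data.
toy_outcomes = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
toy_risks = [0.7, 0.1, 0.2, 0.4, 0.05, 0.3, 0.1, 0.15, 0.6, 0.2]
nb_model = net_benefit(toy_outcomes, toy_risks, threshold=0.25)
nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.25)
print(round(nb_model, 4), round(nb_all, 4))
```

Intervening on no one always has a net benefit of 0, so a model is useful at a given threshold when its net benefit exceeds both 0 and the treat-all value.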

Discussion

In this retrospective cohort study, we constructed a nomogram-based predictive model to assess the risk of pulmonary infection in patients with TBI during their ICU stay. The findings showed that, in this population, race, mechanical ventilation, antibiotics, CHF, kidney disease, GCS, temperature, breath rate and INR were key predictive factors for pulmonary infection. This study therefore provides clinicians with a tool to identify individuals at high risk of pulmonary infection among patients with TBI in the ICU. Furthermore, internal validation of the model demonstrated good performance, supporting its reliability in practical clinical applications.

In patients with TBI, pulmonary complications and associated respiratory distress are considered among the most common and life-threatening extracranial effects.20 21 About one-third of patients with moderate or severe TBI develop acute lung injury, manifested as bilateral shadows on pulmonary imaging and respiratory failure within 7 days after onset.22–24 Neuronal and cellular processes, including high-mobility group box 1 release, cytokine release and lymphatic system involvement, are strongly linked to lung injury induced by TBI. These mechanisms may decrease systemic and pulmonary immunity and raise the risk of infection.8 25 In immunology, the brain is thought to be nominally independent, but in reality, it interacts with other organs.25 For instance, when microglia and astrocytes cause inflammation in the brain, neutrophils become activated and adhere to the blood-brain barrier, disrupting it. TBI also causes a surge of neutrophils and inflammatory cytokines, such as tumour necrosis factor α, interleukin (IL)-1 and IL-6, which accumulate in the pulmonary air spaces.26–28

This study demonstrated that mechanical ventilation was a significant contributing factor to pulmonary infection in patients with TBI in the ICU. Severe brain injury may trigger inflammation and affect patients' tolerance of the mechanical stress generated by subsequent mechanical ventilation.29 In mechanical ventilation, the operator provides respiratory support by adjusting tidal volume, positive end-expiratory pressure, respiratory rate and inspiratory airway pressure. However, inappropriate settings of these parameters may lead to lung tissue damage, which is associated with unfavourable patient prognosis.30 High tidal volume ventilation may induce alveolar overexpansion, inflammatory mediator spillage and ventilator-associated pneumonia.31 This increased risk is partly due to tracheal intubation or tracheotomy, which allows bacteria from the oral cavity and upper respiratory tract to enter the lower respiratory tract.32 Furthermore, pulmonary infections in patients are associated with antibiotic use. Non-absorbable antibiotics have been linked to intestinal dysbiosis, which might decrease broad immune cell responses and have a negative clinical impact on Pseudomonas aeruginosa lung infections.33 In a Streptococcus pneumoniae infection model, antibiotic-induced intestinal dysbiosis caused continuous impairment of macrophage function and was associated with adverse outcomes.34

In terms of comorbidities, heart failure can lower immunological function, which weakens the body’s defences against infections, particularly bacterial infections of the lung.35 Reduced cardiac pumping capacity in CHF causes blood stasis in the veins, which congests the pulmonary veins and capillaries, increases lung moisture and raises the risk of bacterial infection.36 Proinflammatory cytokine levels in the serum are also higher in individuals with AKI, suggesting a strong relationship between the two conditions,37–40 which may directly damage pulmonary endothelial cells, leading to non-cardiogenic pulmonary oedema and lung injury.41–43 In adult patients undergoing mechanical ventilation in the ICU, a low GCS score (GCS score <8) is an independent predictor of mixed bacterial ventilator-associated pneumonia.44 GCS is also included in our predictive nomogram.

In this nomogram, temperature, breath rate and INR also served as predictors to assess a patient’s risk of developing pulmonary infection. Fever and other hyperthermic states cause a rise in core temperature, which is a potent biological response modifier with profound but unpredictable consequences, especially in critically ill patients.45 In lipopolysaccharide and hyperoxia-induced acute lung injury models, febrile hyperthermia exposure is linked to a significant increase in neutrophil infiltration, thus facilitating the occurrence of pneumonia.46 47 In clinical practice, breath rate is frequently used as a screening tool for lower respiratory tract infections. The guideline defines tachypnoea as a breath rate greater than 20 breaths/min and recommends further evaluation.48 A study suggests that it may be feasible to distinguish between those who test positive for COVID-19 and those who have symptoms but test negative for the virus based on breath rate variability.49 INR may be associated with the haemostatic balance (antithrombotic–profibrinolytic) within the alveoli. In acute lung injury and fibrotic lung diseases, this balance tilts significantly towards procoagulant activity and inhibition of fibrinolysis, leading to the accumulation of fibrin in the extravascular alveoli and the formation of a hyaline membrane, which is a characteristic feature of acute lung injury/acute respiratory distress syndrome (ALI/ARDS).50

This study established and validated a predictive model for pulmonary infection risk in ICU patients with TBI, which holds significant clinical implications. First, survival curve results demonstrated a detrimental impact of pulmonary infection on the survival of patients with TBI, emphasising the importance of early identification and management of pulmonary infection. Second, the study identified independent risk factors contributing to pulmonary infection, such as race, respiratory rate, temperature, mechanical ventilation, antibiotic use and CHF, providing valuable therapeutic insights for clinicians. By identifying relevant risk factors, healthcare providers can implement preventive measures and treatment strategies to reduce the incidence of pulmonary infection and improve patient outcomes. Additionally, the predictive nomogram serves as a practical tool for risk stratification and decision-making in clinical practice. Healthcare providers can use the nomogram to calculate individualised risk scores for patients with TBI, aiding in early identification and focused intervention for high-risk patients. Nonetheless, this study has certain limitations. First, as this is a retrospective cohort study, the findings may have been affected by the exclusion of vital sign and biochemical variables with more than 20% missing values. Second, we carried out only internal validation using this database; external validation, for example based on our own data, is required in future research to confirm the robustness and efficacy of the nomogram. Finally, certain important factors, such as C reactive protein and cytokine levels, were left out of the analysis because of the restricted variety of variables in the public database.

In summary, the predictive nomogram developed in this study fills a gap in clinical practice in the field, providing clinicians with an effective tool for predicting the risk of secondary pulmonary infection in patients with TBI in the ICU. Early identification of high-risk patients and targeted interventions can improve patient outcomes and alleviate the burden of pulmonary infection in susceptible populations. However, further research and validation are needed to confirm the utility and applicability of this nomogram.