## Discussion

In this study, we derived a set of robust computational models for survival prediction in MPM. To our knowledge, this is the first MPM study to use Lasso regression analysis, as recommended in the TRIPOD statement.10 In a training set of 169 cases, we defined a prognostic OS signature based on WCC, serum albumin, PS and age, and successfully validated this in a reserved set of 100 cases. We dichotomised the outcomes of this model to create <6-month and <12-month OS models. These incorporated the four original predictors and also assigned high predictor weights to epithelioid histology (both models), platelet count (<6-month model) and CRP level (<12-month model).

At validation, each model performed better than would be expected by chance, as indicated by 95% CI lower limits above zero for D_{XY} (model 1) and above 0.5 for AUC (models 2 and 3). However, the overall predictive value of each model was relatively poor. This is best reflected by the quantitative D_{XY} score, which was only 0.221 (95% CI 0.0935–0.346) in the validation set, suggesting that the concordance between the observed and predicted survival outcomes was only 22% better than would be expected by chance. Similarly, the observed sensitivities and specificities of the <6-month and <12-month OS models (<6 months: 74% sensitivity and 68% specificity; <12 months: 63% sensitivity and 79% specificity) are insufficient to be of reliable clinical value given the potential impact of adverse survival predictions. Such predictions might lead to a patient being advised against an attempt at palliative chemotherapy or involvement in a clinical trial, and would frequently result in considerable emotional distress. Future Lasso regression models, incorporating much denser MPM phenotyping (eg, genomic data and volumetric tumour imaging), should seek to exceed these metrics to deliver clinically useful prognostic tools. Ideally these would deliver highly individualised survival predictions, such as those recently reported in breast cancer and melanoma.21
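To illustrate why these sensitivities and specificities limit clinical use, the following minimal sketch converts them into positive and negative predictive values using Bayes' rule. The sensitivity and specificity figures are those reported above; the prevalence figures are hypothetical assumptions chosen purely for illustration and are not taken from this cohort.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary prediction using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Validation sensitivity and specificity as reported in the text.
REPORTED = {
    "<6 months": (0.74, 0.68),
    "<12 months": (0.63, 0.79),
}

# ASSUMPTION: hypothetical proportions of patients dying within each window,
# chosen only for illustration; they are NOT taken from this cohort.
ASSUMED_PREVALENCE = {"<6 months": 0.30, "<12 months": 0.50}

if __name__ == "__main__":
    for window, (sens, spec) in REPORTED.items():
        ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE[window])
        print(f"{window}: PPV={ppv:.2f}, NPV={npv:.2f}")
```

Under any plausible prevalence, a meaningful share of adverse predictions would be false positives, which is the concern raised above regarding treatment decisions.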

### Model composition and comparison with previous studies

Our primary OS signature assigned high weights to four predictor variables: WCC, serum albumin, PS and age. The Lasso method penalises the inclusion of large numbers of predictor inputs, so signatures are minimised as part of the process. Within our analyses, retention of additional variables beyond these four offered no discriminative advantage (see figure 1A). These four key predictors were retained in the dichotomised outcome models for 6-month and 12-month survival, but these models benefited from additional retention of histological subtype (epithelioid reducing the probability of death) and a measure of systemic inflammation, which increased the probability of death (platelet count in model 2 (survival <6 months) and CRP level in model 3 (survival <12 months); see table 2). The content of these signatures is generally concordant with previous MPM studies, which have consistently demonstrated the prognostic impact of age,22 PS,8 albumin,23 WCC,20 epithelioid subtype,24–27 CRP28–30 and platelets.31 Our models also closely resemble the two best validated MPM prognostic scores, the CALGB score7 and the EORTC score.8 Our results are based on unselected registry data analysed using Lasso regression, whereas these studies involved highly selected clinical trial populations and different statistical methods. The concordance between them emphasises the apparently universal prognostic importance of WCC, serum albumin, PS, age and histological subtype.7 8 32
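The way in which the Lasso penalty minimises a signature can be sketched with the soft-thresholding operation used by coordinate-descent Lasso solvers. This is a simplified illustration, not the code used in this study: it assumes a linear model with standardised, uncorrelated predictors, for which the Lasso estimate is the soft-thresholded unpenalised estimate. The coefficient values and penalty levels are invented for illustration.

```python
def soft_threshold(estimate, penalty):
    """Shrink an estimate towards zero by `penalty`; return 0.0 if it is smaller."""
    if estimate > penalty:
        return estimate - penalty
    if estimate < -penalty:
        return estimate + penalty
    return 0.0


# ASSUMPTION: hypothetical unpenalised coefficients for standardised predictors.
# The names echo candidate predictors in the text; the numbers are invented.
UNPENALISED = {
    "WCC": 0.40, "albumin": -0.35, "PS": 0.30, "age": 0.25,
    "NLR": 0.08, "aspirin": -0.03, "deprivation": 0.02,
}


def lasso_path(coefficients, penalties):
    """Map each penalty to the predictors that keep a non-zero coefficient."""
    path = {}
    for penalty in penalties:
        shrunk = {k: soft_threshold(v, penalty) for k, v in coefficients.items()}
        path[penalty] = {k: v for k, v in shrunk.items() if v != 0.0}
    return path


if __name__ == "__main__":
    for penalty, retained in lasso_path(UNPENALISED, [0.0, 0.1, 0.5]).items():
        print(penalty, sorted(retained))
```

As the penalty rises, weak predictors are removed entirely rather than merely down-weighted, which is why the final signature contains only a small number of variables. In practice the penalty would be chosen by cross-validation rather than fixed by hand.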

Our models also closely resemble the Brims model, in which the key prognostic variables were PS, serum albumin, histological subtype, weight loss and haemoglobin (Hb) concentration.9 We did not select Hb as a candidate predictor for the current study because the prognostic impact of Hb levels has been contradictory in previous MPM studies, which have reported negative,33 positive34 and no prognostic association.35 In the current study, integrated measures of systemic inflammation, such as NLR, PLR and mGPS, appeared less prognostically important than some previous studies have suggested.13 14 Meta-analyses in lung and other cancers have also previously suggested that socioeconomic deprivation is associated with less access to treatment,34 increased comorbidity and poorer outcomes.36 37 Similar studies in MPM have been inconclusive38 39 and we failed to identify deprivation as a major prognostic factor in this study. We included aspirin use as a candidate predictor given the potential link between cyclo-oxygenase biology40 and MPM survival and the HMGB1 pathway.16 However, we found no evidence of a clinically important prognostic effect. Serum and pleural fluid biomarkers (eg, mesothelin) are not routinely used in MPM as they offer no reliable prognostic information41 and were not considered here.

Subsequent chemotherapy administration was not included as a candidate predictor since this was not a baseline factor. Of note, only 67/269 patients (24.9%) received chemotherapy over subsequent follow-up, contrasting markedly with previous prognostic model studies (61.4%–100% of patients received chemotherapy in the Brims,9 EORTC7 and CALGB studies8). In a previous Dutch registry series, increased age was associated with decreased chemotherapy use.42 The mean age in our cohort (73 years) was higher than in the Dutch series (68 years) and age may have been a factor in the chemotherapy rate reported. However, median age in recent English national audit data (75 years) was similar to ours, yet chemotherapy use was higher (36.5%).43 It therefore appears highly likely that other factors are involved.

### Model performance and comparison with previous studies

In the recent study reported by Brims *et al*,9 which used decision tree analysis, the C-statistic was used to assess model performance (validation C-statistic: 0.68 (95% CI 0.60 to 0.75)). This metric is numerically equivalent to the AUC score20 used here to describe the performance of the dichotomised models for <6-month and <12-month survival (validation AUC 0.74 (0.638–0.836) and 0.794 (0.688–0.883), respectively), and similar to the censoring-adjusted C-statistic used here to assess our primary OS signature (validation C-statistic 0.6106 (0.5468–0.673)). These performance metrics are broadly similar and are consistently below the AUC/C-statistic threshold (>0.8) generally required of a strong survival model.20 The performance of the EORTC and CALGB scores cannot be directly compared with the currently reported models because the primary metrics used to describe these were HRs, reporting the relative risk of death between different risk groups.
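The equivalence between the C-statistic and the AUC for a binary outcome can be illustrated with a minimal sketch. It computes the proportion of case/non-case pairs in which the case received the higher predicted risk, counting ties as half, which is the definition of both quantities when there is no censoring. The function name and the toy data are hypothetical; the censoring-adjusted C-statistic used for the primary OS signature additionally restricts the comparison to usable pairs and is not shown.

```python
def auc_by_concordance(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    `scores` are predicted risks; `labels` are 1 for an event (eg, death
    within the window) and 0 otherwise. Tied scores count as half.
    """
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # ASSUMPTION: toy predicted risks and outcomes, invented for illustration.
    toy_scores = [0.9, 0.8, 0.3, 0.2]
    toy_labels = [1, 0, 1, 0]
    print(auc_by_concordance(toy_scores, toy_labels))
```

A value of 0.5 corresponds to chance ranking and 1.0 to perfect ranking, which is why both metrics share the same >0.8 benchmark referred to above.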

Based on these comparable performance metrics, the Lasso regression models reported here appear to offer similar prognostic performance to previous models and are based on many of the same predictors. The uniquely quantitative value of D_{XY} demonstrates that the routinely available clinical data used to define these models are fundamentally unable to describe the bulk of the variability in survival outcomes seen in real patients. This is reflected in a validation D_{XY} value for our primary OS signature of only 0.221, which equates to only a 22% improvement in concordance between the observed and predicted survival outcomes over that expected by chance.
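The relationship between the two metrics reported for the primary OS signature can be checked with a short worked example. Somers' D_{XY} is a linear rescaling of the C-statistic, D_{XY} = 2(C − 0.5), so that 0 corresponds to chance and 1 to perfect concordance. The sketch below applies this to the validation C-statistic and its interval as reported above; the function name is hypothetical.

```python
def dxy_from_c(c_statistic):
    """Convert a C-statistic (0.5 = chance) to Somers' Dxy (0 = chance)."""
    return 2.0 * (c_statistic - 0.5)


# Validation C-statistic and interval for the primary OS signature (from the text).
C_POINT, C_LOWER, C_UPPER = 0.6106, 0.5468, 0.673

if __name__ == "__main__":
    for label, c in (("point", C_POINT), ("lower", C_LOWER), ("upper", C_UPPER)):
        print(f"{label}: C={c:.4f} -> Dxy={dxy_from_c(c):.4f}")
```

To rounding, the converted values agree with the reported D_{XY} of 0.221 (0.0935–0.346), confirming that the two metrics describe the same modest level of concordance.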

### Methodological considerations and clinical applicability

Both decision tree analysis and multivariable logistic regression are prone to model overfitting.44 This may lead to poor model performance in new, external patient groups and limits the clinical utility of predictive modelling approaches in general. Lasso regression, combined with an appropriate cross-validation methodology, alleviates some of the problems of model overfitting45 and can be more readily upscaled to deal with more deeply phenotyped descriptor data. This makes the technique well suited to future prediction modelling in MPM incorporating these additional predictors. However, Lasso regression is associated with complex outputs and requires important data processing steps before new data can be analysed within the finalised model. We sought to overcome this by creating dichotomised outcome models predicting the probability of survival at 6 and 12 months, but a relatively simple electronic or web-based program would still be required to translate input predictor values into results interpretable to clinicians. This need not be developed until a model with sufficient precision and accuracy has been defined.
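The kind of simple program referred to above might look like the following minimal sketch. It assumes a dichotomised model expressed as a penalised logistic regression on standardised predictors, and shows the processing steps needed for a new patient: standardise each input using the training-set mean and SD, form the linear predictor, and convert it to a probability. Every number in `HYPOTHETICAL_MODEL` is invented for illustration and none is a coefficient from this study; the predictor names and function name are likewise hypothetical.

```python
import math

# ASSUMPTION: all means, SDs, coefficients and the intercept below are
# invented placeholders, NOT values estimated in this study.
HYPOTHETICAL_MODEL = {
    "intercept": -0.5,
    "predictors": {
        "age_years": {"mean": 70.0, "sd": 8.0, "coef": 0.30},
        "wcc": {"mean": 9.0, "sd": 3.0, "coef": 0.40},
        "albumin": {"mean": 35.0, "sd": 5.0, "coef": -0.35},
        "performance_status": {"mean": 1.0, "sd": 0.8, "coef": 0.30},
    },
}


def predict_probability(patient, model=HYPOTHETICAL_MODEL):
    """Return the modelled probability of the adverse outcome for one patient."""
    linear_predictor = model["intercept"]
    for name, spec in model["predictors"].items():
        if name not in patient:
            raise KeyError(f"missing predictor: {name}")
        # Standardise with training-set statistics before applying the weight.
        z = (patient[name] - spec["mean"]) / spec["sd"]
        linear_predictor += spec["coef"] * z
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    example = {"age_years": 78.0, "wcc": 12.0, "albumin": 30.0,
               "performance_status": 2.0}
    print(f"Predicted probability: {predict_probability(example):.2f}")
```

Storing the training-set means and SDs alongside the coefficients is the important design point: without them, new inputs cannot be placed on the scale on which the model was fitted.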

### Study limitations

This study involved retrospective data collection for some of the variables, although many were prospectively collected as outputs from a regional mesothelioma MDT. Nevertheless, this design introduces potential recall and omission bias. The latter might be important since the cases were identified from a pathology department archive; frail patients in whom a histological diagnosis was not pursued will therefore not have been included. In addition, validation was performed using an internal cohort, and further external validation is required to confirm the generalisability of the models created. Our analysis is also limited by a substantial number of cases with missing data for some variables. The influence of these missing data was minimised by imputation and by exclusion of variables with too many missing values (eg, fluid LDH).

### Conclusions and future studies

Prognostic models are increasingly used in medicine to investigate patient outcome in relation to patient and disease characteristics. Such models should have sound statistical and clinical validity, rely on a limited number of objective parameters and be generalisable to a heterogeneous group of patients.45 Most studies describing the natural history and prognostic factors for MPM antedate accurate pathological diagnosis, optimal staging22 and a range of emerging predictors, including genomic data. This study suggests that routinely available clinical data alone are insufficient to accurately predict prognosis in MPM. The computational models defined here are suitable for expansion and upscaling using genomic data and other predictors, for example, volumetric imaging results.