Long COVID exhibits clinically distinct phenotypes at 3–6 months post-SARS-CoV-2 infection: results from the P4O2 consortium
  1. Jelle M Blankestijn1,
  2. Mahmoud I Abdel-Aziz1,2,
  3. Nadia Baalbaki1,
  4. Somayeh Bazdar1,
  5. Inés Beekers3,
  6. Rosanne J H C G Beijers4,5,
  7. Lizan D Bloemsma1,
  8. Merel E B Cornelissen1,
  9. Debbie Gach4,5,
  10. Laura Houweling1,6,
  11. Sebastiaan Holverda7,
  12. John J L Jacobs3,
  13. Reneé Jonker1,
  14. Ivo van der Lee8,
  15. Paulien M A Linders1,
  16. Firdaus A A Mohamed Hoesein9,
  17. Lieke C E Noij1,
  18. Esther J Nossent1,
  19. Marianne A van de Pol1,
  20. Daphne W Schaminee1,
  21. Annemie M W J Schols5,10,
  22. Lisanne T Schuurman4,5,
  23. Brigitte Sondermeijer8,
  24. J J Miranda Geelhoed11,
  25. Joop P van den Bergh4,12,
  26. Els J M Weersink1,
  27. Yolanda de Wit-van Wijck1 and
  28. Anke H Maitland-van der Zee13,14
  29. on behalf of the P4O2 consortium
  1. 1Department of Pulmonary Medicine, Amsterdam UMC Locatie AMC, Amsterdam, The Netherlands
  2. 2Department of Clinical Pharmacy, Assiut University Faculty of Pharmacy, Assiut, Egypt
  3. 3ORTEC, Zoetermeer, Zuid-Holland, The Netherlands
  4. 4Department of Respiratory Medicine, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht University Medical Centre+, Maastricht, The Netherlands
  5. 5Universiteit Maastricht School of Nutrition and Translational Research in Metabolism, Maastricht, The Netherlands
  6. 6Department of Environmental Epidemiology, Utrecht University Institute for Risk Assessment Sciences, Utrecht, The Netherlands
  7. 7Longfonds, Amersfoort, Utrecht, The Netherlands
  8. 8Department of Pulmonology, Spaarne Gasthuis, Haarlem, The Netherlands
  9. 9Department of Radiology, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  10. 10Department of Respiratory Medicine, NUTRIM School for Nutrition, Toxicology and Metabolism, Maastricht University Medical Center+, Maastricht, The Netherlands
  11. 11Department of Respiratory Medicine, Leiden University Medical Center, Leiden, The Netherlands
  12. 12Department of Internal Medicine, VieCuri Medical Centre, Venlo, The Netherlands
  13. 13Department of Respiratory Medicine, Amsterdam UMC, Amsterdam, The Netherlands
  14. 14Department of Pediatric Respiratory Medicine, Emma Childrens' Hospital UMC, Amsterdam, The Netherlands
  1. Correspondence to Jelle M Blankestijn; j.m.blankestijn{at}


Background Four months after SARS-CoV-2 infection, 22%–50% of COVID-19 patients still experience complaints. Long COVID is a heterogeneous disease and finding subtypes could aid in optimising and developing treatment for the individual patient.

Methods Data were collected from 95 patients in the P4O2 COVID-19 cohort at 3–6 months after infection. Unsupervised hierarchical clustering was performed on patient characteristics, characteristics from acute SARS-CoV-2 infection, long COVID symptom data, lung function and questionnaires describing the impact and severity of long COVID. To assess robustness, partitioning around medoids was used as alternative clustering.

Results Three distinct clusters of patients with long COVID were revealed. Cluster 1 (44%) represented predominantly female patients (93%) with pre-existing asthma and suffered from a median of four symptom categories, including fatigue and respiratory and neurological symptoms. They showed a milder SARS-CoV-2 infection. Cluster 2 (38%) consisted of predominantly male patients (83%) with cardiovascular disease (CVD) and suffered from a median of three symptom categories, most commonly respiratory and neurological symptoms. This cluster also showed a significantly lower forced expiratory volume within 1 s and diffusion capacity of the lung for carbon monoxide. Cluster 3 (18%) was predominantly male (88%) with pre-existing CVD and diabetes. This cluster showed the mildest long COVID, and suffered from symptoms in a median of one symptom category.

Conclusions Long COVID patients can be clustered into three distinct phenotypes based on their clinical presentation and easily obtainable information. These clusters show distinction in patient characteristics, lung function, long COVID severity and acute SARS-CoV-2 infection severity. This clustering can help in selecting the most beneficial monitoring and/or treatment strategies for patients suffering from long COVID. Follow-up research is needed to reveal the underlying molecular mechanisms implicated in the different phenotypes and determine the efficacy of treatment.

  • COVID-19

Data availability statement

Data are available upon reasonable request. The data are not publicly available due to agreements made by the consortium, that only allow access by each consortium partner to specific data that answers their prespecified research questions. A request for access to data by organisations outside of the consortium can be submitted to the P4O2 Data Committee (via and the research will need to be performed in collaboration with one of the P4O2 consortium partners.

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See:


  • 22%–50% of COVID-19 patients still experience complaints after 4 months.

  • Long COVID is a heterogeneous disorder and patients react differently to the same treatment.


  • Three distinct long COVID clusters were discovered based on patient characteristics/history, long COVID presentation, acute COVID-19 and lung function. These clusters are described and characterised in detail.


  • Clustering based on easily obtainable information and tests could support in selecting the best monitoring and treatment strategy for the individual patients.


Since 2019, over 750 million cases of COVID-19, caused by SARS-CoV-2, and 6.9 million deaths have been reported worldwide.1 In addition to the impact on physical health, the COVID-19 pandemic has had an enormous impact on mental health and the economy.2 3 Infections with SARS-CoV-2 can range from asymptomatic to severe, and complaints can include fever, headache, fatigue, cough, pneumonia and dyspnoea.4 However, after the initial infection, an estimated 22%–50% of patients still suffer from complaints after 4 months.5

Just like acute COVID-19, the presentation of patients with long COVID is highly heterogeneous.6 7 With these differences in disease manifestations, a one-size-fits-all management plan is not sufficient to treat all patients under the term long COVID. Unsupervised clustering is a way to group entities together with similar features without the need for training labels. Applying clustering to the patient population of long COVID can group together those patients with similar clinical presentation and potentially similar underlying molecular disease pathologies. Consequently, placing patients in one of these clusters would aid in selecting the optimal personalised monitoring and/or treatment strategy for that patient and improve their recovery. Performing high-throughput tests to find those similar molecular mechanisms in a clinical care setting is however not achievable due to the economic costs and manpower required. Thus, there is a need to cluster patients according to their clinical presentation and by using easily performable tests and questionnaires.

Clustering of long COVID patients has been applied previously.8–11 However, these studies vary widely in the data used, time since initial infection, number of clusters and cluster characteristics. Mostly, these clusters are based solely on the long COVID symptoms and patient characteristics, sometimes with additional data about the acute infection. In the Precision Medicine for more Oxygen (P4O2) COVID-19 study12 we expanded on this information by adding questionnaires about the impact, severity and consequences of the disease and adding lung function tests to non-invasively gain more information. In addition, we performed CT imaging and lab work for cluster interpretation. Finally, we also collected biological samples to complete follow-up experiments and discover the molecular mechanisms underlying the disease phenotypes.

In this study, we aim to perform clustering in the P4O2 COVID-19 cohort to investigate whether patients with long COVID exhibit distinct clinical phenotypes. This is achieved using easily obtainable information, such as clinical presentation, medical history, questionnaires and lung function testing.


Study design and patients

The P4O2 COVID-19 study is a multicentre observational study with the objective of identifying therapeutic biomarkers, personalised medicine or lifestyle interventions for the prevention and treatment of long COVID using a multi-omics approach.12 For this study, 95 patients were included in five hospitals across the Netherlands at 3–6 months after SARS-CoV-2 infection. Inclusion criteria were: (1) confirmed SARS-CoV-2 infection, (2) aged 40–65 years, (3) post-COVID-19 outpatient clinic appointment, (4) understanding of the Dutch language and (5) ability to provide informed consent. Patients were excluded if they were terminally ill or involved in another study with investigational or marketed products within 4 weeks prior to study inclusion.

Patient involvement

Patient representatives have participated during P4O2 consortium meetings to discuss the results and progress of the project and the implications for the patients. Furthermore, the patient advisory board of the department has been regularly updated regarding the status of the project.

Clinical assessment

During the study visit, patients were asked to provide general information, perform a lung function test and fill in multiple questionnaires. Different demographic and clinical characteristics were collected from the patients or medical records including information such as their sex, age and body mass index (BMI), their medical history regarding asthma, chronic obstructive pulmonary disease (COPD), diabetes and cardiovascular disease (CVD) and finally information about their initial SARS-CoV-2 infection in terms of whether they were hospitalised, hospital duration, oxygen supplementation, WHO severity classification, and whether they suffered from a pulmonary embolism or thrombosis during hospitalisation. The dominant virus type was determined by the SARS-CoV-2 variant most abundant in the Netherlands at the week of the main infection, as determined by the Rijksinstituut voor Volksgezondheid en Milieu.

During this visit, patients also performed a lung function test consisting of spirometry, measurement of the diffusion capacity of the lung for carbon monoxide (DLCO), and underwent a CT scan. Spirometry consisted of the forced expiratory volume within 1 s (FEV1), forced vital capacity (FVC) and the Tiffeneau index (FEV1/FVC). FEV1, FVC and DLCO were used as percent predicted based on sex, BMI, ethnicity and age. For the FEV1 and FVC, the metric was considered abnormal if the percent predicted fell below 90%. For the FEV1/FVC and DLCO, a threshold of 70% (predicted) was used. CT scans were examined by a radiologist in the local hospital.
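The abnormality thresholds described above can be sketched as a small lookup. This is only an illustration in Python (the study itself was analysed in R); the dictionary keys, the function name and the example values are hypothetical, and the cut-offs are taken as stated in the text.

```python
# Cut-offs as stated in the text: a value below the threshold counts as abnormal.
# Key names are illustrative, not taken from the study's data dictionary.
ABNORMAL_BELOW = {
    "FEV1": 90.0,      # percent predicted
    "FVC": 90.0,       # percent predicted
    "FEV1/FVC": 70.0,  # Tiffeneau index, in percent
    "DLCO": 70.0,      # percent predicted
}


def flag_abnormal(measurements):
    """Return {metric: True/False/None}; None marks a missing measurement."""
    flags = {}
    for metric, threshold in ABNORMAL_BELOW.items():
        value = measurements.get(metric)
        flags[metric] = None if value is None else value < threshold
    return flags


# Hypothetical patient, not taken from the cohort.
example = {"FEV1": 85.0, "FVC": 95.0, "FEV1/FVC": 72.0, "DLCO": None}
print(flag_abnormal(example))
```

Missing measurements are kept as None rather than treated as normal, so that they can be handled by the imputation step instead of being silently classified.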

Questionnaires about the severity of symptoms and impact on daily life were also provided to the patients. These included the Fatigue Severity Scale (FSS),13 Patient-Reported Outcomes Measurement Information System (PROMIS),14 Primary care PTSD Screen for DSM-5 (PC-PTSD-5),15 EuroQoL 5D-5L (EQ5D)16 and the Checklist for Cognitive Consequences after an ICU admission (CLC-IC) (adapted version of CLCE-2417). In addition, the Utrecht Scale for Evaluation of Revalidation-Participation (USER-P)18 and the Hospital Anxiety and Depression Scale (HADS)19 questionnaires were administered. Finally, patients were questioned about their symptoms during the first visit and the first monthly questionnaire at home. Complaints were then summarised into the following categories: fatigue, respiratory, gastrointestinal, neurological, cardiovascular and other.

Statistical analysis

The data used for clustering are summarised in online supplemental table S1. These variables were chosen for their ability to describe long COVID in a non-invasive manner using easily attainable information and testing. The specific comorbidities were included either because they are respiratory related or because their prevalence was high enough for the variable to contribute to the clustering (≥15 patients). CT scans were not used for clustering because of their radiation risk when applied in a clinical setting, and were used in this cohort only for interpretation. The USER-P and HADS questionnaires were not used because of their high correlation with other questionnaires and to limit the reliance of the clustering on questionnaires alone. The remaining missing data were imputed using Multiple Imputation by Chained Equations as implemented by the mice R package (V.3.14.0).20 Unordered categorical variables were imputed using logistic regression, ordered categorical variables were imputed with a proportional odds model and numerical variables were imputed using predictive mean matching. To account for uncertainty due to missing data, 100 different complete imputed data sets were generated for clustering.

For each imputed data set, the pairwise distance between patients was calculated with the Gower distance for mixed data types. A hierarchical dendrogram was constructed using the Ward.D2 construction method in the hclust function from the cluster package (V.2.1.2).21 This method was used because of the visual feedback it gives on cluster separation and because of its deterministic results, in contrast to partitioning around medoids (PAM). Based on a visual inspection of the dendrogram, all dendrograms were divided into three separate clusters. The resulting clustering was saved for each data set, and the Gower distance was applied to these cluster assignments to get a similarity index over all different clustering solutions. As for the individual data sets, a dendrogram was then constructed and cut to create three distinct phenotypical clusters. Robustness of the clustering was assessed by applying PAM as a clustering method on the Gower distance in the same manner as described above, where similarity between the PAM and hierarchical clusters was determined with the Rand index.
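As a rough illustration of the two distance steps described above, the following Python sketch computes a Gower distance between records with mixed variable types and then a consensus distance over several clusterings. It is a simplified sketch, not the study's R code: ordered categorical variables are not treated separately here, and all function names, variable names and example values are hypothetical.

```python
def numeric_ranges(records, kinds):
    """Observed range (max - min) of every numeric column; None for other columns."""
    ranges = []
    for k, kind in enumerate(kinds):
        if kind == "num":
            values = [r[k] for r in records if r[k] is not None]
            ranges.append(max(values) - min(values) if values else 0.0)
        else:
            ranges.append(None)
    return ranges


def gower_distance(a, b, kinds, ranges):
    """Mean per-variable dissimilarity between records a and b.

    Numeric variables contribute |a - b| / range, categorical variables
    contribute 0 on a match and 1 on a mismatch. Variables with a missing
    value (None) in either record are skipped.
    """
    total, used = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue
        if kind == "num":
            total += abs(x - y) / rng if rng else 0.0
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("nan")


def consensus_distance(labelings):
    """Fraction of clusterings in which two patients get different labels.

    labelings holds one list of cluster labels per imputed data set. This is
    the Gower distance when every labeling is treated as a categorical variable.
    """
    n, m = len(labelings[0]), len(labelings)
    return [
        [sum(lab[i] != lab[j] for lab in labelings) / m for j in range(n)]
        for i in range(n)
    ]


# Hypothetical records: (sex, BMI, asthma). Not taken from the cohort.
patients = [("F", 32.0, True), ("M", 30.0, False), ("M", 28.0, False)]
kinds = ("cat", "num", "cat")
ranges = numeric_ranges(patients, kinds)
print(gower_distance(patients[0], patients[1], kinds, ranges))
print(consensus_distance([[1, 1, 2], [1, 2, 2], [1, 1, 2]]))
```

The consensus matrix can be passed to any hierarchical clustering routine that accepts a precomputed distance matrix, which mirrors the final dendrogram step described in the text.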

Patient characteristics were compared between the different clusters using statistical tests based on the distribution and type of the variable. Categorical variables were examined with a Fisher’s exact test. Numerical variables were assessed using an analysis of variance, if normally distributed by visual inspection, otherwise the Kruskal-Wallis test was used. Post-hoc tests of significant results were performed using pairwise Fisher’s exact tests or pairwise Wilcoxon rank-sum tests. All statistical tests were two-tailed. A Benjamini and Hochberg correction was applied to account for multiple testing, where an adjusted p value below 0.05 was considered statistically significant. All analyses were performed in R (V.4.1.2) using RStudio (V.2021.09.1+372).22
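The Benjamini and Hochberg correction mentioned above can be sketched in a few lines. This Python version is for illustration only (in R the same adjustment is available through p.adjust); the function name and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values, not taken from the study.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this made-up example only the smallest raw p value remains below 0.05 after adjustment, which shows how a raw p value of 0.03 or 0.04 can lose significance once multiple testing is taken into account.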


P4O2 COVID-19 cohort can be divided into three phenotypically similar clusters

The patient characteristics can be found in table 1. The patient population had an even distribution with regard to sex (49.5% female), had an average age of 54.1 (SD=6.2) years and was mostly overweight or obese, with an average BMI of 30.5 (SD=5.3). The most common symptom categories after 3–6 months included respiratory symptoms (78.9%), neurological symptoms (70.5%) and fatigue (69.5%). Based on visual inspection of dendrograms from individual imputed data sets (online supplemental figure S1), the 95 patients from the P4O2 COVID-19 cohort were divided into three clusters of 42, 36 and 17 patients. Cluster separation can be seen in a t-SNE plot in figure 1.

Table 1

Baseline characteristics of the P4O2 COVID-19 cohort

Figure 1

t-SNE plot showing the separation of the clusters projected on a 2-dimensional space.

Clusters showed differences in sex and patient comorbidities

The three clusters showed a similar distribution in age (table 2). However, there was an imbalance in the distribution of sex in the clusters. Cluster 1 contained predominantly female patients (92.9%), while the other two clusters contained predominantly males (83.3% and 88.2%). Patients from cluster 1 had a slightly higher BMI compared with the other clusters. However, this failed to reach statistical significance (31.7 compared with 29.9 and 28.5, p=0.13). Patients in cluster 1 also showed a significantly higher rate of asthma (29.3%) compared with cluster 2 (11.1%) and 3 (0.0%), while also showing lower rates of CVD (12.5% compared with 41.7% and 41.2%). Cluster 3 contained relatively more patients with pre-existing diabetes with 47.1%, which was significantly more than 2.4% in cluster 1 and 16.7% in cluster 2. In the entire cohort, these comorbidities were slightly correlated with sex (asthma: p=0.03, CVD: p=0.07, diabetes: p=0.09).

Table 2

Baseline characteristics of the P4O2 COVID-19 cohort with cluster separation

Clusters showed differences in the number of symptom categories per patient, lung function and long COVID severity

The symptom categories per patient in each cluster are summarised in figure 2. Patients in cluster 1 suffered from symptoms in relatively more categories (median of 4 symptom categories), patients in cluster 2 in a median of 3 categories and patients in cluster 3 in a median of 1 category. Patients in cluster 1 showed predominantly fatigue (97.6%) and respiratory symptoms (97.6%), along with high rates of neurological (81.0%) and gastrointestinal symptoms (50%). Patients in cluster 2 suffered mostly from respiratory (86.1%) and neurological symptoms (75%) and fatigue (63.9%). In cluster 3, neurological symptoms were the most common (35.3%). In terms of lung function, we found a significantly reduced FEV1 and DLCO for patients in cluster 2, while the difference in FVC narrowly failed to reach statistical significance (p=0.052).

Figure 2

Heatmap depicting the presence or absence of residual complaints in the symptom categories and of lung function abnormalities, divided into the clusters constructed in this study. This heatmap shows that patients in cluster 1 suffer from relatively more symptom categories, while patients in cluster 3 suffer from relatively few symptom categories. In addition, cluster 2 shows relatively more abnormalities in lung function compared with the other clusters.

Based on the questionnaire results, cluster 3 scored significantly better in terms of fatigue (FSS), physical, mental and social well-being (PROMIS), self-care (EQ5D), cognitive consequences after an ICU admission (CLC-IC), participation (USER-P) and depression (HADS) compared with both other clusters.

No significant differences separating the clusters in other variables

We were unable to find statistically significant differences between the clusters in the acute phase of SARS-CoV-2 infection. However, 9 of the 10 patients who were not hospitalised during this phase were placed in cluster 1. For the WHO severity classification, the analysis lacked the power to give a statistically significant result (p=0.063). No statistically significant differences between the clusters were found in terms of the dominant virus type at the time of infection, vaccination status, smoking status and education level. In terms of residual abnormalities found on CT imaging, airtrapping was significantly more common in cluster 1.

Clusters showed moderate stability with regard to clustering method

When PAM clustering was applied to the Gower distance instead of hierarchical clustering, the two clustering methods showed a Rand index of 0.62. While patients in the second and third clusters often stayed in the cluster to which they already belonged, patients from hierarchical cluster 1 were distributed over PAM clusters 1 and 2. This is also visible in online supplemental figure S2, which depicts a t-SNE plot of this cluster separation. Hierarchical clustering is a better fit to these data, showing more defined clusters and better cluster separation than PAM clustering.
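For readers unfamiliar with the Rand index, the following minimal Python sketch shows how agreement between two clusterings can be computed as the share of patient pairs on which they agree. It assumes the unadjusted Rand index; the text does not state whether an adjusted variant was used. The function name and the example labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of item pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two items
    together, or both put them apart. Label values themselves do not need
    to match between the two clusterings.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    if len(labels_a) < 2:
        raise ValueError("at least two items are needed to form a pair")
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        pairs += 1
    return agree / pairs


# Hypothetical labels for six patients, not taken from the cohort.
hierarchical = [1, 1, 1, 2, 2, 3]
pam = [1, 1, 2, 2, 2, 3]
print(rand_index(hierarchical, pam))
```

A value of 1 means the two methods produce the same partition, so a value of 0.62 indicates that the methods disagree on a substantial share of patient pairs.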


In this study, we aimed to cluster the patients suffering from long COVID in the P4O2 COVID-19 cohort into similar clinical phenotypes using easily obtainable information. Cluster 1 consisted of predominantly females with slightly higher BMI and pre-existing asthma. They suffered from complaints in a median of four symptom categories, most commonly fatigue and respiratory and neurological symptoms. Patients in this cluster also showed a milder acute SARS-CoV-2 infection, and showed signs of airtrapping on a CT scan more often. Patients in cluster 2 were predominantly male with pre-existing CVD. They suffered from a median of three symptom categories, most commonly respiratory and neurological symptoms. They also showed a significantly reduced FEV1 and DLCO. Patients in cluster 3 were also predominantly male with pre-existing CVD and diabetes. They showed significantly fewer symptoms, with a median of one symptom category. They also scored significantly better on nearly all questionnaires.

In our clustering, we found that cluster 1 was predominantly female with severe long COVID but a milder acute infection compared with the other clusters. Research has indeed shown that being male is a risk factor for severe COVID-19,23 while risk factors for long COVID include being female.24 This same pattern was also discovered in other clustering efforts.8–10 In addition, it has been reported that pre-existing asthma is a risk factor for developing long COVID.24 In our cohort, 17% of patients suffer from asthma, which is higher than the 6% prevalence of asthma in the Dutch population.25 In the most severe cluster, asthma was even more common at 29% of patients, suggesting a potential link between asthma and long COVID severity, potentially mediated by a reduction in asthma control after SARS-CoV-2 infection.26 27

Several factors involved in the severity of acute COVID-19 were not found to differ statistically between the clusters. These include vaccination status, smoking status and the SARS-CoV-2 virus type. In the literature there has been conflicting information about the impact of vaccination on long COVID, with some studies showing no impact of vaccination on the development of long COVID, while other studies showed a reduced risk of long COVID after vaccination.28 In addition, getting vaccinated after already developing long COVID also showed no influence on long COVID severity, as found in a study where 17% of patients showed a decrease in severity, while in 21% the severity increased.29 We were also not able to draw any conclusions about the impact of smoking on the clustering. This is potentially a problem of power, as there are only four current smokers in our cohort. Many patients in our cohort were former smokers, with the time since they quit ranging from 11 to 480 months. Not being able to draw conclusions about the dominant virus type is also a result of low power. Research has shown that patients infected with the omicron variant are less likely to develop long COVID, and suffer from fewer symptoms compared with those infected with the delta variant.30 We did not see enough patients with either virus type to obtain enough power to draw these conclusions.

This study was not the first to perform clustering on a long COVID patient population. Similarly to other clustering efforts,8 10 11 we found that three clusters best describe the long COVID population. Likewise, these studies all distinguish a cluster that showed fewer symptoms compared with the other clusters and contains more males. Interestingly though, this same cluster is marked by having fewer comorbidities than the other clusters in two of these studies.10 11 In our mild cluster this was not a defining feature, with diabetes even being the most common comorbidity in this cluster. There has been conflicting literature about the relation between diabetes and long COVID,31 however, there does seem to be evidence that diabetes increases the risk of developing long COVID. One reason for the discrepancy between the comorbidities and cluster severity might be the correlation between comorbidities and sex in our cohort. As our clusters are heavily dependent on sex, this might have influenced the distributions of comorbidities as well. Other studies distinguished the remaining clusters based either on further severity or on several symptoms. In contrast, we found an important difference between clusters 1 and 2 to be in the lung function, where cluster 2 showed a markedly lower lung function compared with clusters 1 and 3. The other studies did also show hints of a cluster with a more respiratory axis at 3 months after infection. A larger study with only two clusters found shortness of breath more prevalent in one of their clusters,32 while in Fischer et al,11 shortness of breath was significantly more common in their moderate severity cluster compared with the severe cluster. However, none of these studies confirmed this further with pulmonary function testing.

Apart from the combination of respiratory complaints and low lung function in cluster 2 and the gastrointestinal complaints in cluster 1, we were not able to describe clusters based on particular symptom patterns other than quantity. One reason for this could be the classification of symptoms. Here the symptoms were grouped into systemic categories instead of using the presence of each symptom. This has both advantages and disadvantages. While we do lose the patterns of symptoms within each category, our method prevented the clustering from being dominated by many symptoms from a single category. In addition, while a systematic approach like that in Reese et al9 helps to describe each person, it will result in sparse data, which complicates clustering. Our method does not require specific symptoms to be questioned and can be applied more easily to other populations.

The strength of this study lies in the scope of information that we collected. This allowed us to view the patients in more detail, such as lung function and CT scans. Taking lung function into account distinguished between cluster 1 and cluster 2. However, due to the resources we require from patients, our sample size is relatively small compared with other long COVID clustering studies.8–11 Because of this, we did not have enough power to make conclusions about vaccination status, CT abnormalities besides airtrapping and level of education, where we do see potential differences between the clusters. Due to the nature of the study, we do not have information from before the SARS-CoV-2 infection. Consequently, we did not know whether lung function or radiological abnormalities were already present, potentially providing a bias for our clustering (lung function) or interpretation of the clustering.

Long COVID is a heterogeneous disease and here we clustered those patients into phenotypically similar clusters based on information that is easily obtainable from the patient characteristics, medical history, clinical presentation, questionnaires and non-invasive tests. We discovered clusters that differ in severity of initial SARS-CoV-2 infection, long COVID characteristics and symptoms, sex distribution, lung function and comorbidities. Because they rely on information that is easily obtainable in a clinical setting, these clusters could help differentiate patients into groups with similar underlying disease and can help optimise treatments for the individual patient. However, to get to this point, more research is needed to find underlying molecular pathologies for each cluster, and the efficacy of treatments has to be established for each cluster. In long COVID patients with pulmonary function abnormalities, pulmonary rehabilitation has been shown to increase lung function and quality of life, while also decreasing symptoms of fatigue and dyspnoea.33 That treatment might be of particular interest for patients in cluster 2, which showed lung function abnormalities. This study provides a starting point for using information about the patients to research underlying pathways and, with that knowledge, to select the best monitoring and/or treatment strategy for a personalised medicine approach in long COVID.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by the ethical committee of the Amsterdam University Medical Center (NL74701.018.20). Participants gave informed consent to participate in the study before taking part.


The P4O2 study is a consortium effort and we wish to acknowledge the help and expertise of the individuals and groups who participated. This list can be found in online supplemental files. In addition, we would like to thank all patients who put their time and effort in to participate in the P4O2 study.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Collaborators Collaborators' author names are presented in online supplemental file 2.

  • Contributors JMB, MIA-A and AHM conceptualised this study. All authors were involved in creating the study design. NB, SB, IB, RJHCGB, LDB, MEBC, JJLJ, DG, LH, RJ, IvdL, PMAL, LCEN, EN, MAvdP, DWS, AMWJS, LST, BS, JJMG, JPvdB, EJMW, YvW participated in data collection and management. JMB, MIA-A and AHM were involved in the statistical analyses. MIA-A and AHM were involved in supervision. AHM acts as guarantor for the study. JMB wrote the first manuscript draft. All authors participated in reviewing and editing the manuscript and approved its submission.

  • Funding Partners in the Precision Medicine for more Oxygen (P4O2) consortium are the Amsterdam UMC, Leiden University Medical Center, Maastricht UMC+, Maastricht University, UMC Groningen, UMC Utrecht, Utrecht University, TNO, Aparito, Boehringer Ingelheim, Breathomix, Clear, Danone Nutricia Research, Fluidda, MonitAir, Ncardia, Ortec B.V., Philips, Proefdiervrij, Quantib-U, RespiQ, Roche, Smartfish, SODAQ, Thirona, TopMD, Lung Alliance Netherlands (LAN) and the Lung Foundation Netherlands (Longfonds). The consortium is additionally funded by the PPP Allowance made available by Health~Holland, Top Sector Life Sciences & Health (LSHM20104; LSHM20068), to stimulate public-private partnerships and by Novartis.

  • Competing interests MIA-A was funded by a full PhD scholarship from the Ministry of Higher Education of the Arab Republic of Egypt during the conduct of the study and received a grant from Stichting Astma Bestrijding. AHM received money paid to the institution from the ZonMW grant long COVID, the Stichting TAAI research grant, the EUROSTARS research grant COPDetect, an unrestricted research grant from Boehringer Ingelheim, an unrestricted research grant from the Vertex Innovation Award, a Dutch Lung Foundation grant, a Stichting Asthma Bestrijding grant and the Innovative Medicine Initiative (IMI) 3TR research grant. AHM received consulting fees from Astra Zeneca and Boehringer Ingelheim and received a honorarium for a lecture from GSK. AHM is the (unpaid) chair of the DSMB SOS BPD study, an advisory board member of the CHAMP study, president of the federation of innovative drug research in the Netherlands (FIGON) and president of the European Association of systems medicine (EASYM).

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.