
Incidence and prevalence of interstitial lung diseases worldwide: a systematic literature review
  1. Rikisha Shah Gupta1,2,
  2. Ardita Koteci3,4,
  3. Ann Morgan3,4,
  4. Peter M George5 and
  5. Jennifer K Quint1,3
  1. 1National Heart and Lung Institute, Imperial College London, London, UK
  2. 2Real-World Evidence, Gilead Sciences, Foster City, CA, USA
  3. 3Imperial College London, London, UK
  4. 4NIHR Imperial Biomedical Research Centre, London, UK
  5. 5Royal Brompton and Harefield NHS Foundation Trust, London, UK
  1. Correspondence to Rikisha Shah Gupta; r.shah20{at}


Interstitial lung disease (ILD) is a collective term representing a diverse group of pulmonary fibrotic and inflammatory conditions. Due to the diversity of ILD conditions, paucity of guidance and updates to diagnostic criteria over time, it has been challenging to precisely determine ILD incidence and prevalence. This systematic review provides a synthesis of published data at a global level and highlights gaps in the current knowledge base. Medline and Embase databases were searched systematically for studies reporting incidence and prevalence of various ILDs. Randomised controlled trials, case reports and conference abstracts were excluded. In total, 80 studies were included. The most described subgroup was autoimmune-related ILD, and the most studied conditions were rheumatoid arthritis (RA)-associated ILD, systemic sclerosis (SSc)-associated ILD and idiopathic pulmonary fibrosis (IPF). The prevalence of IPF was mostly established using healthcare datasets, whereas the prevalence of autoimmune ILD tended to be reported in smaller autoimmune cohorts. The prevalence of IPF ranged from 7 to 1650 per 100 000 persons. Prevalence of SSc ILD and RA ILD ranged from 26.1% to 88.1% and 0.6% to 63.7%, respectively. Significant heterogeneity was observed in the reported incidence of various ILD subtypes. This review demonstrates the challenges in establishing trends over time across regions and highlights a need to standardise ILD diagnostic criteria. PROSPERO registration number: CRD42020203035.

  • Asbestos Induced Lung Disease
  • Clinical Epidemiology
  • Interstitial Fibrosis
  • Systemic disease and lungs

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



Interstitial lung disease (ILD) is a collective term representing a diverse group of lung conditions characterised by the presence of non-infective infiltrates, most commonly in the pulmonary interstitium and alveoli, which in certain cases manifest as architectural distortion and irreversible fibrosis. These conditions vary in their aetiology, clinical pathways, severity and prognosis.1 Some conditions resolve completely without pharmacological intervention, whereas others, such as idiopathic pulmonary fibrosis (IPF) and non-IPF progressive fibrosing (PF) ILDs, inexorably progress to respiratory failure and premature mortality despite treatment.

Given its universally progressive nature and poor prognosis, IPF has attracted the most research attention, and the current literature suggests a wide variation in disease distribution across Europe and the USA. IPF prevalence varies between 0.63 and 7.6 per 100 000 persons in the USA and Europe2 3 with a sharp increase with age.

More recently, there have been several studies investigating the incidence and prevalence of non-IPF ILDs, mainly autoimmune ILDs. Most of these reviews included studies drawn from single centres. Epidemiological data for non-IPF ILDs are inconsistent, which makes it challenging to fully appreciate the ILD landscape. A recent review reported the prevalence of ILD in myositis conditions ranged from 23% in America to 50% in Asia.4 Sambataro et al5 reported about 20% of primary Sjogren’s syndrome patients were diagnosed with ILD. Additionally, there have been a few studies evaluating the incidence of drug induced ILD (DILD).6–8 Guo et al9 reported ILD incidence ranged from 4.6 to 31.5 per 100 000 persons in Europe and North America. A recent study using Global Burden of Disease data indicated that global ILD incidence rose by 51% between 1990 and 2019 (207.2 cases in 1990 to 313.2 cases per 100 000 in 2019).10 These published estimates highlight a discernible variation in the ILD epidemiology across countries. It is unclear whether this is an ‘actual’ difference in the numbers across regions or whether the heterogeneity is driven by lack of guidelines and inconsistencies in ILD diagnostic pathways and standards of care. Likewise, while evidence suggests that the incidence of ILD has been rising over time,9 it is unknown whether this reflects a true increase in the disease burden, possibly related to an ageing population, or whether it is due to improvements in detection, increased availability of cross-sectional imaging or changes in coding practices over time.

This systematic review appraises the published literature on the incidence and prevalence of various ILDs over the last 6 years. We aimed to provide a comprehensive understanding of global incidence and prevalence. Specifically, we sought to identify areas where data are robust, to better appreciate the burden of ILD conditions and to comprehend the implications on healthcare utilisation and resources. We also set out to highlight areas where there remains a need for further study.


Study registration

This protocol has been drafted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols guidelines11 and registered with the International Prospective Register of Systematic Reviews, PROSPERO (CRD42020203035). Please refer to the online supplemental material for the full study protocol.

Search strategy and selection criteria

A systematic search of Medline and Embase was carried out in September 2021 to identify relevant studies investigating the incidence and prevalence of various ILDs. The search criteria were developed with the support of a librarian (online supplemental figure E1). Due to the high volume of papers, we restricted the study period to papers published in the past 6 years. The search was limited to human studies written in English that were published between 2015 and 2021. The full search strategy and data sources included are described in the online supplemental material.

Study population

We included observational studies reporting the incidence and/or prevalence of individual ILDs, with study participants aged over 18 years. Randomised controlled trials, case reports, reviews and conference abstracts were excluded. Studies which referred to DILD only were excluded because (1) there were many abstracts reporting on DILD, so this could be a standalone review and (2) the epidemiology of DILD was the subject of a recent systematic review.12 The first author (RG) screened all records by title and abstract. To begin with, the second reviewer (AK) independently screened 10% of all records. If there was a disagreement between RG and AK, an additional 15% were screened by AK. All studies identified as eligible for full text review were reviewed by RG, with AK reviewing 50% of eligible studies. Any disagreement was resolved through discussion with other authors, including an ILD expert. Reference lists of included studies were searched for additional literature.

Following full text review, RG carried out data extraction for eligible studies. AK independently extracted data for 25% of studies using the same template. RG assessed the quality of all included studies reporting incidence and/or prevalence using a modified Newcastle Ottawa Scale (NOS). There were two modified NOS scales, one for studies reporting prevalence and one for studies reporting incidence. AK independently assessed the quality of 25% of included studies. If there was a discrepancy between the data extraction and/or quality assessment conducted by RG and AK, then an additional 15% were extracted and/or reviewed by AK.
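The second-reviewer rule described above (an initial subset checked independently, with a further 15% checked if any disagreement arises) can be illustrated with a minimal Python sketch. The review does not state how the subsets were chosen, so random sampling is an assumption here, and the function names, the seed handling and the decision callables are all hypothetical.

```python
import random


def draw_sample(record_ids, fraction, seed, exclude=()):
    """Randomly draw `fraction` of all records, skipping any already drawn.

    Random selection is an assumption; the review does not say how the
    second reviewer's subset was chosen.
    """
    excluded = set(exclude)
    pool = [r for r in record_ids if r not in excluded]
    n = min(max(1, round(fraction * len(record_ids))), len(pool))
    return random.Random(seed).sample(pool, n)


def second_review(record_ids, first_decisions, second_decision,
                  initial_fraction=0.10, extra_fraction=0.15, seed=0):
    """Return (records seen by the second reviewer, records in dispute)."""
    checked = draw_sample(record_ids, initial_fraction, seed)
    disputed = [r for r in checked if second_decision(r) != first_decisions[r]]
    if disputed:
        # Any disagreement triggers a further independent check.
        extra = draw_sample(record_ids, extra_fraction, seed + 1, exclude=checked)
        disputed += [r for r in extra if second_decision(r) != first_decisions[r]]
        checked += extra
    return checked, disputed
```

For title and abstract screening, `initial_fraction` would be 0.10. For data extraction and quality assessment it would be 0.25, with the same 15% escalation.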

It was noted that for IPF, many authors adopted what they termed ‘broad’ and ‘narrow’ case definitions. For example, Raghu et al2 defined patients with International Classification of Disease, Ninth Revision (ICD-9) code 516.3 as a broadly defined case of IPF, and those who had this ICD-9 code alongside a claim for a surgical lung biopsy, transbronchial lung biopsy, or CT thorax as a narrowly defined case. We summarised the data using various reported case definitions. If multiple estimates were reported in a study, only the most recent estimate was included in this review.
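As an illustration of the ‘broad’ and ‘narrow’ case definitions attributed to Raghu et al above, the following Python sketch classifies a patient from claims data. It encodes only the two conditions stated in this review (ICD-9 code 516.3, plus a claim for a biopsy or CT thorax); the published algorithm may include further criteria. The `Patient` structure and the procedure labels are hypothetical.

```python
from dataclasses import dataclass, field

IPF_ICD9_CODE = "516.3"
# Hypothetical labels for the confirmatory procedures named in the text.
CONFIRMATORY_PROCEDURES = {
    "surgical_lung_biopsy",
    "transbronchial_lung_biopsy",
    "ct_thorax",
}


@dataclass
class Patient:
    diagnosis_codes: set
    procedure_claims: set = field(default_factory=set)


def classify_ipf_case(patient):
    """Return the strictest IPF case definition the patient satisfies."""
    if IPF_ICD9_CODE not in patient.diagnosis_codes:
        return "not a case"
    if patient.procedure_claims & CONFIRMATORY_PROCEDURES:
        # Narrow cases also meet the broad definition by construction.
        return "narrow"
    return "broad"
```

Because every narrow case is also a broad case, a prevalence estimate under the broad definition would count both groups, which is consistent with broad estimates being higher than narrow ones.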

There were two common themes around the reporting of prevalence: studies drawn from the general population reported prevalence per 100 000 persons, whereas studies drawn from multicentre or single-centre settings reported prevalence as the proportion of patients with ILD in the study cohort.
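The two reporting conventions differ only in their denominator and scaling, as this short Python sketch shows. The function names and the numbers in the usage note are hypothetical and are not taken from any included study.

```python
def prevalence_per_100k(cases, population):
    """Prevalence in a general population, per 100 000 persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


def prevalence_percent(cases, cohort_size):
    """Prevalence as the percentage of a clinical cohort with ILD."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return cases * 100 / cohort_size
```

For example, 42 cases in a population of 600 000 gives 7 per 100 000 persons, whereas 37 ILD cases in a cohort of 200 patients with an autoimmune condition gives 18.5%. The two figures are not directly comparable because the denominators describe different populations.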

For this review, we have classified ILDs based on aetiology, grouped by conditions linked to environmental or occupational exposures, conditions typified by granulomatous inflammation, autoimmune ILDs and ILDs with no known cause (online supplemental figure E2).1

Evidence synthesis

The initial plan for this review was to conduct a meta-analysis. However, due to high heterogeneity, we were unable to do so. Therefore, we have proceeded with data synthesis across the ILD subgroups.


Total number of included studies

The literature search yielded a total of 12 924 studies, of which 80 were included in this review. Online supplemental figure E3 demonstrates the selection process for all studies and highlights reasons for exclusion at each stage.

Although 80 unique publications were included, some papers explored the epidemiology of more than one ILD, so the total count of reported estimates is 88. Half of the included publications explored autoimmune-related ILDs (n=44/88) (online supplemental figure E4).

Geographically, ILD publications represented all major world regions, but were predominantly from Asia (n=30, 34.1%) and Europe (n=23, 26.1%) (figure 1).

Figure 1

Geographical distribution of publications included.

Studies reporting prevalence

Eight studies reported the prevalence of IPF in the general population. Prevalence of IPF was commonly reported applying ‘primary’, ‘broad’, ‘intermediate’ and/or ‘narrow’ case definitions. In the general population, the prevalence of IPF ranged from 7 to 1650 per 100 000 persons (table 1). When explored within various case definitions, the prevalence for ‘broad’ cases ranged from 11 (USA, 2010)2 to 1160 (USA, 2021)16; for ‘narrow’ cases, this ranged from 7 (USA, 2010)2 to 725 (USA, 2019).16 Only one study reported IPF prevalence in a multicentre study setting, at 8.6%.19

Table 1

Studies reporting IPF prevalence per 100 000 persons by various case definitions

Twelve studies reported estimates for non-IPF ILDs in the general population (online supplemental figure E5), with most of these conducted in the USA. The prevalence of systemic sclerosis (SSc) ILD in the general population ranged from 2.3 (Canada, 2018)20 to 19 (USA, 2017)21 per 100 000 persons. The highest SSc-ILD prevalence was reported in Medicare data which included patients aged 65 years and above.21 22 For rheumatoid arthritis (RA) ILD, prevalence in an RA Medicare cohort was 2%.23

Forty-six studies reported the prevalence of autoimmune-related ILD in cohorts of patients with an autoimmune condition or occupational ILD in workers with specific exposures. These studies primarily reported prevalence as a proportion, with the denominator representing patients with an autoimmune disorder or people working at a factory with exposure to certain agents, such as silica or asbestos (figure 2). Most of these estimates were drawn from cohorts at single or multiple tertiary centres, disease registries or a factory in the case of occupational ILD. Significant heterogeneity was noted in the reported prevalence of ILD associated with SSc, RA and Sjogren’s (figure 2). The prevalence of ILD in SSc ranged from 26.1% (Australia, 2015)36 to 88.1% (India, 2013).44 Similarly, Sjogren’s ILD ranged from 1% (Sweden, 2011)55 to 87.8% (Saudi Arabia, 2021).56 In addition to dissimilarities in the prevalence across various regions, we also observed variation within region-specific estimates. For example, the four studies47 50–52 which reported Sjogren’s ILD prevalence within China showed a 4-fold variation in magnitude (18.6% in 201147 to 78.6% in 2014).52 Likewise, for RA ILD, there was substantial variation in the reported prevalence in Egypt (0.8% vs 63.7%).31 32 Among the occupational-related ILDs (figure 2), silicosis was the most explored condition (n=8). Among these eight studies, there was considerable variation in the reported prevalence of silicosis. Souza et al61 reported an approximately 7-fold higher estimate of silicosis prevalence than that reported by Siribaddana et al (37% vs 5.6%, respectively).65
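The fold differences quoted above are simple ratios of the highest to the lowest reported estimate. A minimal Python sketch, with a hypothetical helper name, reproduces the arithmetic for the two examples given in the text.

```python
def fold_difference(a, b):
    """Ratio of the larger to the smaller of two positive estimates."""
    if a <= 0 or b <= 0:
        raise ValueError("estimates must be positive")
    return max(a, b) / min(a, b)


# Sjogren's ILD prevalence within China: 18.6% vs 78.6%
print(round(fold_difference(78.6, 18.6), 1))  # 4.2

# Silicosis prevalence: 37% vs 5.6%
print(round(fold_difference(37, 5.6), 1))  # 6.6
```

These ratios describe the spread of published estimates only. They do not adjust for differences in study setting, case definition or sample size.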

Figure 2

Studies reporting non-IPF prevalence as percentage of study population. DM, dermatomyositis; HP, hypersensitivity pneumonitis; IIP, idiopathic interstitial pneumonia; ILD, interstitial lung disease; LAM, lymphangioleiomyomatosis; MCTD, mixed connective tissue disorder; multiC, multicentre; PLCH, pulmonary Langerhans cell histiocytosis; PM, polymyositis; RA, rheumatoid arthritis; reg, registry; single, single centre; SSc, systemic sclerosis. Details on the study population, sample size and ILD diagnosis methods are summarised in online supplemental tables E1–E31.

Studies reporting incidence

Significant discrepancies were observed in reported ILD incidence across subgroups and individual conditions, mainly due to differences in the study setting. Depending on the study setting and type of data source used, some authors reported an incidence rate (per 100 000 person-years), while others reported incidence proportion. Table 2 lists IPF incidence by case classification and country, and figure 3 provides a list of studies reporting incidence of non-IPF ILDs.
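The distinction between an incidence rate and an incidence proportion, which limits comparability across the studies described above, can be made concrete with a short Python sketch. The function names and the example figures in the usage note are hypothetical.

```python
def incidence_rate_per_100k_py(new_cases, person_years):
    """Incidence rate: new cases per 100 000 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases * 100_000 / person_years


def incidence_proportion_percent(new_cases, at_risk_at_start):
    """Incidence proportion: percentage of an at-risk cohort who become cases."""
    if at_risk_at_start <= 0:
        raise ValueError("at_risk_at_start must be positive")
    return new_cases * 100 / at_risk_at_start
```

For instance, 30 new cases over 150 000 person-years is a rate of 20 per 100 000 person-years, while 30 new cases among 1000 people at risk is an incidence proportion of 3%. The first depends on follow-up time and the second does not, so the two cannot be pooled without further information.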

Table 2

Published estimates of IPF incidence, stratified by various case definitions

Figure 3

Studies reporting ILD incidence, grouped by ILD subgroups. ICD-9-CM, International Classification of Disease, Ninth Revision, Clinical Modification; ILD, interstitial lung disease; py, person-years; RA, rheumatoid arthritis; SSc, systemic sclerosis. † Narrow silicosis definition used: Medicare beneficiaries with any claim that included ICD-9-CM code 502, pneumoconiosis due to other silica or silicates, listed in any position during 1999–2014, with at least one inpatient, skilled nursing or home health agency claim, or at least two outpatient provider claims within 365 days of each other and cases with a chest X-ray or CT scan 30 days before or 30 days after a silicosis claim. Details on the study population, sample size and ILD diagnosis methods are summarised in online supplemental tables E1–E31.
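The narrow silicosis definition in the figure 3 footnote is a claims-based rule, and one possible reading of it is sketched below in Python. The footnote wording is ambiguous about how the imaging condition combines with the claim conditions; this sketch assumes imaging is required in addition to the facility or outpatient criterion. The `Claim` structure and the setting labels are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta

SILICOSIS_CODE = "502"  # ICD-9-CM, listed in any position on the claim
FACILITY_SETTINGS = {"inpatient", "skilled_nursing", "home_health"}


@dataclass(frozen=True)
class Claim:
    service_date: date
    setting: str   # one of FACILITY_SETTINGS or "outpatient" (assumed labels)
    codes: tuple   # all diagnosis codes on the claim


def meets_narrow_silicosis(claims, imaging_dates):
    """Apply one reading of the narrow silicosis definition to a beneficiary."""
    silicosis = sorted(
        (c for c in claims
         if SILICOSIS_CODE in c.codes and 1999 <= c.service_date.year <= 2014),
        key=lambda c: c.service_date,
    )
    facility = any(c.setting in FACILITY_SETTINGS for c in silicosis)
    outpatient = [c.service_date for c in silicosis if c.setting == "outpatient"]
    # If any two outpatient claims fall within 365 days, a consecutive pair does.
    paired = any((later - earlier).days <= 365
                 for earlier, later in zip(outpatient, outpatient[1:]))
    if not (facility or paired):
        return False
    window = timedelta(days=30)
    return any(abs(img - c.service_date) <= window
               for c in silicosis for img in imaging_dates)
```

Here `imaging_dates` is the list of dates on which the beneficiary had a chest X-ray or CT scan. A study applying this rule would also need to decide how to treat two outpatient claims on the same day, which this sketch counts as a valid pair.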


In this review, we synthesised the evidence for the incidence and prevalence of ILDs from studies published between 2015 and 2021. Considering the changing ILD nomenclature and the desire to reflect more current estimates, we decided to restrict the study period to the past 6 years. We made this conscious decision with the aim of limiting the heterogeneity across reported estimates. We evaluated 39 incidence and 78 prevalence estimates for individual ILD disorders that were distributed globally. We noted an increase in the number of studies investigating non-IPF ILDs, and more specifically autoimmune ILDs, in recent years. There was a 6-fold rise in autoimmune ILD studies in 2021 when compared with 2015 (18 vs 3 studies, respectively). This increase in non-IPF ILD studies may be related to the emergence of antifibrotic therapies for non-IPF fibrosing lung diseases.91–93 Interestingly, the publication trend for IPF has remained unchanged.

This review revealed considerable inconsistencies in the incidence and prevalence estimates of the main ILD subgroups. The reported prevalence of IPF ranged from 7 to 1650 per 100 000 persons,2 16 a more than 200-fold difference across case definitions, despite most studies reporting IPF prevalence in the general population. The incidence and prevalence estimates reported by Zhang et al16 were a notable outlier; this study was based on the USA veterans’ healthcare database which included mostly White patients aged over 70 years, the demographic in which IPF is most common. Aside from this study, the majority of studies reported a prevalence of IPF ranging from 7 to 42 per 100 000 persons across different case definitions.2 17

In contrast to prevalence, we found considerable inconsistencies in how the incidence of IPF is reported. An important factor is the lack of uniformity in reporting units. Half of the studies reported incidence using person-years, whereas others reported incidence per 100 000 persons. We were, therefore, unable to compare incidence estimates in a similar fashion to prevalence. It is also important to note that changes in diagnostic guidelines for IPF over the years may have made it more challenging to accurately estimate its burden and temporal trends.94–96

For non-IPF subgroups, such as autoimmune ILDs, there were wide variations in prevalence estimates between countries and within different healthcare settings in the same country. Overall, the variation in prevalence and incidence estimates was even greater for non-IPF ILDs than IPF. This can be attributed to several factors. First, in clinical practice, it is common for the clinical presentation and serological autoantibody profiles to result in overlap syndromes. Autoimmune conditions can coexist and patients with occupational ILDs may also have autoimmune conditions. Such fluidity of diagnoses at a clinical level reflects the challenges in estimating non-IPF ILDs. Second, the denominator more frequently differs for non-IPF ILDs, resulting in a lack of standardised reporting. For IPF, there are published validated algorithms to identify ‘true’ cases in the general population,18 24 97 whereas studies of non-IPF ILDs relied on disease registries or were conducted at single/multispecialist clinics.

The majority of the autoimmune-related ILD estimates were in RA and SSc ILD. When assessing SSc ILD prevalence, we observed a wide range (26.1% to 88.1%)37 44 in reported estimates, but when studies were dichotomised into single-centre studies and multicentre studies, it became clear that the highest variability was contributed by single-centre studies (SSc prevalence, 31.2%–88.1%).43–46 Owing to the smaller number of studies reporting incidence, we were unable to observe whether the same challenge existed.

The prevalence of silicosis ranged from 5.6%65 to 37%61 in workers exposed to silica. Occupational ILD studies were conducted at a factory, in a neighbourhood with proximity to industries, a registry or multicentre settings. Therefore, the findings apply only to certain populations, and this lack of generalisability contributed largely to the wide variability of the reported estimates. The geographical distribution of occupational ILD papers points to a dominance of exposure-related ILDs in low-income and middle-income countries in Asia and South America (42.8% were in Asia).

While historical diagnostic classification has been founded on underlying aetiology or clinical pathways, there is now a growing emphasis on disease behaviour.98 99 Attention has focused on a subgroup of ILD patients who go on to develop a PF phenotype. IPF is the archetypal PF ILD but other ILDs such as chronic hypersensitivity pneumonitis (HP) and SSc ILD can exhibit ‘IPF-like’ behaviour, including rapid decline in lung function and early mortality.100 The epidemiology of PF ILD is particularly challenging to examine as accepted guidelines on definition and diagnosis have yet to be published. The reported prevalence of PF ILDs (per 100 000 persons) was 19.4 in France and 57.8 in the USA.88 89 The future direction of research will likely focus on PF ILD as a phenotype which transcends previously adhered-to diagnostic labels and is associated with poorer outcomes and increased mortality.100 101

Among the 39 studies reporting ILD incidence (online supplemental figure E6), most studies were categorised as medium risk (n=25/39, 64.1%). Two studies were categorised as high-risk primarily because of lack of information on ILD diagnosis and poor quality of reporting estimates (ie, descriptive statistics were not reported, were incomplete or did not include proper measures of dispersion).

Similarly, there were 78 prevalence assessments (online supplemental figure E7) of which approximately 18% (n=14/78) were categorised as high risk, 64.1% (n=50/78) as medium risk and 18% (n=14/78) as low risk. Most studies assessed as high risk were studies reporting autoimmune ILDs, mainly because of the method of ILD diagnosis, a single-centre setting or a small sample size. Most of the studies reporting prevalence based on large healthcare datasets or disease registries were classified as low risk.

There are several strengths of this systematic review. We have provided an assessment of the incidence and prevalence of several ILD conditions globally and have grouped ILDs based on their aetiology to allow the appraisal of incidence and/or prevalence at a disease level with as much granularity as possible. This review underlines the need for standardisation of diagnostic classifications for non-IPF ILDs; the narrower estimates for IPF provide evidence that clear and consistent diagnostic guidelines are of great clinical utility. Guidelines have recently emerged for the diagnosis of HP102 103 which we envisage will further improve the epidemiological reporting of this important condition, although incorporation of guidelines into routine clinical practice and then into epidemiological estimates takes time. Cross-specialty guideline groups will undoubtedly improve standardisation of reporting for autoimmune-driven ILDs.

It is possible that genetic differences between individuals from different ethnic backgrounds may play a role in the global variability in incidence and prevalence. For example, the MUC5B promoter polymorphism (rs35705950) is the dominant risk factor for IPF104 and is also a key risk factor for other ILDs such as RA ILD.105 This gain of function polymorphism is frequent in those of European descent but almost completely absent in those of African ancestry.106 As more research is performed unravelling the complex interplay between genetics and environment in the development of ILD, it is likely that genetic variability will be found to play an important role in the global variability of ILD.

Despite the strengths, there are limitations to this systematic review. The certainty of the ILD case definition varied across studies. It was not always possible to be sure of how reliable the ascertainment method was. However, we attempted to reflect the differences in the ILD diagnostic methods in our risk of bias quality assessment. Along with the uncertainty in the diagnosis of ILD, there were different disease definitions used across studies. Therefore, owing to the high heterogeneity in how ILD was defined, we were unable to perform a meta-analysis. In this review, we have only included studies reporting ILD estimates in general populations, registries or populations with a specific disorder of interest. For single-centre studies reporting incidence and/or prevalence of autoimmune or exposure ILDs, the estimates were not generalisable and this has been reflected in the risk of bias quality assessment score. This review is limited to English publications only. However, due to the high volume of papers found within the study period, we are confident it has a minimal effect on the overall conclusion.107


This review highlights the lack of uniformity in the published estimates of incidence and prevalence of ILD conditions. In addition, there is a dissimilarity in disease definitions across the studies and geographical regions. Owing to these discrepancies, we were unable to derive estimates for the global incidence and prevalence of ILD and moreover unable to confirm whether there has been a ‘true’ increase in ILD incidence over time. Revisions to diagnostic criteria have augmented the challenges of estimating incidence and prevalence of individual ILD conditions and determining the drivers for temporal trends in incidence. Improving our estimates of the burden of fibrosing lung conditions is essential for future health service planning, a need that has been heightened by the development of new antifibrotic treatments. Guidelines have recently emerged for non-IPF ILDs, and we envisage this may improve the epidemiological reporting for future research. There is a fundamental need to standardise ILD diagnosis, disease definitions and reporting in order to provide the data which will drive the provision of a consistently high level of care for these patients across the globe.108

Ethics statements

Patient consent for publication




  • Twitter @DrPeter_George

  • Contributors RG, AM, PMG and JKQ developed the research question. RG, AM, PMG and JKQ developed the study protocol. RG developed the search strategy with input from AM and JKQ. RG screened the studies for inclusion, extracted the data from included studies and carried out quality assessment of the data. AK was the secondary reviewer for screening, data extraction and quality assessment. PMG supported with the understanding of various ILD diseases and their clinical pathways. All authors interpreted the review results. RG drafted the manuscript. All authors read, commented on and approved the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Map disclaimer The inclusion of any map (including the depiction of any boundaries therein), or of any geographic or locational reference, does not imply the expression of any opinion whatsoever on the part of BMJ concerning the legal status of any country, territory, jurisdiction or area or of its authorities. Any such expression remains solely that of the relevant source and is not endorsed by BMJ. Maps are provided without any warranty of any kind, either express or implied.

  • Competing interests RG is a current employee of Gilead Sciences, outside the submitted work. JKQ has received grants from The Health Foundation, MRC, GSK, Bayer, BI, British Lung Foundation, IQVIA, Chiesi AZ, Insmed and Asthma UK. JKQ has received personal fees for advisory board participation or speaking fees from GlaxoSmithKline, Boehringer Ingelheim, AstraZeneca, Bayer and Insmed. PMG has received grants from the MRC, Boehringer Ingelheim and Roche Pharmaceuticals and personal fees from Boehringer Ingelheim, Roche Pharmaceuticals, Teva, Cippla, AZ and Brainomix. AK and AM have nothing to disclose.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.