Discussion
Among pneumonia cases with concurrent conventional and digital auscultation, sequentially collected conventional auscultation classifications and digitally recorded, remotely classified auscultation classifications had fair-to-moderate concordance for crackles and wheeze. Conventional and digital auscultation may result in different classification patterns, with a higher proportion of crackles on conventional auscultation and a higher proportion of wheeze on digital auscultation. In an expanded sample of pneumonia cases with both concurrent and non-concurrent paired conventional and digital auscultation, patient-level characteristics did not predict concordance. Presence of crackles was generally predictive of greater clinical severity among pneumonia cases, and wheeze was associated with decreased clinical severity.
Although chest auscultation has been an established and widely used diagnostic tool for centuries, its accuracy and reliability for pneumonia diagnosis have been questioned, even when using near-simultaneous auscultation with identical equipment. Foundationally, there is no readily available gold standard to assess auscultation accuracy. There are uncertainties regarding variation in lung sounds between breaths of different volume, temporal changes between breaths, intra-provider variability over time and inter-provider variability at the same time point with the same patient.21 These uncertainties may be exacerbated in the examination of young children. In a study of Norwegian adults, kappa agreement between providers was κ=0.43 for inspiratory wheezes, κ=0.56 for expiratory wheezes, κ=0.46 for inspiratory crackles and κ=0.20 for expiratory crackles.22 In a prospective study of 102 infants, Elphick et al23 reported κ=0.07 for wheeze and κ=0.36 for crackles between two experienced clinicians. Melbye et al24 found lower agreement for paediatric recordings than for adult recordings.
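For readers less familiar with the kappa statistic quoted above, the following is a minimal sketch of how Cohen's kappa is computed for two sets of binary classifications of the same patients. It is illustrative only and is not the analysis code used in this study; the function name `cohen_kappa` and the ten paired ratings are hypothetical and were chosen purely to make the arithmetic easy to follow.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of
    agreement and p_e is the agreement expected by chance given each rater's
    marginal category frequencies.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: fraction of patients given the same label by both.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over categories of the product of the marginals.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical data (NOT study data): 1 = wheeze present, 0 = wheeze absent,
# for the same ten children classified by two methods.
conventional = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
digital = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

kappa = cohen_kappa(conventional, digital)
print(f"kappa = {kappa:.2f}")
```

In this toy example the two methods agree on 8 of 10 children (observed agreement 0.80), while chance agreement from the marginals is 0.52, giving κ≈0.58. Under the commonly used Landis and Koch bands (assumed here for illustration), a value in the range 0.41 to 0.60 would be described as moderate agreement.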
Nearly all studies of agreement have been conducted in controlled environments in high-income country settings, where the clinical environment is typically quieter than in many low-income and middle-income country settings. For example, in a comparison of conventional and digital auscultation, Kevat et al25 assessed intra-listener (within one provider) concordance among children in a tertiary paediatric facility in Melbourne, Australia, and found moderate concordance for wheeze (κ=0.44 and 0.55) and near-perfect concordance for crackles. Digital auscultatory recording and remote classification present challenges: the listener cannot visually observe the patient, including the inspiratory and expiratory phases and the overall clinical picture, and recordings may contain external noise, especially in the busy low-income and middle-income settings where PERCH was conducted. Despite these challenges, the concordance levels in our study demonstrate that digitally recorded and remotely classified lung auscultation can achieve results similar to inter-provider concordance using identical equipment in ideal settings.
Patterns of auscultation classifications were different between conventional and digital auscultation, with digital auscultation classifications having a greater proportion of wheeze and a lower proportion of crackles. Acute bronchiolitis is often caused by viruses and is associated with wheeze.26 In our likely acute viral infection group, wheeze-only classifications were significantly more common using digital auscultation (online supplemental table S2). Kevat et al25 reported better sensitivity for detecting wheeze using digital stethoscopes compared with conventional stethoscopes. Conventional bell and diaphragm stethoscopes may attenuate higher frequency sounds such as wheeze, whereas digital stethoscopes can capture sounds across the full range of audible sound frequencies.
Sensitive detection of wheeze may be an informative diagnostic feature. Wheeze on digital auscultation was associated with both lower mortality (compared with crackles) and lower odds of having very severe pneumonia (compared with other sound classifications). Children with neither crackles nor wheeze may be a mix of children without severe lung involvement and very severe cases whose low lung function and volume leave them unable to generate crackles or wheeze sounds. We previously reported that, among children with severe pneumonia but no WHO danger signs, wheeze on digital auscultation was associated with lower odds of radiographic pneumonia compared with children with neither crackles nor wheeze.27 Future research may explore whether this common but less-severe case group may benefit if digital auscultation adds differential diagnostic capacity with regard to severity or aetiology to help guide appropriate triage and antibiotic prescribing.27
Crackles were detected less frequently on digital auscultation than on conventional auscultation in our study. Crackles were associated with abnormal chest radiography using both digital and conventional auscultation, and were found more frequently in children with high CRP and likely pneumococcal pneumonia. Decreased sensitivity for crackles on digital auscultation may be caused by difficulties differentiating artefacts such as stethoscope movement from true lung sounds, especially from a remote recording. However, there were consistently high rates of crackles in all groups for conventional auscultation, including among children likely to have an acute viral infection, in whom crackles may be expected less frequently (online supplemental table S2), suggesting the potential for false positives on conventional auscultation. Using digital auscultation, crackles were most frequent in the group most often associated with crackles (pneumococcal pneumonia), less common in likely acute viral infection groups, and rare among controls. Although these patterns may suggest that digital auscultation results in fewer false positives for crackles, without a gold standard measurement it cannot be ruled out that digital auscultation is less sensitive for crackles. Nonetheless, presence of crackles-only on digital auscultation may help identify children with higher risk of severe disease and mortality.
There was heterogeneity between the sites in terms of patient and epidemiological characteristics, and with regard to provider level and established training practices on conventional auscultation within and between sites. However, providers conducting conventional auscultation were generally experienced doctors, clinical officers or nurses who regularly conducted clinical assessments for children with pneumonia at each site. Further, there were no significant demographic or clinical predictors of concordance other than The Gambia site being associated with better concordance. The consistency of classifications across several sites with varied severity characteristics suggests that findings may generalise across a wide range of settings.
This evaluation had several limitations. There was no gold standard when comparing conventional and digital auscultation classifications, so there are inferential limits when comparing differences in findings. We are unable to fully evaluate the contribution of multiple sources of variation, including equipment, timing and inter-rater differences. As previously reported, PERCH conducted clinical standardisation training sessions and assessments before and during the study.28 The course included a brief conventional auscultation training which may have reduced inter-rater differences and provided greater consistency in auscultatory concordance over time and between sites (online supplemental table S5). While the digital auscultation listening panel received a different auscultatory training as part of the listening panel standardisation process, concordance between digital and conventional auscultation may have been improved in general by participation in auscultation training sessions. There was often a time difference between conventional and digital auscultation. While we had an exact time for the digital auscultation recording, our best estimate of the time of conventional auscultation was the start of the clinical assessment, which could take over an hour. To include all near-simultaneous conventional auscultations and digital recordings, we allowed for a window of 2 hours for concordance evaluations. Longer durations between conventional and digital auscultation were primarily due to availability of staff trained on the digital auscultation process. Nonetheless, concordance was similar when comparing recordings within 2 hours to those recorded within 24 hours (online supplemental table S3). A digital recording review panel may not be available in real-world settings. However, concordance using classifications from a single initial reviewer on the digital auscultation panel was similar to concordance using the panel (online supplemental table S6), so having one remote listener is a realistic option for telemedicine. Alternatively, algorithms may be developed and integrated into digital auscultation systems that provide point-of-care diagnostic information without the need for clinician interpretation. Automated systems could be developed to help identify children at higher risk of severe disease (crackles-only), or conversely, children with wheeze who may benefit from supportive care without antimicrobial therapy.
Conventional and digital auscultation have moderate concordance and are clinically informative; both demonstrate an association between wheeze and decreased clinical severity. Digital stethoscopes may offer value in research, where inter-provider variability can be reduced, and in telemedicine, particularly in low-resource settings where the burden of disease is greatest and where providers trained in auscultation may not be available. As viral disease contributes increasingly to paediatric pneumonia, further studies may inform how detection of wheeze on digital auscultation can contribute to case management and offer opportunities for reducing unnecessary antimicrobial use.