Original Contribution
Inter-Rater, Intra-Rater, and Inter-Machine Reliability of Quantitative Ultrasound Measurements of the Patellar Tendon
Introduction
The use of musculoskeletal ultrasound for evaluating soft tissue structures is increasing rapidly in both research and clinical settings. Its advantages include high axial resolution, short examination time, real-time image capture, lack of ionizing radiation, wide availability, and relatively low cost. A wide range of ultrasound machines is available to researchers and clinicians, and as sonographic standards are established for the measurement of anatomical structures, it is critical that these measurements be comparable from one machine to another. In addition, ultrasound is frequently described as an operator-dependent imaging modality (Wakefield et al. 2005); therefore, ensuring the repeatability of measurements between sonographers is of high importance. Few studies have evaluated the inter-rater and intra-rater reliability of quantitative ultrasound-based measurements of the musculoskeletal system, and no studies have evaluated inter-machine reliability.
Issues of reliability and repeatability are not limited to musculoskeletal ultrasound; the subject is of broad clinical significance in many applications of diagnostic ultrasound. Inter- and intra-rater reliability have been evaluated in other areas of ultrasound, in both research and clinical settings. In the research setting, with the use of high-frequency transducers (40 MHz) and a strict imaging protocol, an extremely high level of inter-rater and intra-rater reliability is possible. For example, when measuring murine colon wall thickness, mean inter-rater and intra-rater differences of 0.03 and 0.06 mm, respectively, have recently been reported (Abdelrahman et al. 2012). In the clinical setting, protocols may be less strict, imaging environments are often less controlled, and available transducers tend to be lower in frequency, especially for deeper anatomic structures. These factors can preclude such negligible variation. A systematic review of reliability in measurements of abdominal aortic diameter revealed substantial variability between studies, with several studies showing inter-rater reliability within the clinically acceptable level of 5 mm and others showing reliability outside this range (Beales et al. 2011). Uncertainty therefore remains about how far inter-rater and intra-rater variability can be minimized, especially in the clinical arena.
The present study was designed to address these issues in the setting of musculoskeletal ultrasound, specifically to evaluate the reliability of a validated method of measuring patellar tendon dimensions between two sonographers with different levels of experience, and between two different ultrasound machines. The patellar tendon was chosen because of its clinical importance in tendinopathy, patella infera and patella alta, as well as its relatively straightforward anatomic course.
Study sample
A convenience sample of 16 healthy subjects without a prior history of knee pain participated in the study. The subjects were medical residents at a university hospital and underwent ultrasound scanning of their dominant knee as part of an educational activity. Seven men and nine women, 25–36 years old, were included in the study. One subject was excluded when significant patellar tendinopathy was discovered during the scanning session. The two sonographers were medical residents with minimal
Results
The inter-rater, intra-rater, and inter-machine reliability values are shown in Table 1. Inter-rater reliability in measuring both tendon cross-sectional area (CSA) and tendon length was excellent, with intraclass correlation coefficients (ICC) between 0.90 and 0.96. Intra-rater reliability for tendon CSA was also generally excellent, with ICC between 0.87 and 0.96. Intra-rater reliability could not be calculated for tendon length measurements, because only one length measurement per tendon was obtained by each investigator. Inter-machine reliability was
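For readers unfamiliar with the ICC, the following is a minimal sketch of how such a coefficient can be computed from a subjects-by-raters table. It assumes the ICC(2,1) form of Shrout and Fleiss (two-way random effects, absolute agreement, single measurement); the snippet above does not state which ICC form the authors used, so this choice is an assumption. The function name `icc_2_1` and the CSA values in the usage example are hypothetical and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject, one column per rater.
    The ICC form is an assumption; the source does not specify it.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters (or machines)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_err = ss_err / ((n - 1) * (k - 1))     # residual mean square

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical tendon CSA values (cm^2): one row per subject,
    # columns are rater A and rater B. Illustrative only.
    csa = [
        [0.92, 0.95],
        [1.10, 1.08],
        [0.85, 0.88],
        [1.21, 1.18],
        [1.02, 1.05],
    ]
    print(f"ICC(2,1) = {icc_2_1(csa):.3f}")
```

Because ICC(2,1) measures absolute agreement, a systematic offset between raters lowers the coefficient even when the raters rank subjects identically. The same function could be applied to inter-machine comparisons by treating each machine as a column.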
Discussion
The operator dependence of ultrasound as an imaging modality has been highlighted as potentially problematic for both qualitative and quantitative anatomic measurements (Wakefield et al. 2005). Two discrete tasks are required to obtain ultrasound measurements, both of which require an operator's input: physically obtaining the ultrasound images, and measuring or grading the structure of interest on the acquired images. This procedure is notably different from other modern imaging modalities
Conclusions
An experienced and a novice sonographer attained high levels of inter-rater reliability when measuring the patellar tendon using strict scanning protocols. Inter-machine and intra-rater reliability were similarly excellent. To our knowledge, this study is the first to report inter-machine reliability in the setting of quantitative musculoskeletal ultrasound tendon measurements.
Acknowledgments
The authors thank Michael Z. Levy for his invaluable statistical support.
References (21)
- et al. High-frequency ultrasound for in vivo measurement of colon wall thickness in mice. Ultrasound Med Biol (2012)
- et al. A novel sonographic method of measuring patellar tendon length. Ultrasound Med Biol (2012)
- et al. Ultrasound measurement of transversus abdominis during loaded, functional tasks in asymptomatic individuals: Rater reliability. PM R (2011)
- et al. Sonographic evaluation of the size of Achilles tendon: The effect of exercise and dominance of the ankle. Ultrasound Med Biol (2003)
- et al. Reproducibility of ultrasound measurement of the abdominal aorta. Br J Surg (2011)
- et al. The histology of tendon attachments to bone in man. J Anat (1986)
- et al. Measuring agreement in method comparison studies. Stat Methods Med Res (1999)
- et al. [Intra- and interrater variability of sonographic investigations of patella and Achilles tendons]. Sportverletz Sportschaden (2012)
- et al. Combined evaluation of influence of sonographer and machine type on the reliability of power Doppler ultrasonography for detecting, scoring and scanning synovitis in rheumatoid arthritis patients: results of an inter-machine reliability exercise. Ann Rheum Dis (2008)
- et al. The OMERACT ultrasound task force – Advances and priorities. J Rheumatol (2009)
Cited by (52)
B-Mode Ultrasonography Is a Reliable and Valid Alternative to Magnetic Resonance Imaging for Measuring Patellar Tendon Cross-Sectional Area
2023, Ultrasound in Medicine and Biology
Citation Excerpt: In addition, US is an attractive alternative to assess tendon properties because of its affordability, time efficiency, portability and non-invasive nature. Despite the widespread use of US in musculoskeletal research, the reliability of US tendon measures is debated within the literature (Gellhorn and Carlson 2013; McAuliffe et al. 2017). For example, US measures of PT CSA have been reported to be reliable when measured on multiple days (Reeves and Narici 2003), by multiple operators with different experience, using multiple machines (Gellhorn and Carlson 2013).
Ultrasonographic assessment of patellar tendon thickness at 16 clinically relevant measurement sites – A study of intra- and interrater reliability
2019, Journal of Bodywork and Movement Therapies
Citation Excerpt: Precision for intrarater measurements varied from 0.04 cm to 0.13 cm (13.3%–38.7%) while ranging from 0.06 cm to 0.15 cm (19.1%–42.5%) for interrater measurements. Previous intrarater- and interrater USI studies on muscle- and tendon thickness reveal a cumulative ICC range from 0.64 to 0.97 (Bentman et al., 2010; Cheng et al., 2012; Costa et al., 2009; Craig et al., 2008; Gellhorn and Carlson, 2013; Koppenhaver et al., 2009; Liang et al., 2007; O'Sullivan et al., 2007; Rathleff et al., 2011; Skou and Aalkjaer, 2013; Wallwork et al., 2007) and from 0.40 to 0.97, respectively (Bentman et al., 2010; Cheng et al., 2012; Gellhorn and Carlson, 2013; O'Sullivan et al., 2007; Rathleff et al., 2011; Skou and Aalkjaer, 2013; Wallwork et al., 2007). Previous results on measurement precision (LOA-%) reveal a cumulative range from 1.8% to 53% for intrarater (Bentman et al., 2010; Bjordal et al., 2003; Costa et al., 2009; Koppenhaver et al., 2009; O'Connor et al., 2004; O'Sullivan et al., 2007; Rathleff et al., 2011; Skou and Aalkjaer, 2013; Springer et al., 2006; Wallwork et al., 2007; Ying et al., 2003) and 15.8%–49% for interrater (Bentman et al., 2010; O'Sullivan et al., 2007; Rathleff et al., 2011; Skou and Aalkjaer, 2013; Wallwork et al., 2007; Ying et al., 2003) for USI-derived measures of muscle- and tendon thickness.
Early anterior knee pain in male adolescent basketball players is related to body height and abnormal knee morphology
2018, Physical Therapy in Sport
Does the ultrasound imaging predict lower limb tendinopathy in athletes: a systematic review
2023, BMC Medical Imaging