Please use this identifier to cite or link to this item: doi:10.22028/D291-43314
Title: Artificial intelligence in commercial fracture detection products: a systematic review and meta-analysis of diagnostic test accuracy
Author(s): Husarek, Julius
Hess, Silvan
Razaeian, Sam
Ruder, Thomas D.
Sehmisch, Stephan
Müller, Martin
Liodakis, Emmanouil
Language: English
Journal title: Scientific Reports
Volume: 14
Issue: 1
Publisher/Platform: Springer Nature
Year of Publication: 2024
Free key words: Bone imaging
Fracture repair
DDC notations: 610 Medicine and health
Publication type: Journal Article
Abstract: Conventional radiography (CR) is primarily utilized for fracture diagnosis. Artificial intelligence (AI) for CR is a rapidly growing field aimed at enhancing efficiency and increasing diagnostic accuracy. However, the diagnostic performance of commercially available AI fracture detection solutions (CAAI-FDS) for CR in various anatomical regions, their synergy with human assessment, as well as the influence of industry funding on reported accuracy are unknown. Peer-reviewed diagnostic test accuracy (DTA) studies were identified through a systematic review on PubMed and Embase. Diagnostic performance measures were extracted for subgroups such as product, type of rater (stand-alone AI, human unaided, human aided), funding, and anatomical region. Pooled measures were obtained with a bivariate random effects model. The impact of rater was evaluated with comparative meta-analysis. Seventeen DTA studies of seven CAAI-FDS analyzing 38,978 x-rays with 8,150 fractures were included. Stand-alone AI studies (n=15) evaluated five CAAI-FDS; four with good sensitivities (>90%) and moderate specificities (80–90%) and one with very poor sensitivity (<60%) and excellent specificity (>95%). Pooled sensitivities were good to excellent, and specificities were moderate to good in all anatomical regions (n=7) apart from ribs (n=4; poor sensitivity / moderate specificity) and spine (n=4; excellent sensitivity / poor specificity). Funded studies (n=4) had higher sensitivity (+5%) and lower specificity (-4%) than non-funded studies (n=11). Sensitivity did not differ significantly between stand-alone AI and human AI-aided ratings (p=0.316), but specificity was significantly higher in the latter group (p<0.001). Sensitivity was significantly lower in human unaided ratings than in both human AI-aided and stand-alone AI ratings (both p≤0.001); specificity was higher in human unaided ratings than in stand-alone AI ratings (p<0.001) and showed no significant difference from human AI-aided ratings (p=0.316). The study demonstrates good diagnostic accuracy across most CAAI-FDS and anatomical regions, with the highest performance achieved when used in conjunction with human assessment. Diagnostic accuracy appears lower for spine and rib fractures. The impact of industry funding on reported performance is small.
DOI of the first publication: 10.1038/s41598-024-73058-8
URL of the first publication: https://www.nature.com/articles/s41598-024-73058-8
Link to this record: urn:nbn:de:bsz:291--ds-433140
hdl:20.500.11880/38837
http://dx.doi.org/10.22028/D291-43314
ISSN: 2045-2322
Date of registration: 29-Oct-2024
Description of the related object: Supplementary Information
Related object: https://static-content.springer.com/esm/art%3A10.1038%2Fs41598-024-73058-8/MediaObjects/41598_2024_73058_MOESM1_ESM.docx
https://static-content.springer.com/esm/art%3A10.1038%2Fs41598-024-73058-8/MediaObjects/41598_2024_73058_MOESM2_ESM.docx
Faculty: M - Faculty of Medicine
Department: M - Surgery
Professorship: M - Not assigned to a professorship
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:
File: s41598-024-73058-8.pdf
Size: 4.21 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.