Study compares diagnostic accuracy of ChatGPT & radiologists in musculoskeletal imaging

The study involved 106 musculoskeletal radiology cases, including patient medical histories, images, and imaging findings.

New Delhi: In radiology, where specialised knowledge is essential to interpret diagnostic imaging for various diseases, generative AI models such as Chat Generative Pre-trained Transformer (ChatGPT) have recently shown potential as diagnostic tools.

However, their accuracy requires thorough evaluation for optimal future use.

Dr. Daisuke Horiuchi and Associate Professor Daiju Ueda from Osaka Metropolitan University’s Graduate School of Medicine spearheaded a research team to compare the diagnostic accuracy of ChatGPT with that of radiologists.

The study involved 106 musculoskeletal radiology cases, including patient medical histories, images, and imaging findings.

For the study, case information was input into two versions of the AI model, GPT-4 and GPT-4 with vision (GPT-4V), to generate diagnoses. The same cases were presented to a radiology resident and a board-certified radiologist, who were tasked with determining the diagnoses.

The results revealed that GPT-4 outperformed GPT-4V and matched the diagnostic accuracy of the radiology resident. However, ChatGPT’s diagnostic accuracy was lower than that of the board-certified radiologist.

Dr. Horiuchi commented on the findings, saying: “While the results of this study indicate that ChatGPT may be useful for diagnostic imaging, its accuracy cannot compare to a board-certified radiologist. Additionally, this study suggests that its performance as a diagnostic tool must be fully understood before it can be used.”

He also emphasised the rapid advancements in generative AI, noting the expectation that it could become an auxiliary tool in diagnostic imaging in the near future.

The study’s findings were published in the journal European Radiology. They highlight both the potential and the limitations of generative AI in medical diagnostics and underscore the need for further research before widespread clinical adoption, even as the technology advances rapidly.
