This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Specific fine-tuned GPT-enhanced medical imaging diagnosis recommendations
Citations: 0
Authors: 8
Year: 2025
Abstract
Overutilization of medical imaging is a significant problem in healthcare, contributing to wasted resources and potentially causing harm to patients. Despite educational efforts and tools, adoption of appropriate imaging remains challenging. To address this, we aimed to train an AI model, termed the Appropriate Medical Imaging Recommendations Generative Pre-trained Transformer (AMIR-GPT), to provide precise recommendations for medical imaging, thereby advancing value-based healthcare. This prospective study used a dataset of 1036 paired questions and answers collected from 26 guidelines in the American College of Radiology Appropriateness Criteria (ACR AC). The dataset, covering common clinical scenarios, was divided into a training set (932 entries) and a test set (104 entries). The OpenAI text-davinci model, based on GPT-3, was fine-tuned over four iterations using the training set. The performance of AMIR-GPT was compared to GPT-4, GPT-3.5, and Gemini on the test set. Response similarity to the standard answers was scored from 1 to 5, with a weighted Cohen’s kappa used to measure inter-rater reliability between the model-generated responses and expert reviewers. Statistical significance was assessed using a chi-square test to compare categorical performance metrics across the models. AMIR-GPT achieved the highest perfect score rate (33.33%), outperforming GPT-4, Gemini, and GPT-3.5. In the high match category, GPT-3.5 led with 25%, while Gemini excelled in the medium match category at 37.5%. ANOVA confirmed significant differences among models (F = 6.49, P = 0.0004). Notable pairwise results included significant differences between AMIR-GPT and GPT-3.5 (P = 0.018) and between GPT-3.5 and Gemini (P = 0.000), indicating varied model performance. Fine-tuning GPT models for specific medical domains enhances their ability to provide accurate imaging recommendations.
However, further validation is needed to confirm the broader applicability of these findings in various clinical settings.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,102 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,468 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations