This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Abstract 16401: Optimizing ChatGPT to Detect VT Recurrence From Complex Medical Notes
Citations: 3 · Authors: 13 · Year: 2023
Abstract
Introduction: Large language models (LLMs), such as ChatGPT, have a remarkable ability to interpret natural language using text questions (prompts) applied to gigabytes of data on the web. However, ChatGPT performs less impressively on nuanced questions drawn from finite repositories of lengthy, unstructured clinical notes (Fig A).

Hypothesis: The ability of ChatGPT to identify sustained ventricular tachycardia (VT) or fibrillation after ablation from free-text medical notes is improved by optimizing the question and adding in-context sample notes with correct responses ("prompt engineering").

Methods: We curated a dataset of N = 125 patients with implantable defibrillators (32.0% female, LVEF 48.9±13.9%, age 61.7±14.0 years), split into development (N = 75) and testing (N = 50) sets of 307 and 337 notes, with 256.8±95.1 and 289.8±103 words per note, respectively. Notes were deidentified. Gold-standard labels for recurrent VT (Yes, No, Unknown) were provided by experts. We applied GPT-3.5 to the test set (N = 337 notes), using 1 of 3 prompts ("Does the patient have sustained VT or VF after ablation" or 2 others), systematically adding 1-5 "training" examples, and repeating experiments 10 times (51,561 inquiries).

Results: At baseline, GPT achieved an F1 score of 38.6%±19.4% (mean across 3 prompts; Fig B). Increasing the number of in-context examples progressively improved mean accuracy and reduced variance. The optimal result was the illustrated prompt plus 5 in-context examples, yielding an F1 score of 84.6%±6.4% (p<0.05).

Conclusions: With optimal prompt engineering, ChatGPT can accurately identify VT recurrence from small numbers of complex medical notes. Future studies should define the optimal context for different medical questions and domains. These findings pave the way for automated analysis of large medical repositories to broadly improve decision making.
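The "prompt engineering" workflow in the Methods — a fixed question plus 1-5 labeled in-context example notes, then the unlabeled note to classify — can be sketched as message construction for a chat-completion API. This is a minimal illustration only: the note texts, labels, and function name below are hypothetical, not the study's actual data or code.

```python
def build_fewshot_messages(question, examples, new_note):
    """Assemble a chat-style message list: ask the question about each
    in-context example note (paired with its gold-standard answer), then
    about the new, unlabeled note."""
    messages = [{"role": "system",
                 "content": "Answer only Yes, No, or Unknown."}]
    for note, label in examples:
        messages.append({"role": "user",
                         "content": f"{question}\n\nNote:\n{note}"})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user",
                     "content": f"{question}\n\nNote:\n{new_note}"})
    return messages

if __name__ == "__main__":
    question = "Does the patient have sustained VT or VF after ablation?"
    examples = [  # hypothetical in-context examples
        ("Device interrogation: no sustained arrhythmias since ablation.", "No"),
        ("ICD shock for sustained VT at 210 bpm, 3 months post-ablation.", "Yes"),
    ]
    msgs = build_fewshot_messages(question, examples, "Follow-up note text ...")
    # 1 system message + 2 user/assistant example pairs + 1 final user message
    print(len(msgs))  # → 6
```

The resulting message list would then be sent to the model (here GPT-3.5) via a chat-completion endpoint; adding more labeled pairs to `examples` corresponds to the study's 1-5 in-context examples.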