OpenAlex · Updated hourly · Last updated: 16.03.2026, 10:34

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Large language models in solving clinical dilemmas – advantages and drawbacks

2024 · 0 citations · 10 authors

Abstract

Background: Large language models (LLMs) have shown potential to assist clinical decisions, but concerns about their underlying mechanism (the ‘black box phenomenon’) have provoked unease.

Aims: To explore the LLM black box in the context of complex decision making in paediatric respiratory medicine (PRM) by comparison against trainee doctors (TDs), qualitative assessment of responses and a deep-dive into their sources.

Methods: Six complex PRM scenarios were posed to 10 TDs and 3 LLMs. Six PRM experts provided detailed comments on the responses, which were analysed qualitatively using DisplayR™. Screen recordings of the TDs’ searches and the sources provided by the LLMs were analysed for further insight.

Results: Word-cloud analysis shows features of LLM and TD responses (Table). ChatGPT™ provided useful responses but did not quote sources. TDs sourced responses from patient-facing websites, abstracts and review articles (Figure), while Bard™ and Bing™ quoted journals or evidence-based reviews.

Table T1
                  ChatGPT    Bard        Bing      TDs
Structure         Good       Good        Adequate  Poor
Omissions         No         Occasional  Yes       Variable
New Advances      Pre-2021   Yes         Yes       Variable
Incorrect         No         No          No        No
Sentiment score   0.85       −0.15       −1.0      −0.34

Figure F1 (image not available)

Conclusion: LLMs (particularly ChatGPT and Bard) already surpass TDs in quality of responses. We demonstrate the utility of LLMs for non-expert clinicians faced with complex medical scenarios. We explore the drawbacks of the individual LLMs while noting that this is a rapidly changing area.

Similar works