This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Evaluating the Potential of LLMs and ChatGPT on Medical Diagnosis and Treatment
29
Citations
4
Authors
2023
Year
Abstract
We evaluate the validity, accuracy, and usefulness of medical diagnoses of lung disease returned by ChatGPT from symptoms described by a human. Specifically, tuberculosis and its symptoms are selected as the test case, and our evaluation considers (i) the medical validity and accuracy of the returned diagnosis in terms of both context and references, (ii) its usefulness to both doctors and patients, and (iii) the economic value added to the healthcare system. We show that ChatGPT performs well in diagnosing tuberculosis, but its performance improves when supervised by a human medical expert. In the interest of reproducibility and comparability, we propose a novel general evaluation procedure for the medical domain, to be followed when interacting with Large Language Models. This procedure integrates the steps employed in our evaluation process and includes the review indices used to quantify the outcome.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations