This is an overview page with metadata for this scientific article. The full article is available from the publisher.
ChatGPT‐4 in Clinical Neurology: An Alzheimer’s Disease Information Quality Analysis
Citations: 2
Authors: 4
Year: 2024
Abstract
Background
The integration of Large Language Models (LLMs) like ChatGPT‐4 in clinical settings offers potential enhancements in medical practice, particularly in neurology and dementia care. There is rising public usage of ChatGPT‐4 for preliminary information gathering. This study aims to evaluate the effectiveness of ChatGPT‐4 in responding to neurology‐focused queries, with an emphasis on Alzheimer’s Disease (AD). It addresses the challenges of accuracy and reliability in artificial intelligence (AI)‐generated medical information, which are crucial for practical clinical applications.

Method
This investigation utilized ChatGPT‐4 to respond to six diverse neurology‐related questions covering symptomatology and caregiver guidance for AD. The responses were assessed using a context‐adapted DISCERN and AGREE II scoring framework, which are rating systems for evaluating the clarity and appropriateness of healthcare information and advice. Two blinded neurologists independently reviewed and scored the AI’s responses. Statistical analyses, including correlation, variance, and linear regression, were conducted to quantify the relationship between the AI’s adherence to clinical guidelines (AGREE II scores) and the quality of information provided (DISCERN scores).

Result
ChatGPT‐4’s responses achieved a moderate level of alignment with clinical guidelines, indicated by a total AGREE average score of 2.27/7. The general quality rating average was 5.25/7, reflecting moderate accuracy and relevance. The combined AGREE and rating average score was 2.51/7, with a total DISCERN average of 2.14/5. Statistical analysis revealed a moderate positive correlation (Pearson coefficient: 0.58) between AGREE and DISCERN scores. Variance analysis showed low variability in AGREE scores (0.0499) and higher variability in DISCERN scores (0.2200). Regression analysis indicated that AGREE scores moderately predicted DISCERN scores (R² = 0.334), but the relationship was not statistically significant (p > 0.05).

Conclusion
ChatGPT‐4 demonstrates potential in providing neurology‐specific information, particularly for AD, with moderate effectiveness. Healthcare professionals should employ AI‐generated information cautiously, treating it as a supplement to established clinical guidelines and professional judgment. It is essential to ensure that the public is well informed about the limitations and appropriate uses of AI as a tool for health information.
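The statistical relationship the abstract reports is internally consistent: with a single predictor, the R² of a simple linear regression equals the squared Pearson coefficient, and 0.58² ≈ 0.336 matches the reported R² of 0.334. A minimal sketch of that identity, using purely hypothetical AGREE/DISCERN score pairs (the study's per-question data are not given here):

```python
# Illustrative sketch, NOT the study's data: for one predictor,
# the regression R^2 equals the squared Pearson correlation r.
import statistics as st

# Hypothetical paired averages for six questions (assumed for illustration)
agree   = [2.1, 2.3, 2.2, 2.4, 2.2, 2.4]   # AGREE II averages (out of 7)
discern = [1.8, 2.3, 2.0, 2.5, 2.1, 2.2]   # DISCERN averages (out of 5)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = st.mean(x), st.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def r_squared(x, y):
    """R^2 of the least-squares fit y = b0 + b1*x."""
    mx, my = st.mean(x), st.mean(y)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))  # residual sum of squares
    sst = sum((b - my) ** 2 for b in y)                        # total sum of squares
    return 1 - sse / sst

r = pearson_r(agree, discern)
print(abs(r ** 2 - r_squared(agree, discern)) < 1e-9)  # True: R^2 == r^2 here
```

With the study's reported r = 0.58, this identity reproduces the reported R² ≈ 0.334 without access to the underlying scores.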
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,490 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,376 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,832 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,553 citations