Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Comparison of physician and large language model chatbot responses to online ear, nose, and throat inquiries
4
Zitationen
8
Autoren
2025
Jahr
Abstract
Large language models (LLMs) can potentially enhance the accessibility and quality of medical information. This study evaluates the reliability and quality of responses generated by ChatGPT-4, an LLM-driven chatbot, compared to those written by physicians, focusing on otorhinolaryngological advice in real-world, text-based workflows. Responses from a public social media forum were anonymized, and ChatGPT-4 generated corresponding replies. A panel of seven board-certified otorhinolaryngologists assessed both sets of responses using six criteria: overall quality, empathy, alignment with medical consensus, information accuracy, inquiry comprehension, and harm potential. Ordinal logistic regression analysis identified factors influencing response quality. ChatGPT-4 responses were preferred in 70.7% of cases and were significantly longer (median: 162 words) than physician responses (median: 67 words; P < .0001). The chatbot's responses received higher ratings across all criteria, with key predictors of this higher quality being greater empathy, stronger alignment with medical consensus, lower potential for harm, and fewer inaccuracies. ChatGPT-4 consistently outperformed physicians in generating responses that adhered to medical consensus, demonstrated accuracy, and conveyed empathy. These findings suggest that integrating AI tools into text-based healthcare consultations could help physicians better address complex, nuanced inquiries and provide high-quality, comprehensive medical advice.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.391 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.257 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.685 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.781 Zit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5.501 Zit.