This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
Performance of large language models in preoperative and postoperative counselling for aesthetic facial procedures
Citations: 0
Authors: 4
Year: 2026
Abstract
Large language models (LLMs) are increasingly used in healthcare, but their role in aesthetic surgical procedures remains unexplored. These interventions present unique challenges, marked by high patient expectations, emotionally charged decision-making, and subtle yet impactful outcomes on self-perception and psychosocial health. This cross-sectional in silico study evaluated the performance of ChatGPT-4 (OpenAI, 2025), DeepSeek V3 (DeepSeek AI/High-Flyer, 2025), and Gemini 2.5 Pro Experimental (Google, 2025) in preoperative and postoperative counselling for aesthetic facial surgery. Twenty-six standardised patient-oriented questions were submitted, and the anonymised responses of the chatbots were independently assessed by two calibrated oral and maxillofacial surgeons across four domains: accuracy, empathy, readability (Flesch-Kincaid Reading Ease (FKRE) and Grade Level (FKGL)), and referencing reliability (including the identification of fabricated or non-verifiable citations, a phenomenon referred to as "hallucination" in LLM outputs). Statistical tests included Kruskal-Wallis, Mann-Whitney U with Bonferroni correction, Spearman correlation, and chi-squared. DeepSeek achieved the highest accuracy (4.77 (0.51), p = 0.0078) and readability (FKRE 2.92 (0.27), p < 0.00001), while Gemini outperformed in empathy (4.08 (0.89), p < 0.001). GPT-4 produced the most hallucinated citations (36%) compared with Gemini (14%) and DeepSeek (8.8%) (p < 0.00001). A negative correlation between empathy and readability (r = -0.34, p = 0.002) suggested a trade-off between affective tone and accessibility. Overall, LLMs generated satisfactory counselling responses with distinct performance profiles, supporting their potential in patient-centred communication while reinforcing the need for human oversight.
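The abstract's analysis pipeline (Kruskal-Wallis omnibus test, pairwise Mann-Whitney U with Bonferroni correction, and Spearman correlation) can be sketched with SciPy. This is a minimal illustration only: the rating arrays below are fabricated placeholders, not the study's data, and variable names are assumptions.

```python
# Hedged sketch of the abstract's statistical workflow using SciPy.
# All ratings are randomly generated placeholders, NOT study data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical accuracy ratings (1-5 scale) for 26 questions per model.
ratings = {
    "GPT-4": rng.integers(3, 6, size=26),
    "DeepSeek": rng.integers(4, 6, size=26),
    "Gemini": rng.integers(3, 6, size=26),
}

# Omnibus comparison across the three models (Kruskal-Wallis H test).
h_stat, p_omnibus = stats.kruskal(*ratings.values())

# Pairwise Mann-Whitney U tests with Bonferroni-adjusted alpha.
pairs = [("GPT-4", "DeepSeek"), ("GPT-4", "Gemini"), ("DeepSeek", "Gemini")]
alpha_adjusted = 0.05 / len(pairs)  # 3 comparisons
for a, b in pairs:
    u_stat, p_pair = stats.mannwhitneyu(ratings[a], ratings[b])
    print(f"{a} vs {b}: p={p_pair:.4f}, significant={p_pair < alpha_adjusted}")

# Spearman correlation between two score domains (e.g. empathy vs readability).
rho, p_rho = stats.spearmanr(rng.random(26), rng.random(26))
```

The Bonferroni division of alpha by the number of pairwise tests is the standard correction the abstract names; a rank-based test suite is appropriate here because Likert-style ratings are ordinal.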
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations