This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Comparing and Analyzing the Accuracy, Comprehensiveness, and Clarity of AI-Based Microsurgical Patient-Facing Information Against Professional Organizations
Citations: 1 · Authors: 10 · Year: 2024
Abstract
Background: With the growing relevance of AI-based patient-facing information, this study compared the accuracy, comprehensiveness, clarity, and readability of microsurgery-specific online information provided by a professional organization, the American Society of Reconstructive Microsurgery (ASRM), with information generated by ChatGPT, an AI-based language model.

Methods: Plastic and reconstructive surgeons assessed responses to ten microsurgery-related medical questions, randomly assigned as either ASRM- or ChatGPT-generated, rating them for accuracy, comprehensiveness, and clarity. The surgeons were also asked to determine which source provided the highest-quality patient-facing information for microsurgery. Additionally, a group of 35 individuals with no medical background (ages 17-82, mean age 44) blindly compared the materials and expressed their preferences. Readability scores were calculated using seven readability formulas: Flesch-Kincaid Grade Level, Flesch-Kincaid Readability Ease, Gunning Fog Index, Simple Measure of Gobbledygook (SMOG) Index, Coleman-Liau Index, Linsear Write Formula (LWF), and Automated Readability Index. Statistical comparison of the microsurgery-specific online sources was conducted using paired t-tests.

Results: The results indicated statistically significant differences in favor of ChatGPT for accuracy (p<0.001), comprehensiveness (p<0.001), and clarity (p<0.05) (Figure 1A). Surgeons chose ChatGPT as the source providing the highest-quality microsurgical patient-facing information 70.37% of the time when blinded to the sources (Figure 1B). Similarly, non-medical individuals representing the patient population preferred the AI-generated microsurgical materials 65.24% of the time (Figure 1C). Readability scores for both ChatGPT and ASRM materials exceeded recommended levels for patient proficiency according to all seven readability formulas, with the AI-generated material evaluated as more complex.
Conclusion: The findings of this study indicate that surgeons perceived AI-generated patient-facing materials as more accurate, comprehensive, and clear than the online materials provided by the ASRM, and both surgeons and non-medical individuals consistently preferred the AI-generated material overall. However, the readability analysis suggests that both ChatGPT and ASRM materials surpassed recommended reading levels across all seven readability formulas evaluated.
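To illustrate the kind of metric used in the readability analysis above, the Flesch-Kincaid Grade Level is computed from word, sentence, and syllable counts. The sketch below is a minimal, self-contained Python implementation using a naive vowel-group syllable heuristic (an assumption for illustration; the study's actual tooling is not specified, and production readability calculators use dictionaries or more robust tokenizers):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels (y included);
    # every word is counted as having at least one syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    The result approximates the US school grade needed to read the text.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

# Short, monosyllabic sentences score at or below early grade levels,
# which is why patient-facing guidelines recommend scores around 6-8.
score = flesch_kincaid_grade("The cat sat on the mat.")
```

Health-literacy guidance commonly recommends patient materials at roughly a sixth-to-eighth-grade level; the study's finding is that both sources scored above such thresholds.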
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations