Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

A Comparative Analysis of the Accuracy and Readability of Popular Artificial Intelligence-Chat Bots for Inguinal Hernia Management

2025·0 Zitationen·The American Surgeon

Volltext beim Verlag öffnen

Zitationen

Autoren

2025

Jahr

Abstract

BackgroundArtificial intelligence (AI), particularly large language models (LLMs), has gained attention for its clinical applications. While LLMs have shown utility in various medical fields, their performance in inguinal hernia repair (IHR) remains understudied. This study seeks to evaluate the accuracy and readability of LLM-generated responses to IHR-related questions, as well as their performance across distinct clinical categories.MethodsThirty questions were developed based on clinical guidelines for IHR and categorized into four subgroups: diagnosis, perioperative care, surgical management, and other. Questions were entered into Microsoft Copilot®, Google Gemini®, and OpenAI ChatGPT-4®. Responses were anonymized and evaluated by six fellowship-trained, minimally invasive surgeons using a validated 5-point Likert scale. Readability was assessed with six validated formulae.ResultsGPT-4 and Gemini outperformed Copilot in overall mean scores for response accuracy (Copilot: 3.75 ± 0.99, Gemini: 4.35 ± 0.82, and GPT-4: 4.30 ± 0.89; P < 0.001). Subgroup analysis revealed significantly higher scores for Gemini and GPT-4 in perioperative care (P = 0.025) and surgical management (P < 0.001). Readability scores were comparable across models, with all responses at college to college-graduate reading levels.DiscussionThis study highlights the variability in LLM performance, with GPT-4 and Gemini producing higher-quality responses than Copilot for IHR-related questions. However, the consistently high reading level of responses may limit accessibility for patients. These findings underscore the potential of LLMs to serve as valuable adjunct tools in surgical practice, with ongoing advancements expected to further enhance their accuracy, readability, and applicability.

Autoren

Institutionen

Themen

Artificial Intelligence in Healthcare and EducationSurgical Simulation and TrainingMedical Imaging and Analysis

Volltext beim Verlag öffnen

A Comparative Analysis of the Accuracy and Readability of Popular Artificial Intelligence-Chat Bots for Inguinal Hernia Management

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen