Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Recovering Clinical Detail in AI-Generated Responses for Low Back Pain Through Prompt Design
0
Zitationen
5
Autoren
2026
Jahr
Abstract
Abstract Introduction Large language models are increasingly being used in healthcare. In interventional pain medicine, clinical reasoning is essential for procedural planning. Prior studies show that simplified prompts reduce clinical detail in AI-generated responses. It remains unclear whether this reflects knowledge loss or simply prompt-driven suppression of information. Methods We performed a controlled comparative study using 15 standardized low back pain questions representing common interventional pain questions. Each question was submitted to ChatGPT under three conditions, professional-level prompt (DP), fourth-grade reading-level prompt (D4), and clinician-directed rewriting of the D4 response to a medical level (U4→MD). No follow-up prompting was allowed. Three physicians independently rated responses for accuracy using a 0–2 ordinal scale. Clinical completeness was determined by consensus. Word count and Flesch–Kincaid Grade Level (FKGL) were also measured. Paired t-tests compared conditions. Results Accuracy was highest with professional prompting (1.76). Accuracy declined with the fourth-grade prompt (1.33; p = 0.00086). When simplified responses were rewritten for clinicians, accuracy returned to baseline (1.76; p ≈ 1.00 vs DP). Clinical completeness followed the same pattern showing DP 80.0%, D4 6.7%, U4→MD 73.3%. Fourth-grade responses were shorter and less complex. Upscaled responses were more complex and similar in length to professional responses. Inter-rater reliability was low (Fleiss’ κ = 0.17), but trends were consistent across conditions. Conclusions Reduced clinical detail under simplified prompts appears to reflect constrained output rather than loss of knowledge. Clinician-directed reframing restores omitted content. LLM performance in interventional pain depends strongly on prompt design and intended audience.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.626 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.532 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8.046 Zit.
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6.843 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.781 Zit.