This is an overview page with metadata for this scientific article. The full text is available from the publisher.
Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study
2024 · 41 citations · 9 authors · Journal of Medical Internet Research · Open Access
Abstract
By evaluating LLMs in generating responses to patients' questions about laboratory test results, we found that, compared with the 4 other LLMs and human answers from a Q&A website, GPT-4's responses were more accurate, helpful, relevant, and safer. There were cases in which GPT-4's responses were inaccurate and not individualized. We identified a number of ways to improve the quality of LLM responses, including prompt engineering, prompt augmentation, retrieval-augmented generation, and response evaluation.
Topics
Topic Modeling · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare