This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
Performance of large language models in non-English medical ethics-related multiple choice questions: comparison of ChatGPT performance across versions and languages
Citations: 0
Authors: 3
Year: 2025
Abstract
ChatGPT demonstrated substantial improvements in medical ethics MCQ performance across versions, particularly in terms of consistency and accuracy. However, performance disparities between languages and reduced accuracy under masked answer conditions highlight ongoing limitations in non-English ethical reasoning and context recognition. These findings emphasize the need for further research on language-sensitive fine-tuning and the evaluation of LLMs in specialized ethical domains. The findings suggest that advanced LLMs may serve as valuable supplementary tools in medical education and clinical ethics training. At the same time, the observed language disparities call for context-sensitive adaptations to prevent inequities in practice.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,250 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,109 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,482 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,434 citations