This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Assessing the Accuracy of Large Language Models on European Guidelines for Cervical Cancer: An In Silico Benchmarking Study
Citations: 0
Authors: 15
Year: 2025
Abstract
All models demonstrated suboptimal accuracy in aligning with clinical guidelines. ChatGPT 4.0 was the most accurate and consistent, whereas DeepSeek R1 underperformed. Despite similar reliability across models, expert oversight remains essential to ensure safe clinical application and prevent misinformation.
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations
Authors
Institutions
- Centre National de la Recherche Scientifique (FR)
- Agostino Gemelli University Polyclinic (IT)
- Laboratoire des Sciences de l'Ingénieur, de l'Informatique et de l'Imagerie (FR)
- Institut de Recherche contre les Cancers de l’Appareil Digestif (FR)
- Institut de Chirurgie Guidée par l'Image (FR)
- Université de Strasbourg (FR)
- Saint Camillus International University of Health and Medical Sciences (IT)
- Università Cattolica del Sacro Cuore (IT)
- Charles University (CZ)
- General University Hospital in Prague (CZ)