This is an overview page with metadata about this scientific article. The full article is available from the publisher.
Assessing DeepSeek-R1 for Clinical Decision Support in Multidisciplinary Laboratory Medicine
2
Citations
3
Authors
2025
Year
Abstract
Background: Recent advancements in artificial intelligence (AI), particularly with large language models (LLMs), are transforming healthcare by enhancing diagnostic decision-making and clinical workflows. The application of LLMs such as DeepSeek-R1 in clinical laboratory medicine shows potential for improving diagnostic accuracy, supporting decision-making, and optimizing patient care. Objective: This study evaluates the performance of DeepSeek-R1 in analyzing clinical laboratory cases and assisting with medical decision-making, focusing on its accuracy and completeness in generating diagnostic hypotheses, differential diagnoses, and diagnostic workups across diverse clinical cases. Methods: A collection of clinical laboratory cases, each comprising a comprehensive case history and laboratory findings, was analyzed. DeepSeek-R1 was queried independently three times for each case, with three specific questions regarding diagnosis, differential diagnoses, and diagnostic tests. The outputs were assessed for accuracy and completeness by senior clinical laboratory physicians. Results: DeepSeek-R1 achieved an overall accuracy of 72.9% (95% CI [69.9%, 75.7%]) and completeness of 73.4% (95% CI [70.5%, 76.2%]). Performance varied by question type: the highest accuracy was observed for diagnostic hypotheses (85.7%, 95% CI [81.2%, 89.2%]) and the lowest for differential diagnoses (55.0%, 95% CI [49.3%, 60.5%]). Notable variations in performance were also seen across disease categories, with the best performance observed in genetic and obstetric diagnostics (accuracy 93.1%, 95% CI [84.0%, 97.3%]; completeness 86.1%, 95% CI [76.4%, 92.3%]). Conclusion: DeepSeek-R1 demonstrates potential as a decision-support tool in clinical laboratory medicine, particularly in generating diagnostic hypotheses and recommending diagnostic workups. However, its performance in differential diagnosis and handling specific clinical nuances remains limited.
Future work should focus on expanding training data, integrating clinical ontologies, and incorporating physician feedback to improve real-world applicability. DeepSeek-R1 and forthcoming versions may prove to be promising tools for both non-medical users and professionals in clinical laboratory diagnostics.
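The abstract reports proportions with 95% confidence intervals (e.g., overall accuracy 72.9%, 95% CI [69.9%, 75.7%]). The paper's exact CI method is not stated here; a common choice for binomial proportions is the Wilson score interval, sketched below with a purely hypothetical count (73 accurate answers out of 100 ratings), not the study's actual data.

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to a 95% confidence level.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical example: 73 accurate answers out of 100 ratings
lo, hi = wilson_ci(73, 100)
```

Unlike the simpler Wald interval, the Wilson interval stays within [0, 1] and behaves better for proportions near the extremes, which matters for high-accuracy categories such as the 93.1% reported for genetic and obstetric diagnostics.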
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,545 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,436 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,935 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,589 citations