This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
How much Medical Knowledge do LLMs have? An Evaluation of Medical Knowledge Coverage for LLMs
Citations: 1
Authors: 4
Year: 2025
Abstract
Previous evaluation frameworks for large language models (LLMs) have mostly relied on existing question-answering benchmarks, which are primarily task-oriented rather than knowledge-oriented. In the medical domain, however, the effective deployment of LLMs necessitates a thorough evaluation of their medical knowledge coverage. To this end, we propose a systematic evaluation framework, MedKGEval, to assess the coverage of medical knowledge in LLMs through the lens of medical knowledge graphs (KGs). MedKGEval transforms various levels of knowledge (entity-level, relation-level, and subgraph-level) from the medical KG into distinct groups of question-answer pairs, which serve as comprehensive evaluation benchmarks. In addition to traditional task-oriented evaluations, MedKGEval introduces a novel knowledge-oriented evaluation approach that encompasses the assessment of knowledge coverage across entities, relations, and triples. This multi-aspect evaluation approach allows for a more nuanced understanding of LLMs' knowledge coverage in the medical context. Using these benchmarks, we conduct a systematic evaluation of 11 LLMs from multiple perspectives, revealing insights into their strengths and weaknesses in medical knowledge memorization and reasoning.
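The general idea of turning KG triples into question-answer pairs and then scoring coverage per entity, relation, and triple can be illustrated with a minimal sketch. This is not the MedKGEval implementation: the toy triples, the yes/no question template, the `stub_model` callable, and the coverage definitions (fraction of correctly answered triples, grouped by entity or relation) are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy triples (head, relation, tail); illustrative only, not taken from the paper.
TRIPLES = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "may_cause", "stomach bleeding"),
    ("metformin", "treats", "type 2 diabetes"),
]


def triple_to_question(head, relation, tail):
    """Turn one triple into a yes/no question (hypothetical template)."""
    verb = relation.replace("_", " ")
    return f"Is it true that '{head}' {verb} '{tail}'? Answer yes or no."


def coverage(triples, answer_fn):
    """Score a model on true triples and aggregate at three levels.

    answer_fn is any callable mapping a question string to "yes" or "no".
    Every triple is a true fact, so a "yes" answer counts as covered.
    Assumed definitions: triple coverage is the overall fraction correct;
    entity and relation coverage are the fraction correct among the
    triples that mention that entity or relation.
    """
    per_entity = defaultdict(lambda: [0, 0])    # entity -> [correct, total]
    per_relation = defaultdict(lambda: [0, 0])  # relation -> [correct, total]
    correct_total = 0
    for head, relation, tail in triples:
        answer = answer_fn(triple_to_question(head, relation, tail))
        ok = answer.strip().lower() == "yes"
        correct_total += ok
        for key, table in ((head, per_entity), (tail, per_entity),
                           (relation, per_relation)):
            table[key][0] += ok
            table[key][1] += 1
    return {
        "triple": correct_total / len(triples),
        "entity": {e: c / n for e, (c, n) in per_entity.items()},
        "relation": {r: c / n for r, (c, n) in per_relation.items()},
    }


def stub_model(question):
    """Stand-in for an LLM call: only 'knows' facts phrased with 'treats'."""
    return "yes" if " treats " in question else "no"


if __name__ == "__main__":
    print(coverage(TRIPLES, stub_model))
```

In practice `stub_model` would be replaced by a call to the LLM under evaluation, and a realistic benchmark would also include false statements so that a model answering "yes" to everything does not score perfectly.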