OpenAlex · Updated hourly · Last updated: 23 Apr 2026, 19:19

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Noise, Distraction, and Mitigation: An Analysis of RAG Failure Modes in Medical Question Answering

2026 · 0 citations
Open full text at the publisher

Citations: 0

Authors: 4

Year: 2026

Abstract

Retrieval-Augmented Generation (RAG) augments Large Language Models (LLMs) with external knowledge sources, but how well it works in the medical domain remains poorly understood. In this work, we evaluate MMed-Llama-3-8B, a state-of-the-art medical LLM, on the 500-question PubMedQA benchmark and find a significant performance degradation when a RAG system is integrated: with retrieved documents supplied, the baseline accuracy of 68.8 % drops to as low as 17.6 %. To trace the source of error, we analyzed in detail 165 cases in which RAG degraded performance. The two dominant failure modes were retrieved documents that distracted the model from the relevant context (41.8 % of cases) and retrieved documents that contradicted the model's knowledge (37.6 % of cases). As a mitigation, we devised an improved prompting strategy that raises the accuracy of the RAG setup to 59.6 %, a 186 % relative improvement over naive RAG. Our findings show that in domains such as medicine, merely applying RAG is insufficient; models must be able to manage the noise inherent in external knowledge sources.
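The mitigation described in the abstract is a prompting strategy for making the model robust to noisy retrieval. A minimal sketch of what such a noise-aware RAG prompt could look like is shown below; the template wording and function name are hypothetical illustrations, not the authors' actual prompt.

```python
# Hypothetical sketch of a noise-aware RAG prompt, in the spirit of the
# mitigation the abstract describes (not the authors' exact template).

def build_noise_aware_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that warns the model that the retrieved passages
    may be irrelevant or contradictory, so it can fall back on its own
    internal medical knowledge instead of being distracted."""
    context = "\n\n".join(
        f"[Passage {i + 1}]\n{p}" for i, p in enumerate(passages)
    )
    return (
        "The passages below were retrieved automatically and may be "
        "irrelevant, noisy, or contradict established medical knowledge. "
        "Use them only if they clearly answer the question; otherwise "
        "rely on your own medical knowledge.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer with yes, no, or maybe:"
    )

# Example usage with PubMedQA-style inputs (answers there are yes/no/maybe):
prompt = build_noise_aware_prompt(
    "Does vitamin D supplementation reduce fracture risk?",
    ["Vitamin D is a fat-soluble vitamin ...",
     "An unrelated passage about dietary calcium ..."],
)
print(prompt)
```

The final instruction constrains the output to the yes/no/maybe label set used by PubMedQA, which makes accuracy straightforward to score against the benchmark's gold labels.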

Similar works

Authors

Institutions

Topics

Topic Modeling · Artificial Intelligence in Healthcare and Education · Expert finding and Q&A systems