This is an overview page with metadata for this scientific work. The full article is available from the publisher.
The Role of Domain-Specific Models for Synthetic Data Generation with Iterative Prompt Optimization
Citations: 0
Authors: 4
Year: 2025
Abstract
Synthetic data generation is essential for addressing data scarcity and privacy constraints in medical AI. Large language models (LLMs) can be used to generate medical texts, but their output quality depends on the prompting approach. This study investigates the effectiveness of integrating specialized, medical-terminology-focused LLMs with IPROPS, an iterative prompt refinement framework, for the task of generating cardiology discharge letters. We fine-tune Llama on German medical texts and compare its performance to an untuned baseline. Our findings indicate that the fine-tuned model enhances coherence and domain specificity, while iterative prompt refinement significantly reduces the performance gap and can be considered a viable alternative to fine-tuning. A Turing test with physicians confirms that, while synthetic discharge letters achieve high realism, they remain distinguishable from real samples. Code: github.com/UniTylab-HHN/IPROPS-Advanced.
Related Works
"Why Should I Trust You?"
2016 · 14,210 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,586 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Artificial intelligence in healthcare: past, present and future
2017 · 4,382 citations