This is an overview page with metadata for this scientific work. The full article is available from the publisher.
The Role of Domain-Specific Models for Synthetic Data Generation with Iterative Prompt Optimization
Citations: 0
Authors: 4
Year: 2025
Abstract
Synthetic data generation is essential for addressing data scarcity and privacy constraints in medical AI. Large language models (LLMs) can be used to generate medical texts, but their output quality depends on the prompting approach. This study investigates the effectiveness of integrating specialized, medical-terminology-focused LLMs with IPROPS, an iterative prompt refinement framework, for the task of generating cardiology discharge letters. We fine-tune Llama on German medical texts and compare its performance to an untuned baseline. Our findings indicate that the fine-tuned model enhances coherence and domain specificity, while iterative prompt refinement significantly reduces the performance gap and can be considered a viable alternative to fine-tuning. A Turing test with physicians confirms that, while synthetic discharge letters achieve high realism, they remain distinguishable from real samples. Code: github.com/UniTylab-HHN/IPROPS-Advanced.
Related Works
"Why Should I Trust You?"
2016 · 14,210 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,586 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Artificial intelligence in healthcare: past, present and future
2017 · 4,382 citations