OpenAlex · Updated hourly · Last updated: 8 April 2026, 16:12

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

UPxSocio at NTCIR-18 MedNLP-CHAT Task: Similarity-Based Few-Shot Example Selection for Prompt-Based Detection

2025 · 1 citation · Institutional Repositories DataBase (IRDB) · Open Access

Citations: 1 · Authors: 7 · Year: 2025

Abstract

This paper presents our submission to the MedNLP-CHAT Task at NTCIR-18, which focuses on detecting medical, ethical, and legal risks in chatbot-generated responses. We propose a two-step prompt-based classification framework using the Gemini-1.5-flash model. The method first generates support statements to guide reasoning, which are then integrated into a few-shot prompt for final classification. We evaluated our approach on the English versions of the Japanese and German subtasks, submitting two systems per subtask that varied in example selection strategy and label distribution. Our systems achieved strong performance in detecting medical risks—particularly in the German subtask—while ethical and legal risks were more challenging. To better understand the design factors influencing performance, we conducted ablation studies across 24 prompt variants. Logistic regression and CHAID analyses revealed that accuracy depends on complex interactions between subtask language, example similarity, actual label, and selection method. Higher similarity improves classification of risk-present cases but harms performance on risk-absent cases, indicating a trade-off between recall and false positives. The $k$-nearest method was more effective under high similarity, while $k$-spread offered balanced results across classes. Although the two-step prompting strategy did not show a statistically significant advantage overall, the best-performing configuration used five support statements, with diminishing gains beyond that. Our findings suggest that optimized prompt design, particularly with controlled support and example selection, can improve risk detection without requiring large-scale training or high computational resources.
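The contrast between the two example-selection strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not define either method, so the definitions here are assumptions: `k_nearest` takes the k candidates most similar to the query, and `k_spread` takes k candidates evenly spaced across the full similarity ranking. Cosine similarity over toy vectors stands in for whatever similarity measure the authors actually used, and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(query, pool):
    """Return indices into pool, ordered from most to least similar to query."""
    return sorted(range(len(pool)), key=lambda i: cosine(query, pool[i]), reverse=True)


def k_nearest(query, pool, k):
    """Assumed definition: the k most similar candidates."""
    return rank_by_similarity(query, pool)[:k]


def k_spread(query, pool, k):
    """Assumed definition: k candidates evenly spaced over the similarity ranking,
    so the prompt mixes close and distant examples."""
    ranked = rank_by_similarity(query, pool)
    if k >= len(ranked):
        return ranked
    if k == 1:
        return ranked[:1]
    step = (len(ranked) - 1) / (k - 1)
    return [ranked[round(i * step)] for i in range(k)]


# Toy embeddings: index 0 matches the query direction, index 4 is orthogonal to it.
query = [1.0, 0.0]
pool = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [0.0, 1.0]]

nearest = k_nearest(query, pool, 3)
spread = k_spread(query, pool, 3)
```

With these toy vectors, `nearest` holds the three closest candidates, while `spread` covers the closest, the middle, and the farthest one. Under the assumed definitions, this mirrors the trade-off reported in the abstract: examples concentrated near the query versus examples balanced across the similarity range.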

Topics

Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Topic Modeling