This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Leveraging Open-Source Large Language Models for Non-Technical Researchers for Data Visualization: Performance Analysis in Secondary Data Use Systems (Preprint)
0
Citations
3
Authors
2025
Year
Abstract
<sec> <title>BACKGROUND</title> Medical institutions increasingly prioritize the secondary use of clinical data for research purposes, balancing research advancements with strict data protection regulations. In this context, the German University Medical Center Hamburg-Eppendorf (UKE) has developed the data hotel. This platform enables researchers to access pseudonymized clinical data for retrospective studies without requiring additional ethical or legal clearances. </sec> <sec> <title>OBJECTIVE</title> To enhance the platform’s usability, this study evaluated the integration of a large language model (LLM)-based code chatbot designed to generate Python code for data visualization and analysis, specifically addressing challenges posed by the heterogeneity and complexity of clinical datasets. </sec> <sec> <title>METHODS</title> Three locally installed, open-source LLMs – Mistral-7B-Instruct, Llama3-8B-Instruct, and CodeLlama-7B-Instruct – were assessed for their ability to answer basic and complex research queries using different prompting techniques, including zero-shot, one-shot, few-shot, and chain-of-thought prompting. Performance, including reproducibility and repeatability, was measured against ground truth using different phrasing styles. Phrasings were also translated into German to evaluate the models' language dependency. </sec> <sec> <title>RESULTS</title> Llama3, especially when paired with few-shot prompting, outperformed the other models in generating accurate and complete code for basic queries across two datasets. However, performance declined for complex queries, with accuracy being significantly influenced by the dataset's structure and quality. German prompts yielded lower scores than English ones, highlighting the models' linguistic limitations. The study identified the need for better data preparation, tailored prompting strategies, and advanced software architectural approaches to address dataset-specific challenges. 
</sec> <sec> <title>CONCLUSIONS</title> This work demonstrates the feasibility of integrating LLM-based tools into clinical data platforms, empowering non-technical researchers with a scalable approach for secondary data analysis. It emphasizes the importance of data quality and contextual understanding. Future directions include expanding prompt libraries, incorporating advanced visualization tools, implementing agentic systems, and exploring higher-parameter models to improve versatility and robustness. </sec>
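The few-shot prompting technique named in the METHODS section can be sketched as assembling a prompt from worked question-and-code pairs before appending the researcher's new query. A minimal illustrative sketch follows; the example queries, the `build_few_shot_prompt` helper, and the column names are hypothetical and not taken from the study's actual prompt library.

```python
# Illustrative few-shot prompt assembly for an LLM-based code chatbot.
# All example queries and field names below are made up for demonstration.

FEW_SHOT_EXAMPLES = [
    {
        "query": "Plot the age distribution of patients.",
        "code": (
            "import matplotlib.pyplot as plt\n"
            "df['age'].plot.hist(bins=20)\n"
            "plt.xlabel('Age')\n"
            "plt.show()"
        ),
    },
    {
        "query": "Count patients per diagnosis.",
        "code": "print(df['diagnosis'].value_counts())",
    },
]


def build_few_shot_prompt(user_query: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Pair each example research question with the Python code that
    answers it, then append the new question for the model to complete."""
    parts = ["You are a Python assistant for pseudonymized clinical data analysis."]
    for ex in examples:
        parts.append(f"Question: {ex['query']}\nCode:\n{ex['code']}")
    parts.append(f"Question: {user_query}\nCode:")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt("Plot the length of stay per ward.")
print(prompt)
```

The assembled string would then be sent to a locally hosted model such as Llama3-8B-Instruct; the trailing `Code:` cue encourages the model to respond with executable Python rather than prose.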
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,402 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,270 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,702 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,507 citations