OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 19.04.2026, 03:27

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Query, Don't Train: Privacy-Preserving Tabular Prediction from EHR Data via SQL Queries

2025·0 Zitationen·ArXiv.orgOpen Access
Volltext beim Verlag öffnen

0

Zitationen

7

Autoren

2025

Jahr

Abstract

Electronic health records (EHRs) contain richly structured, longitudinal data essential for predictive modeling, yet stringent privacy regulations (e.g., HIPAA, GDPR) often restrict access to individual-level records. We introduce \textbf{Query, Don't Train} (QDT): a \textbf{structured-data foundation-model interface} enabling \textbf{tabular inference} via LLM-generated SQL over EHRs. Instead of training on or accessing individual-level examples, QDT uses a large language model (LLM) as a schema-aware query planner to generate privacy-compliant SQL queries from a natural language task description and a test-time input. The model then extracts summary-level population statistics through these SQL queries, and the LLM performs chain-of-thought reasoning over the results to make predictions. This inference-time-only approach enables prediction without supervised model training, ensures interpretability through symbolic, auditable queries, naturally handles missing features without imputation or preprocessing, and effectively manages high-dimensional numerical data to enhance analytical capabilities. We validate QDT on the task of 30-day hospital readmission prediction for Type 2 diabetes patients using a MIMIC-style EHR cohort, achieving F1 = 0.70, which outperforms TabPFN (F1 = 0.68). To our knowledge, this is the first demonstration of LLM-driven, privacy-preserving structured prediction using only schema metadata and aggregate statistics -- offering a scalable, interpretable, and regulation-compliant alternative to conventional foundation-model pipelines.

Ähnliche Arbeiten

Autoren

Themen

Machine Learning in HealthcarePrivacy-Preserving Technologies in DataArtificial Intelligence in Healthcare and Education
Volltext beim Verlag öffnen