OpenAlex · Updated hourly · Last updated: 23.03.2026, 04:53

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Bridging Large Language Models and Single-Cell Transcriptomics in Dissecting Selective Motor Neuron Vulnerability.

2025 · 0 citations · PubMed

Citations: 0 · Authors: 6 · Year: 2025

Abstract

Understanding cell identity and function through single-cell level sequencing data remains a key challenge in computational biology. We present a novel framework that leverages gene-specific textual annotations from the NCBI Gene database to generate biologically contextualized cell embeddings. For each cell in a single-cell RNA sequencing (scRNA-seq) dataset, we rank genes by expression level, retrieve their corresponding NCBI gene descriptions, and transform these descriptions into vector embedding representations using large language models (LLMs). The models used include OpenAI's text-embedding-ada-002, text-embedding-3-small, and text-embedding-3-large (Jan 2024), as well as domain-specific models BioBERT and SciBERT. Embeddings are computed via an expression-weighted average across the top-N most highly expressed genes in each cell, providing a compact, semantically rich representation. This multimodal strategy bridges structured biological data with state-of-the-art language modeling, enabling more interpretable downstream applications such as cell type clustering, cell vulnerability dissection, and trajectory inference.
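The expression-weighted averaging step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `cell_embedding`, the toy gene names, and the two-dimensional vectors are hypothetical stand-ins for LLM embeddings of NCBI gene descriptions. Normalizing by the sum of the weights and skipping unexpressed genes are assumptions, since the abstract does not state either detail or the value of N.

```python
import numpy as np


def cell_embedding(expression, gene_embeddings, top_n=3):
    """Expression-weighted average of the embeddings of the top-N expressed genes.

    expression:      dict mapping gene symbol -> expression value for one cell
    gene_embeddings: dict mapping gene symbol -> 1-D numpy array
                     (assumed to be the embedding of that gene's description)
    """
    # Keep only expressed genes for which an embedding is available (assumption).
    candidates = [
        (gene, value)
        for gene, value in expression.items()
        if value > 0 and gene in gene_embeddings
    ]
    if not candidates:
        raise ValueError("no expressed genes with an available embedding")

    # Rank genes by expression (highest first) and keep the top N.
    top = sorted(candidates, key=lambda pair: pair[1], reverse=True)[:top_n]

    weights = np.array([value for _, value in top], dtype=float)
    vectors = np.stack([gene_embeddings[gene] for gene, _ in top])

    # Weighted mean over genes; np.average divides by the sum of the weights.
    return np.average(vectors, axis=0, weights=weights)


# Toy data: hypothetical genes with made-up 2-D "description embeddings".
gene_embeddings = {
    "GENE_A": np.array([1.0, 0.0]),
    "GENE_B": np.array([0.0, 1.0]),
    "GENE_C": np.array([1.0, 1.0]),
}
cell = {"GENE_A": 3.0, "GENE_B": 1.0, "GENE_C": 0.5}

embedding = cell_embedding(cell, gene_embeddings, top_n=2)
print(embedding)  # [0.75 0.25]
```

With `top_n=2`, only GENE_A and GENE_B contribute, weighted 3:1. In a real pipeline the dictionary of vectors would presumably be computed once per gene and reused across all cells, so the per-cell cost is just a sort and a weighted mean.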

Topics

Epigenetics and DNA Methylation · Topic Modeling · Artificial Intelligence in Healthcare and Education