This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
forgeNet: a graph deep neural network model using tree-based ensemble classifiers for feature graph construction
Citations: 30
Authors: 2
Year: 2020
Abstract
MOTIVATION: A unique challenge in predictive model building for omics data has been the small number of samples (n) versus the large number of features (p). This 'n≪p' property makes disease outcome classification with deep learning techniques difficult. Sparse learning that incorporates known functional relationships between the biological units, as in the graph-embedded deep feedforward network (GEDFN) model, has been one solution to this issue. However, such methods require an existing feature graph, and potential mis-specification of the feature graph can be harmful to classification and feature selection. RESULTS: To address this limitation and develop a robust classification model that does not rely on external knowledge, we propose a forest graph-embedded deep feedforward network (forgeNet) model, which integrates the GEDFN architecture with a forest feature graph extractor, so that the feature graph can be learned in a supervised manner and constructed specifically for a given prediction task. To validate the method's capability, we tested the forgeNet model on both synthetic and real datasets. The resulting high classification accuracy suggests that the method is a valuable addition to sparse deep learning models for omics data. AVAILABILITY AND IMPLEMENTATION: The method is available at https://github.com/yunchuankong/forgeNet. CONTACT: tianwei.yu@emory.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
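The abstract does not say how the forest feature graph extractor turns a tree ensemble into a graph. As a rough illustration only, the sketch below assumes one plausible reading: two features are connected whenever one is used to split a node and the other is used to split a direct child of that node in some tree. The tree encoding, the function names `tree_edges` and `forest_feature_graph`, and the toy forest are all hypothetical and are not taken from the forgeNet repository.

```python
from itertools import chain

# Hypothetical tree encoding (an assumption for this sketch):
# a split node is (feature_index, left_subtree, right_subtree); a leaf is None.

def tree_edges(node):
    """Yield (parent_feature, child_feature) for every split-to-split link."""
    if node is None:
        return
    feature, left, right = node
    for child in (left, right):
        if child is not None:
            yield (feature, child[0])
            yield from tree_edges(child)

def forest_feature_graph(forest, n_features):
    """Union the edges of all trees into a symmetric 0/1 adjacency matrix."""
    adj = [[0] * n_features for _ in range(n_features)]
    for parent, child in chain.from_iterable(tree_edges(t) for t in forest):
        if parent != child:  # ignore self-loops from repeated splits
            adj[parent][child] = 1
            adj[child][parent] = 1
    return adj

# Toy forest of two trees over five features (indices 0..4).
toy_forest = [
    (0, (2, None, None), (3, None, None)),  # splits on 0, then on 2 and 3
    (2, (1, None, None), None),             # splits on 2, then on 1
]
adjacency = forest_feature_graph(toy_forest, n_features=5)
```

In this toy case feature 4 never appears in a split and so stays isolated in the graph. Because the trees would be fitted to the class labels, a graph built this way is specific to the prediction task, which matches the "learned in a supervised manner" wording of the abstract.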
Related works
Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles
2005 · 56,074 citations
Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks
2003 · 53,748 citations
Gene Ontology: tool for the unification of biology
2000 · 44,346 citations
The Protein Data Bank
2000 · 39,515 citations
KEGG: Kyoto Encyclopedia of Genes and Genomes
2000 · 38,789 citations