This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Detecting Duchene Muscular Dystrophy by Utilizing Text Mining and Machine Learning Techniques for Timely Diagnosis through NLP
Citations: 0
Authors: 6
Year: 2024
Abstract
This study presents an approach, building on earlier work, to the early detection of Duchenne Muscular Dystrophy (DMD) using machine learning techniques, with a primary focus on Natural Language Processing (NLP). The approach combines text mining with machine learning. The research, conducted on a large dataset of gene-related information from the NCBI Gene database, introduces a novel Genomic Transformer Model. This model, built on self-attention, predicts DMD with a reported accuracy of 98 percent. Its self-attention mechanism is tailored to capture long-range correlations within complex genomic sequences, which the authors credit for a considerable improvement in predictive accuracy over the XGBoost Classifier, LASSO Regression, Ridge Regression and ElasticNet baselines. The model encodes gene sequences into continuous vectors, which enables the extraction of relevant genetic patterns and leads to substantially more accurate predictions of DMD-related features. The Genomic Transformer Model also adapts well to high-dimensional feature spaces, a common challenge in gene-related NLP tasks. Its attention mechanism lets it focus selectively on the most informative elements within gene sequences, which helps manage the inherent complexity of genetic data. By concentrating on these segments, the model identifies genetic variations and patterns linked to DMD, particularly within the dystrophin gene. Furthermore, the model is reported to generalize better, significantly reducing overfitting and the risk of model bias.
In precision medicine, where accurate predictions are essential for early diagnosis and personalized treatment strategies, this result holds considerable promise.
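The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: each base of a gene sequence is encoded as a continuous vector, and scaled dot-product self-attention lets every position weigh every other position, however far apart they are. The embedding size, the random (untrained) embeddings, the identity query/key/value projections and the toy sequence are all assumptions made for illustration; none of them is taken from the paper.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def make_embeddings(dim, seed=0):
    """Assign a random, untrained vector to each nucleotide.

    Hypothetical stand-in for the learned embeddings a real model would use.
    """
    rng = random.Random(seed)
    return {n: [rng.gauss(0.0, 1.0) for _ in range(dim)] for n in NUCLEOTIDES}


def encode(sequence, embeddings):
    """Map a nucleotide string to a list of continuous vectors."""
    return [embeddings[base] for base in sequence]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention.

    Assumption: queries, keys and values are the input vectors themselves
    (identity projections), to keep the sketch short.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    weights, outputs = [], []
    for query in vectors:
        # Similarity of this position to every position, near or far.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of all value vectors.
        outputs.append([sum(w * v[d] for w, v in zip(row, vectors))
                        for d in range(dim)])
    return outputs, weights


sequence = "ACGTTGCAACGT"  # arbitrary toy sequence, not real DMD data
embeddings = make_embeddings(dim=8)
outputs, weights = self_attention(encode(sequence, embeddings))
```

Each row of `weights` is a probability distribution over all positions in the sequence, which is what allows distant bases to influence one another directly. A trained transformer would add learned projections, multiple heads, positional information and a classification layer on top.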
Related works
Biostatistical Analysis
1996 · 35,445 citations
UCI Machine Learning Repository
2007 · 24,290 citations
An introduction to ROC analysis
2005 · 20,602 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,103 citations
A method of comparing the areas under receiver operating characteristic curves derived from the same cases.
1983 · 7,061 citations