This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Study and analysis of feature selection problems and impact of bias in machine learning disease prediction models
Citations: 1
Authors: 4
Year: 2024
Abstract
Machine learning (ML), a branch of artificial intelligence, is now used in nearly every field, and medicine is one of them. In medical science, ML techniques aim to improve patient care by collecting and analyzing patient data and by designing advanced, intelligent tools and/or devices that detect disease from collective experience. ML technology detects patterns associated with specific diseases by analyzing large datasets of patient records, such as diabetes, blood pressure, and cholesterol measurements, X-rays, MRIs, CT scans and other imaging data, and genomic information. ML algorithms compute the primary symptoms of a disease, and the disease is identified from these computations. This requires a sufficient dataset and/or set of features. What an ML model learns depends on the underlying features used to identify the problem at hand. The fairness of an ML algorithm depends on which symptoms are selected to determine a disease. Selecting features for ML models is therefore an important task: too few or too many features can make the model underfit or overfit. Incorrectly chosen features can introduce bias into the model, which can greatly affect its accuracy. If the model's bias is not properly tuned, whether too high or too low, its predictions do not capture the underlying pattern. Diseases arise in different circumstances, and each disease has its own characteristics, so covering all the basic parameters of every disease is very difficult. If a basic attribute is missed, and/or an attribute unrelated to the disease is captured, the model's results may suffer. In this paper, the feature selection problem and the effect of bias are analyzed using the Support Vector Machine (SVM) and Logistic Regression (LR) algorithms.
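The abstract's claim that missing a basic attribute, or capturing an unrelated one, degrades a model can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' experiment: the data are synthetic, the names `make_data`, `fit_logreg`, and `evaluate` are hypothetical, and only a plain gradient-descent logistic regression is shown (the paper also uses SVM, which is omitted here).

```python
import math
import random

def make_data(n, rng):
    """Synthetic 'patients' (assumed data): column 0 is informative, column 1 is pure noise."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        informative = rng.gauss(2.0 * y - 1.0, 1.0)  # mean -1 for class 0, +1 for class 1
        unrelated = rng.gauss(0.0, 1.0)              # carries no information about y
        rows.append([informative, unrelated])
        labels.append(y)
    return rows, labels

def select(rows, cols):
    """Keep only the chosen feature columns."""
    return [[r[c] for c in cols] for r in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logreg(X, y, lr=0.1, epochs=200):
    """Batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(w, b, X, y):
    hits = sum((predict_proba(w, b, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(y)

def evaluate(cols, seed=0):
    """Train on one sample and report accuracy on a fresh sample, using only `cols`."""
    rng = random.Random(seed)
    X_train, y_train = make_data(400, rng)
    X_test, y_test = make_data(400, rng)
    w, b = fit_logreg(select(X_train, cols), y_train)
    return accuracy(w, b, select(X_test, cols), y_test)

if __name__ == "__main__":
    for name, cols in [("informative only", [0]), ("unrelated only", [1]), ("both", [0, 1])]:
        print(f"{name}: test accuracy = {evaluate(cols):.3f}")
```

Under these assumptions, a model given only the unrelated attribute should perform close to chance, while one that keeps the informative attribute should do clearly better. Real clinical data would need proper validation and a feature selection procedure rather than hand-picked columns.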
Related works
Biostatistical Analysis
1996 · 35,445 citations
UCI Machine Learning Repository
2007 · 24,290 citations
An introduction to ROC analysis
2005 · 20,602 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,103 citations
A method of comparing the areas under receiver operating characteristic curves derived from the same cases.
1983 · 7,061 citations