This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
An Integrated Two-Layered Voting (TLV) Framework for Coronary Artery Disease Prediction Using Machine Learning Classifiers
Citations: 27
Authors: 2
Year: 2024
Abstract
Cardiovascular problems have emerged as a significant concern, adversely impacting individuals across all age groups. Several recent studies have used machine learning (ML) techniques to design decision-making systems for the large volumes of data in the medical sector. Although these works obtained promising results, most focused on small datasets. Since dataset size affects algorithm performance, this study used two datasets: Kaggle’s heart disease dataset of over 70,000 records and UCI’s heart disease dataset of 1025 records. In addition to the existing features, three new features are introduced to improve the results: Pulse Pressure (PP), Body Mass Index (BMI), and Mean Arterial Pressure (MAP). This paper proposes the TLV (Two-Layer Voting) model, an ensemble method that combines hard and soft voting. In layer 1, features are shortlisted by soft and hard voting over three statistical methods: the ANOVA F-test, the Chi-squared test, and Mutual Information. In layer 2, the performance of soft voting and hard voting is compared over an ensemble of Multi-Layer Perceptron, Decision Tree, Support Vector Classifier, and Random Forest algorithms, whose hyperparameters are tuned using the GridSearchCV method. On UCI’s heart disease dataset and Kaggle’s CVD dataset, the proposed TLV methodology with soft voting achieved the highest accuracy, 99.03% and 88.09%, respectively. The proposed model significantly outperforms existing CAD prediction studies.
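As a rough illustration of two ideas named in the abstract, the sketch below computes the three derived features and contrasts hard voting with soft voting. It is not the authors' implementation. The function names are hypothetical. The formulas are the common textbook definitions (PP as systolic minus diastolic pressure, MAP as diastolic pressure plus one third of PP, BMI as weight in kilograms over height in metres squared), and it is an assumption that the paper uses these same forms. The classifier outputs in the usage lines are made-up numbers chosen only to show the two voting rules disagreeing.

```python
from collections import Counter


def derived_features(systolic, diastolic, weight_kg, height_cm):
    """Return pulse pressure, mean arterial pressure, and BMI for one record.

    Assumes the common textbook formulas; the paper may define them differently.
    """
    pp = systolic - diastolic                # pulse pressure (mmHg)
    mean_ap = diastolic + pp / 3.0           # mean arterial pressure (mmHg)
    height_m = height_cm / 100.0
    bmi = weight_kg / (height_m ** 2)        # body mass index (kg/m^2)
    return {"pp": pp, "map": mean_ap, "bmi": bmi}


def hard_vote(labels):
    """Majority vote over the class labels predicted by each classifier."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average per-class probabilities across classifiers, return the best class index."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Illustrative (made-up) outputs of three classifiers for one patient:
# each row is [P(no disease), P(disease)].
probs = [[0.45, 0.55], [0.45, 0.55], [0.90, 0.10]]
labels = [max(range(2), key=p.__getitem__) for p in probs]  # per-model argmax

features = derived_features(systolic=120, diastolic=80, weight_kg=70, height_cm=175)
hard_result = hard_vote(labels)
soft_result = soft_vote(probs)
```

In this toy case two weakly confident classifiers outvote one strongly confident classifier under hard voting, while soft voting follows the averaged probabilities, which is one reason the two schemes can yield different accuracies. In scikit-learn, the same distinction is exposed by the `voting="hard"` and `voting="soft"` options of `VotingClassifier`.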
Similar works
Biostatistical Analysis
1996 · 35,449 citations
UCI Machine Learning Repository
2007 · 24,319 citations
An introduction to ROC analysis
2005 · 20,943 citations
Prediction of Coronary Heart Disease Using Risk Factor Categories
1998 · 9,604 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,181 citations