Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Medical Health Big Data Classification Based on KNN Classification Algorithm
316
Zitationen
2
Autoren
2019
Jahr
Abstract
The rapid development of information technology has led to the development of medical informatization in the direction of intelligence. Medical health big data provides a basic data resource guarantee for medical service intelligence and smart healthcare. The classification of medical health big data is of great significance for the intelligentization of medical information. Due to the simplicity of KNN (K-Nearest Neighbor) classification algorithm, it has been widely used in many fields. However, when the sample size is large and the feature attributes are large, the efficiency of the KNN algorithm classification will be greatly reduced. This paper proposes an improved KNN algorithm and compares it with the traditional KNN algorithm. The classification is performed in the query instance neighborhood of the conventional KNN classifier, and weights are assigned to each class. The algorithm considers the class distribution around the query instance to ensure that the assigned weight does not adversely affect the outliers. Aiming at the shortcomings of traditional KNN algorithm in processing large data sets, this paper proposes an improved KNN algorithm based on cluster denoising and density cropping. The algorithm performs denoising processing by clustering, and improves the classification efficiency of KNN algorithm by speeding up the search speed of K-nearest neighbors, while maintaining the classification accuracy of KNN algorithm. The experimental results show that the proposed algorithm can effectively improve the classification efficiency of KNN algorithm in processing large data sets, and maintain the classification accuracy of KNN algorithm well, and has good classification performance.
Ähnliche Arbeiten
Biostatistical Analysis
1996 · 35.445 Zit.
UCI Machine Learning Repository
2007 · 24.290 Zit.
An introduction to ROC analysis
2005 · 20.660 Zit.
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7.117 Zit.
A method of comparing the areas under receiver operating characteristic curves derived from the same cases.
1983 · 7.063 Zit.