Missing data completion method based on KNN and random forest
1 December 2021
Songyu Zhang, Yuchen Zhou, Jinghua Yan, Fanliang Bu
Proceedings Volume 12079, Second IYSF Academic Symposium on Artificial Intelligence and Computer Engineering; 120791M (2021) https://doi.org/10.1117/12.2622876
Event: 2nd IYSF Academic Symposium on Artificial Intelligence and Computer Engineering, 2021, Xi'an, China
Abstract
Missing data is one of the more difficult problems in data mining, because missing attribute values can seriously affect the results of data analysis. Most missing-value imputation methods do not take into account the relationship between the missing attribute and the other attributes in the sample. To address this problem, this paper proposes a missing data completion algorithm based on KNN and random forest (RF-KNN). The algorithm first selects a K value using a KNN model and uses it as one of the parameters for building the random forest. Then, according to the attribute characteristics of the random forest, an attribute classification prediction model is constructed to complete the missing attribute values effectively. Model tests and evaluation on public data sets show that RF-KNN compares favorably with the traditional KNN algorithm and random forest in more than one respect. The method is applicable to missing-value completion in a variety of data domains and can effectively improve the efficiency of data mining analysis.
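The two-stage idea described in the abstract (choose K with a KNN model, then use it as a parameter when building a random forest that predicts the missing attribute) can be illustrated with a minimal sketch. The abstract does not say which random forest parameter receives K, how K is selected, or which library was used, so everything below is an assumption made for illustration: scikit-learn as the library, cross-validated accuracy as the selection criterion, `min_samples_leaf` as the parameter that takes K, and the helper names `choose_k` and `rf_knn_impute`. The data are synthetic and are not the paper's data sets.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def choose_k(X, y, candidates=(1, 3, 5, 7, 9), cv=3):
    """Return the candidate K with the best cross-validated KNN accuracy.

    ASSUMPTION: the paper's K-selection criterion is not given in the
    abstract; cross-validated accuracy is used here as a stand-in.
    """
    scores = [
        cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=cv).mean()
        for k in candidates
    ]
    return candidates[int(np.argmax(scores))]


def rf_knn_impute(data, target_col, candidates=(1, 3, 5, 7, 9), random_state=0):
    """Fill NaNs in one categorical column using the other, complete columns."""
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data[:, target_col])
    features = np.delete(data, target_col, axis=1)

    # Rows where the target attribute is observed form the training set.
    X_known = features[~missing]
    y_known = data[~missing, target_col].astype(int)

    # Stage 1: pick K with a KNN model.
    k = choose_k(X_known, y_known, candidates)

    # Stage 2: build the forest with K as one of its parameters.
    # ASSUMPTION: K is mapped to min_samples_leaf; the abstract does not
    # name the parameter, so this choice is illustrative only.
    forest = RandomForestClassifier(
        n_estimators=100, min_samples_leaf=k, random_state=random_state
    )
    forest.fit(X_known, y_known)

    # Predict the missing attribute values and write them into a copy.
    completed = data.copy()
    if missing.any():
        completed[missing, target_col] = forest.predict(features[missing])
    return completed, k


# Synthetic example: three numeric attributes and one binary attribute
# (column 3) that depends on the first two and has 20 values removed.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
labels = (X[:, 0] + X[:, 1] > 0).astype(float)
table = np.column_stack([X, labels])
table[rng.choice(120, size=20, replace=False), 3] = np.nan

completed, k = rf_knn_impute(table, target_col=3)
```

Here the forest learns the missing attribute from the other attributes of each sample, which is the relationship the abstract says most imputation methods ignore. The sketch handles a single categorical column with complete feature columns; extending it to several incomplete columns would need an additional strategy that the abstract does not describe.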
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Songyu Zhang, Yuchen Zhou, Jinghua Yan, and Fanliang Bu "Missing data completion method based on KNN and random forest", Proc. SPIE 12079, Second IYSF Academic Symposium on Artificial Intelligence and Computer Engineering, 120791M (1 December 2021); https://doi.org/10.1117/12.2622876
KEYWORDS
Data modeling
Statistical modeling
Astatine
Atmospheric modeling
Data mining
Data processing
Optimization (mathematics)