A novel optimization algorithm for the missing data in HCC based on multiple imputation and genetic algorithm |
Paper ID : 1004-ICCI2021 (R1) |
Authors: |
Yasser Salaheldin Mohammed (1, *), Mohamed Hammad (2), Hatem Abdelkader (1); (1) Information Systems, Faculty of Computers and Information, Menoufia University; (2) Information Technology Department, Faculty of Computers and Information |
Abstract: |
Hepatocellular carcinoma (HCC) is a cancer of the liver and one of the diseases most devastating to human health, often leading to death. Discovering HCC early is therefore essential, and early detection is not possible without complete, adequate, and reliable data. Hence, it is imperative to improve missing-data completion so that the detection phase can rely on more trustworthy data. In this research, we propose a novel method that combines multiple imputation with a genetic algorithm to optimize the multiple regression imputation process and obtain the optimum fitness values for missing patient data. We used 583 patient records from a publicly available database to train and evaluate the proposed algorithm, separated into 416 liver patient records and 167 non-liver patient records. The results show that the proposed approach improves the imputation of missing data: it reached a fitness value of 233, whereas the normal equation in multiple imputation gave a maximum fitness value of 92.72. The suggested model could be validated on a larger database and used in HCC laboratories to assist doctors in making an accurate diagnosis. |
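The abstract does not give the chromosome encoding, the genetic operators, or the fitness function, so the following is only a minimal sketch of the general idea: a genetic algorithm searching for regression coefficients that are then used to impute missing values. Everything in it is an assumption made for illustration: the function names (`predict`, `fitness`, `evolve`, `impute`), the synthetic data, the operators, and above all the fitness measure (negative mean squared error on observed rows), which is not the fitness measure reported in the paper.

```python
import random

def predict(coeffs, features):
    # Linear regression prediction: intercept plus weighted sum of features.
    return coeffs[0] + sum(w * x for w, x in zip(coeffs[1:], features))

def fitness(coeffs, X, y):
    # ASSUMED fitness: negative mean squared error on rows with an observed target.
    # This is a stand-in, not the fitness measure used in the paper.
    errors = [(predict(coeffs, row) - target) ** 2 for row, target in zip(X, y)]
    return -sum(errors) / len(errors)

def evolve(X, y, pop_size=40, generations=60, mutation_scale=0.3, rng=None):
    # Hypothetical GA: truncation selection, uniform crossover, Gaussian mutation.
    rng = rng or random.Random(0)
    n_coeffs = len(X[0]) + 1
    population = [[rng.uniform(-5, 5) for _ in range(n_coeffs)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, X, y), reverse=True)
        history.append(fitness(population[0], X, y))
        elite = population[: pop_size // 4]
        children = [list(c) for c in elite[:2]]  # elitism: two best survive unchanged
        while len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(n_coeffs)
            child[i] += rng.gauss(0, mutation_scale)  # mutate one coefficient
            children.append(child)
        population = children
    best = max(population, key=lambda c: fitness(c, X, y))
    history.append(fitness(best, X, y))
    return best, history

def impute(rows, target_index, coeffs):
    # Fill missing targets (None) with the regression prediction.
    filled = []
    for row in rows:
        row = list(row)
        if row[target_index] is None:
            features = [v for j, v in enumerate(row) if j != target_index]
            row[target_index] = predict(coeffs, features)
        filled.append(row)
    return filled

# Synthetic toy data (not the patient records from the paper):
# two features and a target that is missing in every fifth row.
data_rng = random.Random(1)
rows = []
for i in range(30):
    x1, x2 = data_rng.uniform(0, 1), data_rng.uniform(0, 1)
    target = 1.0 + 2.0 * x1 - 1.5 * x2
    rows.append([x1, x2, None if i % 5 == 0 else target])

observed = [r for r in rows if r[2] is not None]
X = [r[:2] for r in observed]
y = [r[2] for r in observed]

best, history = evolve(X, y)
completed = impute(rows, 2, best)
```

Because the two best individuals are carried over unchanged each generation, the best fitness recorded in `history` never decreases. A real multiple-imputation workflow would also generate several completed datasets and pool the results, which this sketch leaves out.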
Keywords: |
HCC; Multiple Imputation; Fitness Value; Multiple Regression; Genetic Algorithm; Missing Data; Optimization |
Status : Paper Accepted |