https://doi.org/10.1140/epjs/s11734-025-01838-y
Regular Article
Enhancing anemia diagnosis using ensemble machine learning and feature selection techniques on CBC data
Faculty of Engineering, Computer Engineering, Cankiri Karatekin University, Cankiri, Turkey
mustafateke@karatekin.edu.tr
Received: 21 March 2025
Accepted: 27 June 2025
Published online: 8 August 2025
Anemia is a prevalent health condition that requires prompt and accurate diagnosis for effective management. This study explores the use of ensemble machine learning methods combined with statistical feature selection techniques to enhance the diagnosis of anemia from complete blood count (CBC) data. A dataset of 1280 patient records with 14 hematological features across nine anemia classes was analyzed using k-Nearest Neighbors (k-NN), Support Vector Machine (SVM), and Random Forest algorithms, as well as ensemble approaches. Feature selection methods (ANOVA, Chi-square, and Kruskal–Wallis) were used to identify the most informative attributes, which improved model performance. The Synthetic Minority Over-sampling Technique (SMOTE) further improved accuracy by balancing the class distribution, with accuracy rates reaching up to 99.67%. The findings demonstrate that ensemble learning, combined with effective feature selection and data augmentation, enables early, accurate, and robust diagnosis of anemia and can support improved clinical practice.
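As a rough illustration of two of the steps named in the abstract, the sketch below computes a one-way ANOVA F-score for ranking features and performs SMOTE-style oversampling by interpolating between same-class neighbours. It is a minimal standard-library sketch, not the article's implementation: the function names `anova_f` and `smote`, the parameter `k`, and the toy hemoglobin/MCV values are all assumptions made for illustration.

```python
import random
from collections import Counter


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {lab: sum(g) / len(g) for lab, g in groups.items()}
    ss_between = sum(len(g) * (means[lab] - grand_mean) ** 2
                     for lab, g in groups.items())
    ss_within = sum((v - means[lab]) ** 2
                    for lab, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf")  # classes are perfectly separated by this feature
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def smote(X, y, k=2, seed=0):
    """Oversample each minority class up to the size of the largest class.

    Each synthetic row lies on the segment between a sampled row and one of
    its k nearest neighbours from the same class (the core SMOTE idea).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_new, y_new = list(X), list(y)
    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # a single row has no neighbour to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> mate
            X_new.append([a + gap * (b - a) for a, b in zip(base, mate)])
            y_new.append(label)
    return X_new, y_new


# Toy data, invented for illustration: [hemoglobin g/dL, MCV fL]
X = [[13.5, 88.0], [14.1, 90.0], [13.8, 86.0], [14.4, 91.0],
     [9.2, 70.0], [8.8, 68.0]]
y = ["normal", "normal", "normal", "normal", "anemia", "anemia"]

# Higher F-score means the feature separates the classes more strongly.
scores = [anova_f([row[j] for row in X], y) for j in range(len(X[0]))]
X_bal, y_bal = smote(X, y)
print("F-scores per feature:", scores)
print("class counts after oversampling:", Counter(y_bal))
```

In practice, oversampling would normally be applied to the training split only, so that synthetic rows derived from test samples do not inflate the reported accuracy.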
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1140/epjs/s11734-025-01838-y.
Copyright comment Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
© The Author(s), under exclusive licence to EDP Sciences, Springer-Verlag GmbH Germany, part of Springer Nature 2025