A Review of Feature Selection Techniques for Medical Datasets
DOI:
https://doi.org/10.71229/93rpkv96
Keywords:
Feature selection, Filter methods, Wrapper methods, Embedded methods.
Abstract
Feature selection is an important part of machine learning, especially in medical research, where datasets are often high-dimensional. Selecting relevant features can improve the accuracy, interpretability, and efficiency of prediction models. Feature selection methods fall into three categories: filter, wrapper, and embedded methods. In this paper, we review these methods as applied to medical datasets. For each category, we describe the basic idea behind the most popular algorithms and discuss their advantages, efficiency, and disadvantages.
The paper also discusses challenges that medical datasets pose for feature selection: data complexity, sample size, and class imbalance. Finally, the analysis offers guidance on the most effective feature selection strategies and outlines possible directions for future research in the medical field. This review is intended as a practical guide for researchers and practitioners who need to choose the best features when optimizing machine learning models for medical applications.
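To make the three categories concrete, the following is a minimal sketch rather than a method taken from the paper. It assumes scikit-learn is available and uses its bundled breast cancer dataset. The specific algorithms (ANOVA F-test, recursive feature elimination, an L1-penalized linear model), the subset size `K`, and the regularization strength `C` are illustrative choices, not recommendations from the review.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Public diagnostic dataset bundled with scikit-learn (569 samples, 30 features).
data = load_breast_cancer()
X = StandardScaler().fit_transform(data.data)
y = data.target
K = 5  # assumed subset size, chosen only for illustration

# Filter: score every feature independently of any model (ANOVA F-test).
filter_mask = SelectKBest(score_func=f_classif, k=K).fit(X, y).get_support()

# Wrapper: refit a classifier repeatedly, dropping the weakest feature each round.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=K)
wrapper_mask = rfe.fit(X, y).get_support()

# Embedded: an L1 penalty drives some coefficients to zero during training.
l1_model = LinearSVC(penalty="l1", dual=False, C=0.05, max_iter=5000)
embedded_mask = SelectFromModel(l1_model).fit(X, y).get_support()

for name, mask in [("filter", filter_mask),
                   ("wrapper", wrapper_mask),
                   ("embedded", embedded_mask)]:
    print(f"{name}: {list(data.feature_names[mask])}")
```

The three masks generally differ, which is one reason comparative reviews of this kind are useful. For an honest performance estimate, scaling and selection should be fitted inside each cross-validation fold rather than on the full dataset as done here for brevity.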
License
Copyright (c) 2026 Al-Noor Journal of Engineering Management and Computer Science

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.





