A Hybrid Approach to Feature Extraction and Information Gain-Based Reduction for Image Classification

Document Type : Research Paper

Authors

1 Department of Computer Sc. & Engineering, Graphic Era University, Dehradun, India.

2 Prof., Department of Computer Sc. & Engineering, Graphic Era University, Dehradun, India.

10.22059/jitm.2025.102918

Abstract

Image classification is an important task in computer science. Classification has applications in many fields, including spam detection in emails, medical diagnosis, image recognition, sentiment analysis, object detection, weather forecasting, pattern recognition, and security. Image classification groups images according to their labels or characteristics, and its main steps are feature extraction, feature selection, feature reduction, and classification. For this study, a data set of medicinal and non-medicinal flowers was prepared by photographing flowers. The methodology is also evaluated on the Seeds, Wisconsin Diagnostic Breast Cancer, Heart Failure Clinical Records, and Wisconsin Prognostic Breast Cancer data sets from the University of California, Irvine (UCI) repository, where it achieves satisfactory classification results. The proposed methodology offers an efficient feature extraction and selection approach for the data sets under consideration. An information gain-based genetic algorithm reduces the extracted features to an optimized feature set, evaluating the fitness of the features in order to keep the most relevant ones. A neural network then performs classification on the resulting feature subset. Combining feature extraction with feature reduction improves the classification results.
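The feature reduction step can be illustrated with a minimal sketch. The abstract does not specify how the information gain-based genetic algorithm is built, so everything below is an assumption made for illustration: the equal-width binning, the fitness function (summed information gain minus a size penalty), truncation selection, one-point crossover, and all function and parameter names are hypothetical. Feature extraction and the neural network classifier are omitted, and the search assumes at least two features.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(column, labels, bins=4):
    """Information gain of one numeric feature after equal-width binning (assumed scheme)."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant features
    groups = {}
    for value, label in zip(column, labels):
        index = min(int((value - lo) / width), bins - 1)
        groups.setdefault(index, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def fitness(mask, gains, penalty=0.05):
    """Hypothetical fitness: summed gain of the selected features minus a size penalty."""
    return sum(g for g, bit in zip(gains, mask) if bit) - penalty * sum(mask)


def select_features(gains, pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    """Evolve a 0/1 mask over the features with a small generational genetic algorithm."""
    rng = random.Random(seed)
    n = len(gains)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, gains), reverse=True)
        parents = population[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit flips with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: fitness(m, gains))


# Toy data (not from the paper): feature 0 separates the classes,
# feature 1 is uninformative, and feature 2 is constant.
X = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 1.0],
    [0.3, 5.0, 1.0],
    [0.4, 1.0, 1.0],
    [0.9, 5.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.1, 5.0, 1.0],
    [1.2, 1.0, 1.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

gains = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
mask = select_features(gains)
print("information gain per feature:", gains)
print("selected feature mask:", mask)
```

On this toy data only feature 0 carries information about the class, so the search should favor a mask that keeps it and discards the other two. In a real pipeline the mask would be applied to the extracted feature matrix before training the classifier, and the fitness would more plausibly combine information gain with validation accuracy.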

Keywords

