Metaheuristic Algorithms for Optimization and Feature Selection in Cloud Data Classification Using Convolutional Neural Network

Document Type : Research Paper

Authors

1 M.Tech., Assistant Professor, Department of Information Technology, Ajay Kumar Garg Engineering College, Ghaziabad, India.

2 Associate Professor, Department of Computer Science and Engineering National Institute of Technology, Agartala, India.

Abstract

Cloud Computing has drastically simplified the management of IT resources by introducing the concept of resource pooling. It has led to a tremendous improvement in infrastructure planning. The major goals of cloud computing include maximization of computing resources with minimization of cost. But the truth is that everything has a price and cloud computing is no different. With Cloud computing there comes a number of security concerns which need to be addressed. Cloud forensics plays a vital role to address the security issues related to cloud computing by identifying, collecting and studying digital evidence in cloud environment.
The aim of the research paper is to explore the concept of cloud forensic by applying optimization for feature selection before classification of data on cloud side. The data is classified as malicious and non-malicious using convolutional neural network. The proposed system makes a comparison of models with and without feature selection algorithms before applying the data to CNN. A comparison of different metaheuristics algorithms- Particle Swarm Optimization, Shuffled Frog Leap Optimization and Fire fly algorithm for feature optimization is done based on convergence rate and efficiency.

Keywords


Introduction

These days Cloud computing is the one of the most widely known and used computing paradigms used by a huge number of users all around the world. Cloud computing is one of the rising wireless storage technologies, which may suffer from attacks like other technology. An effort has made in the field of development of computing technologies by scholars and companies. There has been a vast development in the fields of cloud computing, edge computing, fog computing, mobile computing, and the internet of things (IoT) for the execution of several tasks simultaneously with low cost and time. Cloud computing offers many benefits to the users in terms of services like on demand services, elasticity. pay per use etc. but at the same time it is also accompanied with several security threats or issues (Mishra et al., 2021). From years, researchers have been working around cloud computing open security issues. The major challenge has always been the to understand the aspects that are important to design a secure system in cloud environment. Though the cloud is efficient and elastic by nature in offering services, the host system in the cloud is highly prone to intrusions. Therefore, detection of Data intrusion that is stored in cloud is an important aspect which requires attention of researchers and industry. (Dutta et al., 2018) (Singh et al., 2018).

This paper focuses on use of feature selection algorithms before classification of cloud forensic data more specifically cloud hosts as malicious and non-malicious. The host system is the major component in the cloud that stores all the information gathered from the users into virtual servers as well as perform the execution tasks for the user applications. Therefore, the host system is highly prone to intrusions (Martini & Choo, 2012). For classification we use Convolutional Neural Network. For selection of optimal features from datasets we have used particle swarm optimization (PSO), firefly optimization (FFA), shuffled frog leaping algorithm (SFLA) and then compared the performance of classification for these algorithms as well as without applying any feature selection algorithm. The work is implemented on MATLAB platform and analysis is performed on two different datasets, Microsoft Malware prediction database, Host list from Github database.

Literature Review 

This section gives the recent methods that are used for optimization and feature selection for machine learning algorithms.

(Oliveira et al., 2015) proposed that optimization is a part of Machine learning (ML). Almost all ML problems get reduced to optimization problems. This work discusses and gives a glimpse of various metaheuristic optimization techniques which are used in recent times in machine learning. The classification criteria can be used for the meta-heuristics, in terms of the features that follow in the research, memory feature, type of neighbor holding used or the number of current solutions made from one iteration to the next.

(Sun et al., 2019) provides an idea about principles and progresses of most commonly used optimization methods from the perspective of machine learning. The various approaches discussed are of great significance, which can offer guidance for developments in the area of optimization and machine learning research.

(Emary et al., 2015) in this paper uses Firefly algorithm for feature selection. The proposed methodology is tested on eighteen data sets. The results discussed in the paper show that firefly algorithm is better for feature search as compared to other methods like particle swarm optimization (PSO) and genetic algorithm (GA). 

A feature selection method on basis of Firefly algorithm has been suggested by (Selvi et al., 2017) to improve big data analysis. The suggested technique was tested on a huge twitter data set. The effectiveness of proposed system was proven.

(Abdelsalam et al., 2018) in his paper discusses a study on multi-objective (Zhang et al., 2020) particle swarm optimization (PSO) for feature selection. Two PSO based feature selection algorithms have been studied. In first algorithm, the idea of non-dominated sorting into PSO for feature selection problems has been presented. In second algorithm, the ideas of crowding, mutation and dominance to PSO to search for the Pareto front solutions has been discussed. Finally, a comparison is made for two algorithms on basis of conventional feature selection methods, a single objective feature selection method, a two-stage feature selection algorithm and three well-known evolutionary multi-objective algorithms on twelve benchmark datasets.

(Xue et al., 2012) in this paper presents a new approach for designing fuzzy rule-based classifiers. For feature selection of classifiers, a continuous shuffled frog-leaping algorithm is applied. On a set of constructed classifiers, the optimal classifier is selected in terms of the accuracy and the number of features used, using the statistical Akaike informational criterion.

Methodology

In this paper we have used Convolutional Neural Network (Kilincer et al., 2021) for the purpose of classification of data from two datasets: Microsoft malware prediction database and Host list database from Github. We have done the classification without and with applying feature selection algorithms. Feature selection algorithms applied in the work are: Particle Swarm Optimization (PSO), Firefly algorithm (FFA and Shuffled Frog Leap Optimization (SFLA). The performance of the models is evaluated in terms of F-measure, Recall, Precision, and accuracy. (Figure 1) shows the overall proposed methodology for feature selection and classification.

Figure 1. Workflow of the proposed methodology

 

Feature Selection Algorithms

Firefly algorithm for feature selection

Step1: Initializing the number of features to detect the malfunction in cloud computing.

Step2: Initializing the features performance by considering the equation . Define the coefficient  and value λ, which denotes the attraction towards best feature in equation .

Step3: Evaluate the effectiveness of the features by equation . The comparison between features is obtained from equation .

The gravitation towards best feature (λ) is given by,

. Where λ0=λ at 0,  is coefficient. The flow of feature ‘i’ towards more potency feature ’j’ is determined by . Where ‘i' is the best feature, 'a' is a difference in characteristics of various features, λ denotes attraction towards best features, ‘i’ denotes features and‘t’ denotes iterations.  are random features which ranges from -0.5 to 0.5,  is the randomization parameter. Whereas representing the characteristics difference between features. Between two features ’j’ is better than ‘i’ then feature ‘i’ is omitted.

Step 4: If i=1(for all the features) and j=1(for all the features which are going to compare) then compare both whether ‘j’ has better performance than ‘i’ then hold feature ‘j’ for next level.

Step 5: Increase the number of features for execution with the new best features.

Step 6: Eliminate the unwanted features which has unique null characteristics. In every step, evaluate the new feature with the existing best feature and maintain the resulted feature for the next step.

Step 7: Rank the best features based on the final step of execution and hold the current best features or otherwise repeat the process.

Step 8: The fitness function is updated by equation . The characteristics variation between any two features is given by,

 

The aim of this algorithm is to detect the malicious function in cloud computing by selecting the appropriate feature

The unwanted features are eliminated based on the performance of each feature in the detection of malware in cloud computing. This study offers new fitness function which is given by,

. Where , is a total number of features, is a number of features left out.

Particle Swarm Optimization

PSO in feature selection of cloud computing offers the best position to feature based on the existing flow position of the feature. Each feature has different flow characteristics thus PSO is useful to extract the best performed features among group of features. The best feature is selected in the set of features given in the database as and the range of the features is depending upon function . Where x is feature and is fitness function which is given by

                                                                                      (1)                                         

The position movement of each feature to get best feature is given by,

                                                        (2)

                                                                                                                     (3)

Where V is flow of feature, is coefficient, is current position of feature,  is pre-best position of feature, is new updated position of features, are random parameters with [0, 1] range,  are accelerating constant, Equation (2) denotes newly updated feature equation in which a previous number of features are multiplied with 'α'. If α=1, then the feature's selection is fully influenced by the previous number of features, to reduce this 'α' should be in

The first step in the process of detecting malfunction in cloud computing using particle swarm optimization algorithm, is initializing the number of features to identify the malfunction. After the first step of execution, update the flow and position of features. If the difference in position of flow of features is increased, then the value of (Obest_ab-X_ab^t) get increased. Hence this term increases the attraction towards best results which is the best feature. Remove the unwanted features which have unique characteristics in null value. If the current results are better than the previous one, then assign all the new features to detect the malware. Increase the number of iterations until getting bet feature.

Shuffled Frog Leaping Optimization

Step 1: Initializing the total number of features that are going to use in the detection of malware. Number of features is denoted in equation . The feature is represented by for d-dimensional problems. Then compute the performance value of f(i) for each J(t).

Step 2: Examine the characteristics of each feature performance by equation  and grouping all the features. The step size is denoted by equation  .

Step 3: Sort the features in descending order according to their performance. For each step of execution, compare the best feature in a group with the worst feature.

Step 4: If the results from the above process are satisfied, then renew the worst feature and shuffle all the features, then check whether the resulted feature is best, or otherwise go to the next step. 

Step 4: The sub-class is used to concentrate more on best features, and to eliminate the unwanted poor performed feature. On sub-class process, the best performed features are considered. The unwanted feature is replaced by introducing a random feature (R).

Step 5: The variation in each feature is obtained from equation . Categories the best one and worst performing feature and hold the current best feature for the next step.

Step 6: Evaluate the current best features with the group of best features, then evaluate the individual features in randomly. Shuffle all the features then validate the best features.

Step 7: If the resulted features are having best performance, then stop the process or otherwise repeat the process.

Step 8: The newly determined best features were obtained by using equation  .

Which denotes feature with best performance has the highest probability,  is considered for the subclass, worst performing has the lowest probability . In subclass features are selected randomly from a total number of frogs (n) to form a subclass array of features C and record the performance. In Z-dimensional problem, let is the best feature and  is worst feature and the global best one is  the feature selection is renewed .

                                       (4)

Where ‘int (F)’ is roundness of F. ‘r’ is random number range of (0-1) and denotes maximum step size allocation. The resulted feature is obtained by the following expression,

                                                                                                                      (5)

Examine the resulted feature , if getting a best result then replace  by . Random feature ‘R’ is generated in the feasible range to replace the features which has undesirable performance and compute f(R) set and . Whether and then increase both by one. The iteration would not end until the satisfied output was obtained. After all the searches were completed within the features, then the halt conditions were implemented by the following criteria such as; while the optimum solution is obtained or there is no improvement in consecutive step of process and predefined number of iterations were executed.

Convolutional Neural Network

We apply Convolutional Neural Network (CNN) for classifying the data (Albawi et al., 2017) before and after applying three feature selection algorithms: FFA, PSO and SFLA.

 

Figure 2. Structure of CNN

CNN is used with two convolutional and pooling layer, two fully connected layer along with that input and output layer is used as shown in (Figure 2). The activation function used is ReLU for making the negatives to nullify and at the end SoftMax function is applied which will map the values to 0 and 1. Gradient descent method is used to minimize the error.

 

Results

The CNN model has been evaluated without applying any feature selection algorithm and then with three feature selection algorithms: FFA, PSO and SFLA.

(i) Evaluations of CNN without feature selection

The CNN model has been evaluated with the datasets separately without using the feature selection algorithms. This evaluation is done to identify the effectiveness of the feature selection algorithms in malicious node detection. The results of the evaluations on different datasets are presented (Figure 3) and (Figure 4) below:

 

Figure 3. Performance of CNN for the Microsoft malware prediction database without feature selection

 

 

Figure 4. Performance of CNN for the host list database without feature selection

 

 

(ii) Evaluations of CNN with feature selection

All those extracted features are given to the CNN based classifier for the accurate prediction of malicious host on the cloud. The performance parameters like true positive rate (TPR) as well as false positive rates (FPR) are given in (Figure 5) and (Figure 6), malicious host count is shown in (Figure 7), f-measure in (Figure 8), precision in (Figure 9) and recall values are shown in (Figure 10)

 

Figure 5. false positive rate

 

 

Figure 6. true positive rate

 

 

 

Figure 7. Malicious host count for a) Microsoft malware prediction database b) Host list database from GitHub

 

 

Figure 8. F-Measure for Microsoft and Github database

 

Figure 9. Precision for Microsoft and GitHub database

 

 

Figure 10. Recall for Microsoft and GitHub database

Table 1. Accuracy of Malware prediction methods

Algorithm

Accuracy for Microsoft database

Accuracy for GitHub database

FFA

99.84

99.09

PSO

99.76

98.18

SFLA

99.64

96.67

 

(Table 1) gives the accuracy of the malware prediction system for the three various algorithms. From the tabulation, the FFA method has given the best result for the malicious detection in the cloud computing when used with the CNN classifier

Conclusion

In this paper, the performance of the various optimization algorithm for the feature selection phase of the malicious detection in cloud environment is presented. The feature selection process of the FFA, PSO and the SFLA is studied, from that it is proved that the FFA algorithm has shown better performance. Those algorithms are analyzed for malware detection by considering two sets of databases as Microsoft malware prediction and Host list from GitHub database. By having a smaller number of optimal features on both databases, the malicious behavior is well predicted by the FFA algorithm with high accuracy. Meanwhile, the number of features selected by the other two algorithm is high and have yield low accuracy in contrast to the FFA. After implementing the process on MATLAB, the accuracy of 99.84% and 99.09% is obtained for 2500 and 330 testing data from the Microsoft malware prediction and Host list from Github database respectively by the FFA method.

Conflict of interest

The authors declare no potential conflict of interest regarding the publication of this work. In addition, the ethical issues including plagiarism, informed consent, misconduct, data fabrication and, or falsification, double publication and, or submission, and redundancy have been completely witnessed by the authors.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Abdelsalam, M., Krishnan, R., Huang, Y., & Sandhu, R. (2018, July). Malware detection in cloud infrastructures using convolutional neural networks. In 2018 IEEE 11th International conference on cloud computing (CLOUD) (pp. 162-169). IEEE.
Albawi, S., Mohammed, T. A., & Al-Zawi, S. (2017, August). Understanding of a convolutional neural network. In 2017 international conference on engineering and technology (ICET) (pp. 1-6). Ieee.
Datta, S., Santra, P., Majumder, K., & De, D. (2018). An automated malicious host recognition model in cloud forensics. In Networking Communication and Data Knowledge Engineering (pp. 61-71). Springer, Singapore
Emary, E., Zawbaa, H. M., Ghany, K. K. A., Hassanien, A. E., & Parv, B. (2015, September). Firefly optimization algorithm for feature selection. In Proceedings of the 7th Balkan Conference on Informatics Conference (pp. 1-7).
Kilincer, I. F., Ertam, F., & Sengur, A. (2021). Machine learning methods for cyber security intrusion detection: Datasets and comparative study. Computer Networks188, 107840.
Martini, B., & Choo, K. K. R. (2012). An integrated conceptual digital forensic framework for cloud computing. Digital investigation9(2), 71-80.
Mishra, S., Sharma, S. K., & Alowaidi, M. A. (2021). Analysis of security issues of cloud-based web applications. Journal of Ambient Intelligence and Humanized Computing12(7), 7051-7062.
Oliveira, P., Portela, F., Santos, M. F., Abelha, A., & Machado, J. (2015). Machine Learning: an overview of optimization techniques. Recent Adv Comput Sci.
Selvi, R. S., & Valarmathi, M. L. (2017). An improved firefly heuristics for efficient feature selection and its application in big data. Biomedical Research28, S236-S241.
Singh, D. A. A. G., Priyadharshini, R., & Leavline, E. J. (2018). Cuckoo optimization-based intrusion detection system for cloud computing. International Journal of Computer Network and Information Security9(11), 42.
Sun, S., Cao, Z., Zhu, H., & Zhao, J. (2019). A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics50(8), 3668-3681.
Xue, B., Zhang, M., & Browne, W. N. (2012). Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE transactions on cybernetics43(6), 1656-1671.
Zhang, Z., Wen, J., Zhang, J., Cai, X., & Xie, L. (2020). A many objective-based feature selection model for anomaly detection in cloud environment. IEEE Access8, 60218-60231