An Archive-based Steady-State Fuzzy Differential Evolutionary Algorithm for Data Clustering (ASFDEaDC)

Document Type: Special Issue: Big Data Analytics and Management in Internet of Things.

Authors

1 Department of Computer Science, National Institute of Technology, Mizoram, Aizawl, India.

2 Department of Computer Science, Vignan’s Institution of Information, Vishakhapatnam, India.

Abstract

This paper combines fuzzy techniques with an optimization technique, differential evolution, to propose an archive-based fuzzy evolutionary algorithm for multi-objective clustering. Two quantitative cluster validity measures, the J-measure and the Xie-Beni index, are used to guide the partitioning. The proposed algorithm introduces a new strategy that concentrates the search on the feasible domain by limiting the exploration of less promising regions of the search space. The clustering method yields a set of trade-off solutions on the final Pareto-optimal front. These solutions are then merged and maintained in an archive for further evaluation; the archive is organized to retain high-quality, diverse solutions so that it represents the non-dominated set comprehensively. The efficiency of the method is demonstrated by partitioning gene expression and real-life datasets. The proposed algorithm seeks to reduce the number of function evaluations and maintains a very small working population. Its effectiveness is presented in comparison with some state-of-the-art methods.
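To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch of how a candidate set of cluster centers could be scored on two validity measures and then offered to an archive of non-dominated solutions. It is not the authors' implementation. It assumes that the J-measure is the standard fuzzy c-means objective J_m and that the Xie-Beni index follows its original definition (squared memberships). The function names (`memberships`, `objectives`, `dominates`, `update_archive`) are hypothetical, and the differential evolution step that would generate candidates is omitted.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(data: List[Point], centers: List[Point], m: float = 2.0) -> List[List[float]]:
    """Fuzzy c-means membership of every point in every cluster (fuzzifier m > 1)."""
    u = []
    for x in data:
        d = [sq_dist(x, c) for c in centers]
        if any(di == 0.0 for di in d):
            # Point coincides with one or more centers: split full membership among them.
            hits = [di == 0.0 for di in d]
            u.append([1.0 / sum(hits) if h else 0.0 for h in hits])
            continue
        inv = [di ** (-1.0 / (m - 1.0)) for di in d]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u


def objectives(data: List[Point], centers: List[Point], m: float = 2.0) -> Tuple[float, float]:
    """Return (J_m, Xie-Beni); both are minimized. Needs at least two distinct centers."""
    u = memberships(data, centers, m)
    j_m = 0.0
    xb_num = 0.0
    for x, row in zip(data, u):
        for c, u_ik in zip(centers, row):
            d = sq_dist(x, c)
            j_m += (u_ik ** m) * d      # assumed J-measure: fuzzy c-means objective
            xb_num += (u_ik ** 2) * d   # Xie-Beni numerator (compactness)
    # Xie-Beni denominator: n times the smallest squared distance between two centers.
    min_sep = min(sq_dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return j_m, xb_num / (len(data) * min_sep)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive: list, candidate: tuple) -> list:
    """Insert (objectives, solution) if it is non-dominated; drop entries it dominates."""
    obj = candidate[0]
    if any(dominates(a[0], obj) or a[0] == obj for a in archive):
        return archive
    return [a for a in archive if not dominates(obj, a[0])] + [candidate]


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
    archive: list = []
    for centers in ([(2.0, 2.0), (3.0, 3.0)], [(0.0, 0.5), (5.0, 5.5)]):
        archive = update_archive(archive, (objectives(data, centers), centers))
    print(archive)
```

In this sketch the archive only enforces non-dominance. The diversity maintenance and the search-space restriction described in the abstract would need additional logic that the abstract does not specify.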

Keywords

