Received: December 16, 2020. Revised: January 25, 2021.
International Journal of Intelligent Engineering and Systems, Vol.14, No.2, 2021. DOI: 10.22266/ijies2021.0430.38

A New Hybrid Clustering Method of Binary Differential Evolution and Marine Predators Algorithm for Multi-omics Datasets

Mohamed Ghoneimy 1*, Hesham A. Hassan 2, Emad Nabil 2,3

1 Faculty of Information Technology, MUST University, 6th of October City, Giza, Egypt
2 Faculty of Computers and Artificial Intelligence, Cairo University, Giza, Egypt
3 Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia
* Corresponding author’s Email: Mohamed.ghoneimy@must.edu.eg

Abstract: Clustering of biological datasets has revealed many significant insights in medical and biological research. It is an important step towards drug design, vaccine discovery, disease diagnosis, and more. The current trend in biological and medical research is to combine more than one dataset related to a specific problem, referred to as multi-omics, and then perform the clustering or the analysis. Multi-omics yields substantially richer insights into a particular biological problem or disease than a single dataset does: it is like investigating a problem along many dimensions rather than one. On the other hand, the difficulty of clustering increases. Another tricky problem in data clustering is determining the best number of clusters for a clustering algorithm to use. Given the success of metaheuristics in solving automatic clustering problems, we propose in this paper a new hybrid method that utilizes two powerful metaheuristic algorithms, Binary Differential Evolution and the Marine Predators Algorithm, to perform automatic clustering on multi-omics datasets. The performance of the proposed method is evaluated on eight multi-omics datasets from TCGA and compared with four recent and powerful metaheuristics.
The performance metrics used are clustering quality and execution time. The experimental results show that the proposed algorithm not only outperformed its competitors in clustering quality but also needed only a third of the execution time of its fastest competitor. Moreover, the statistical analysis shows that the obtained results are statistically significant. Accordingly, the proposed method can be considered an efficient clustering method for multi-omics datasets.

Keywords: Clustering, Automatic clustering, Differential evolution, MPA, Metaheuristics, Nature-inspired metaheuristics, Molecular-level interaction, Multi-omics.

1. Introduction

The rapid development of high-throughput methods has produced huge volumes of data of different types, such as DNA methylation, DNA genome sequence, and RNA expression; each of these types is called an omic. The analysis of multi-omics datasets is beneficial for the following reasons. First, it reduces the effect of noise on the results. Second, using omics from different molecular levels, such as genomic and epigenomic, can show different aspects of patients. Third, even using omics from the same level, such as mutation and copy number, can reveal different aspects of those omics [1]. It is therefore necessary to develop new computational methods to analyze these datasets.

Clustering is the process of discovering the natural grouping of records according to their similarities. It is a fundamental process in data analysis and is often used as its first step. Clustering is essential for medical research, where it is used to discover co-regulated genes and new groupings of patients based on genetic similarity. Many clustering methods need the number of clusters to be fixed before they start, and this number can be challenging to obtain in a multi-omics problem [2, 3]. This problem can be solved using automatic clustering, which requires specifying the minimum and the maximum number of clusters.
The two main automatic clustering tasks are determining the optimal number of clusters and determining the appropriate cluster for each object.
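
These two tasks can be illustrated with a minimal Python sketch. It is not the hybrid Binary Differential Evolution and Marine Predators method proposed in this paper. As a simple stand-in, it runs plain k-means with a deterministic farthest-point initialization for every candidate number of clusters between k_min and k_max, and keeps the candidate with the highest mean silhouette score. All function names (farthest_point_init, kmeans, silhouette, auto_cluster) and the toy data are hypothetical and chosen only for illustration.

```python
import math

def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at points[0],
    then repeatedly add the point farthest from its nearest chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        # Task 2: assign every object to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette score; higher means tighter, better separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def auto_cluster(points, k_min, k_max):
    """Task 1: scan k in [k_min, k_max] and keep the best-scoring clustering."""
    best_score, best_k, best_labels = -2.0, None, None
    for k in range(k_min, k_max + 1):
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if score > best_score:
            best_score, best_k, best_labels = score, k, labels
    return best_k, best_labels

# Toy data: three small, well-separated groups of 2-D points.
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 10), (20, 0)]
          for dx, dy in [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5)]]
best_k, labels = auto_cluster(points, k_min=2, k_max=5)
print(best_k, labels)
```

An exhaustive scan over k is only practical when the range is small and each clustering run is cheap. A metaheuristic approach instead searches over the number of clusters and the assignments together, which is the setting this paper addresses.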