Inference of Pathway Decomposition Across Multiple Species Through Gene Clustering

Dimitrios M. Vitsios
Center for Research and Technology Hellas (CERTH)
6th km. Charilaou – Thermi Road, Thermi 57001, Thessaloniki, Greece
EMBL – European Bioinformatics Institute, Wellcome Trust Genome Campus
Hinxton, Cambridge, CB10 1SD, UK
dvitsios@ebi.ac.uk

Fotis E. Psomopoulos
Center for Research and Technology Hellas (CERTH)
6th km. Charilaou – Thermi Road, Thermi 57001, Thessaloniki, Greece
fpsom@certh.gr

Pericles A. Mitkas
Department of Electrical and Computer Engineering
Aristotle University of Thessaloniki
AUTH Campus, Thessaloniki 54124, Greece
mitkas@eng.auth.gr

Christos A. Ouzounis
Center for Research and Technology Hellas (CERTH)
6th km. Charilaou – Thermi Road, Thermi 57001, Thessaloniki, Greece
ouzounis@certh.gr

Received 22 April 2013
Accepted 14 September 2014
Published 23 February 2015

International Journal on Artificial Intelligence Tools, Vol. 24, No. 1 (2015) 1540003 (27 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0218213015400035

In the wake of gene-oriented data analysis in large-scale bioinformatics studies, focus in research is currently shifting towards the analysis of the functional association of genes, namely the metabolic pathways in which genes participate. The goal of this paper is to attempt to identify the core genes in a specific pathway, based on a user-defined selection of genomes. To this end, a novel algorithm has been developed that uses data from the KEGG database, and through the application of the MCL clustering algorithm, identifies clusters that correspond to different “layers” of genes, either on a phylogenetic or a functional level. The algorithm’s complexity, evaluated experimentally, is presented and the results on three characteristic case studies are discussed.

Keywords: Bioinformatics; metabolic pathways; clustering algorithm; phylogenetic analysis.
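
The abstract names MCL (the Markov Cluster algorithm) as the clustering step. For readers unfamiliar with it, the following is a minimal pure-Python sketch of the general MCL idea: alternate expansion (squaring a column-stochastic matrix) with inflation (elementwise powering followed by column renormalisation). It is not the authors' pipeline and does not query KEGG. The toy graph, the function names, the inflation value of 2.0, the added self-loops and the convergence thresholds are all assumptions made for illustration.

```python
def normalize_columns(matrix):
    """Return a copy of a square matrix with each non-empty column scaled to sum to 1."""
    n = len(matrix)
    out = [row[:] for row in matrix]
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                out[i][j] = matrix[i][j] / total
    return out


def matmul(a, b):
    """Plain square matrix product (adequate for toy-sized graphs only)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a square adjacency matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Assumption: add self-loops before normalising, a common MCL convention.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)                                  # expansion
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded]
        )                                                        # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows that keep non-negligible mass act as attractors; the columns
    # they attract form one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)


# Hypothetical toy graph: two triangles (nodes 0-2 and 3-5) joined by edge 2-3.
EXAMPLE_ADJACENCY = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

if __name__ == "__main__":
    print(mcl(EXAMPLE_ADJACENCY))
```

In this toy setting the nodes could stand for genes and the edges for some similarity relation between them; how the paper actually derives its graph from KEGG data is not stated in the abstract. In general MCL, higher inflation values tend to produce finer-grained clusters.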