BioSystems 142 (2016) 32–42
Journal homepage: www.elsevier.com/locate/biosystems
http://dx.doi.org/10.1016/j.biosystems.2016.03.005

A new approach to the automatic identification of organism evolution using neural networks

Andrzej Kasperski a,*, Renata Kasperska b

a Faculty of Biological Sciences, Department of Biotechnology, University of Zielona Gora, ul. Szafrana 1, 65-516 Zielona Gora, Poland
b Institute of Occupational Safety Engineering and Work Science, University of Zielona Gora, ul. Szafrana 4, 65-516 Zielona Gora, Poland
* Corresponding author. E-mail address: A.Kasperski@wnb.uz.zgora.pl (A. Kasperski).

Article info

Article history:
Received 30 September 2015
Received in revised form 20 January 2016
Accepted 8 March 2016
Available online 11 March 2016

Keywords: Computational biology; Evolution; Neural network; Phylogenetics; Programming

Abstract

Automatic identification of organism evolution remains a challenging task, which is especially exciting when human evolution is considered. The main aim of this work is to present a new idea that allows organism evolution to be analyzed using neural networks. Here we show that it is possible to identify the evolution of any organisms in a fully automatic way using the purpose-built EvolutionXXI program, which contains an implemented neural network. The neural network was trained on cytochrome b sequences of selected organisms. Analyses were then carried out for various exemplary organisms to demonstrate the capabilities of the EvolutionXXI program. The presented idea is shown to support existing hypotheses concerning evolutionary relationships between selected organisms, among others Sirenia and elephants, hippopotami and whales, scorpions and spiders, and dolphins and whales. Moreover, primate (including human), tree shrew and yeast evolution has been reconstructed.

© 2016 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

Fully automated identification of organism evolution can be considered a dream for researchers and, given the complexity of the task, is sometimes treated as the stuff of science fiction (MacLeod, 2007). In the analysis of organism evolution and genetic variability, methods based on Neighbor Joining (NJ), Maximum Parsimony (MP), Maximum Likelihood (ML) and Bayesian Inference (BI), supported by, for example, the dot matrix method, are usually used (Finstermeier et al., 2013; Kasperski and Kasperska, 2012, 2014). During these analyses, the number of generated phylogenetic trees that should be considered depends substantially on the number of analyzed organisms. Theoretically, reaching a sound conclusion requires analysis of every possible tree. This task can be impossible to perform for a larger number of taxa; for example, the number of possible rooted trees for 50 taxa is bigger than the number of atoms in the universe. For this reason, reconstructing the real course of organism evolution is often impossible when trying to determine the best phylogenetic trees. This makes it necessary to seek new methods that allow more reliable determination of organism evolution and genetic variability. One computational tool that can be used to solve complex real-world problems is the artificial neural network (ANN) (Basheer and Hajmeer, 2000). Neural computation can be used in various fields owing to its nonlinearity, high parallelism, robustness, fault and failure tolerance, learning, ability to handle imprecise and fuzzy information, and capability to generalize (Jain et al., 1996).
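The growth in the number of candidate trees mentioned above can be made concrete with a short calculation. The sketch below assumes strictly bifurcating, rooted trees with labeled tips, for which the count is the double factorial (2n - 3)!!. The function name rooted_tree_count is illustrative only and is not part of the EvolutionXXI program.

```python
def rooted_tree_count(n_taxa: int) -> int:
    """Count rooted, bifurcating, labeled trees for n_taxa tips: (2n - 3)!!.

    Assumption: only fully resolved (bifurcating) trees are counted.
    """
    if n_taxa < 1:
        raise ValueError("n_taxa must be a positive integer")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for factor in range(3, 2 * n_taxa - 2, 2):
        count *= factor
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 50):
        # Scientific notation keeps the largest values readable.
        print(f"{n:>3} taxa: {rooted_tree_count(n):.3e} rooted trees")
```

Under this assumption the count for 50 taxa is roughly 2.75 × 10^76. How that figure compares with the number of atoms in the universe depends on the atom-count estimate used and on whether multifurcating trees are also counted.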
An ANN, as a programming method based on a mathematical approximation of the functioning of human brain cells, can be seen as a set of interconnected nodes implementing a mapping function from an input space to one of several output categories. Through their capacity for learning and outcome prediction, ANNs can replace traditional statistical techniques in the modeling and classification of selected problems (Ahmed, 2005; Hannachi et al., 2003). In biology, ANNs are considered to hold great promise for advancing the understanding of biological phenomena and biosystems. For example, the ability of neural networks to learn complex functions from large amounts of data without the need for predetermined models makes them a good tool for protein structure prediction. ANNs can also support the acquisition of accurate knowledge of quantitative structure-activity relationships (Jalali-Heravi et al., 2011). Moreover, neural networks can be applied to: pattern recognition of DNA and RNA, gene identification, sequence classification, analysis of electron microscopy images of biological macromolecules, prediction of microbial growth, identification of microorganisms and molecules, interpreting pyrolysis