BioSystems 142 (2016) 32–42
A new approach to the automatic identification of organism evolution using neural networks
Andrzej Kasperski a,∗, Renata Kasperska b
a Faculty of Biological Sciences, Department of Biotechnology, University of Zielona Gora, ul. Szafrana 1, 65-516 Zielona Gora, Poland
b Institute of Occupational Safety Engineering and Work Science, University of Zielona Gora, ul. Szafrana 4, 65-516 Zielona Gora, Poland
Article info
Article history:
Received 30 September 2015
Received in revised form 20 January 2016
Accepted 8 March 2016
Available online 11 March 2016
Keywords:
Computational biology
Evolution
Neural network
Phylogenetics
Programming
Abstract
Automatic identification of organism evolution remains a challenging task, and an especially
exciting one when human evolution is considered. The main aim of this work is to present a new
approach to organism evolution analysis based on neural networks. Here we show that the
evolution of any organism can be identified in a fully automatic way using the purpose-built
EvolutionXXI program, which contains an implemented neural network. The neural network was
trained on cytochrome b sequences of selected organisms. Analyses were then carried out for
various example organisms to demonstrate the capabilities of the EvolutionXXI program. The
presented approach is shown to support existing hypotheses concerning evolutionary
relationships between selected organisms, among others Sirenia and elephants, hippopotami and
whales, scorpions and spiders, and dolphins and whales. Moreover, the evolution of primates
(including humans), tree shrews and yeast has been reconstructed.
© 2016 Elsevier Ireland Ltd. All rights reserved.
1. Introduction
Fully automated identification of organism evolution can be considered a dream for
researchers and, given the complexity of the task, is sometimes treated as the stuff of
science fiction (MacLeod, 2007). In the analysis of organism evolution and genetic
variability, methods based on Neighbor Joining (NJ), Maximum Parsimony (MP), Maximum
Likelihood (ML) and Bayesian Inference (BI), supported by, for example, the dot matrix
method, are usually used (Finstermeier et al., 2013; Kasperski and Kasperska, 2012, 2014).
In these analyses, the number of generated phylogenetic trees that should be considered
depends substantially on the number of analyzed organisms. In theory, reaching a definitive
conclusion requires analysis of every possible tree. This can be impossible for a larger
number of taxa; for example, the number of possible rooted trees for 50 taxa is bigger than
the number of atoms in the universe.
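To make the growth in the number of trees concrete, the following minimal sketch counts rooted trees under the assumption that only strictly bifurcating trees with labeled tips are considered, for which the standard count is (2n − 3)!! = 1 · 3 · 5 · … · (2n − 3) for n taxa. The function name rooted_tree_count is hypothetical and is not part of the EvolutionXXI program. Under this assumption, the count for 50 taxa is about 2.75 × 10^76.

```python
def rooted_tree_count(n_taxa: int) -> int:
    """Count rooted, strictly bifurcating, labeled trees: (2n - 3)!!.

    Assumption for illustration: multifurcating trees are not counted.
    """
    if n_taxa < 1:
        raise ValueError("n_taxa must be a positive integer")
    count = 1
    # A rooted bifurcating tree with k - 1 tips offers 2k - 3 places
    # (its 2k - 4 branches plus the root) to attach the k-th tip.
    for k in range(2, n_taxa + 1):
        count *= 2 * k - 3
    return count


# Print the count for a few tree sizes to show how fast it grows.
for n in (3, 4, 5, 10, 20, 50):
    print(f"{n:>2} taxa: {rooted_tree_count(n):.3e} rooted trees")
```

Python integers have arbitrary precision, so the count is exact even for 50 taxa; only the printed value is rounded.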
For this reason, reconstructing the true course of organism evolution by searching for the
best phylogenetic trees is often impossible. This makes it necessary to seek new methods that
allow more reliable determination of organism evolution and genetic variability.

∗ Corresponding author.
E-mail address: A.Kasperski@wnb.uz.zgora.pl (A. Kasperski).

Artificial neural networks (ANNs) are one of the computational tools that can be used to
solve complex real-world problems (Basheer and Hajmeer, 2000). Neural computation
can be used in various fields owing to its nonlinearity, high parallelism, robustness, fault
and failure tolerance, ability to learn, ability to handle imprecise and fuzzy information,
and capability to generalize (Jain et al., 1996). An ANN, as a programming method based on a
mathematical approximation of the functioning of human brain cells, can be seen as a set of
interconnected nodes implementing a mapping function from an input space to one of several
output categories. Because they can learn and predict outcomes, ANNs can replace traditional
statistical techniques in the modeling and classification of selected problems (Ahmed, 2005;
Hannachi et al., 2003). In biology, ANNs are considered to hold great promise for advancing
the understanding of biological phenomena and biosystems. For example, the ability of neural
networks to learn complex functions from large amounts of data without the need for
predetermined models makes them a good tool for protein structure prediction. ANNs can also
support the acquisition of accurate knowledge of quantitative structure-activity
relationships (Jalali-Heravi et al., 2011). Moreover, neural networks can be applied to:
pattern recognition of DNA and RNA, gene identification, sequence classification, analysis of
electron microscopy images of biological macromolecules, prediction of microbial growth,
identification of microorganisms and molecules, interpreting pyrolysis
http://dx.doi.org/10.1016/j.biosystems.2016.03.005
0303-2647/© 2016 Elsevier Ireland Ltd. All rights reserved.