Volume 5, No. 5, May-June 2014
International Journal of Advanced Research in Computer Science
RESEARCH PAPER
Available Online at www.ijarcs.info
© 2010-14, IJARCS All Rights Reserved 115
ISSN No. 0976-5697
Phylogenetic Tree Construction of Biological Datasets of Cyclooxygenase (COX-1) and
(COX-2) by using Cluster Analysis Based on Experimental Values
Chukka Santhaiah
Research scholar, Dept of C.S.E
S.V.U.College of Engg, S.V.University,
Tirupati, A.P, INDIA
Dr.A.Rama Mohan Reddy
Professor, Dept of C.S.E
S.V.U.College of Engg, S.V.University,
Tirupati, A.P, INDIA
Abstract: Phylogenetic trees are used to represent the evolutionary relationships among biological species and organisms. The construction of
phylogenetic trees is based on the similarity or dissimilarity of their physical or genetic features. Conventional approaches to constructing
phylogenetic trees mainly focus on physical characteristics. The recent advancement of high-throughput technology has led to the accumulation
of enormous quantities of biological data, which in turn has changed the way biological studies are carried out in a variety of ways. This work
mainly focuses on constructing the phylogenetic tree for the cyclooxygenases COX-1 and COX-2 based on experimental values. The
phylogenetic tree is constructed by applying cluster analysis and the JavaTree approach to COX-1 and COX-2. The results show a better
evolutionary relationship among the COX biological datasets.
Keywords: Phylogenetic tree, Cyclooxygenase, Java Tree, cluster.
I. INTRODUCTION
A phylogenetic tree is a graphical representation of the
evolutionary relationships among species, and the phylogenetic
distances between the species reflect the closeness of their
evolutionary relationships. Conventional construction of
phylogenetic trees was mainly based on physical
similarities and differences. However, this approach
has changed because of the production of
enormous amounts of biological data. For example,
high-throughput sequencing technologies have generated genome
sequences for several thousand organisms. A genomic
sequence is essentially a string of four different kinds
of nucleotides (A, C, G and T), with a length from
hundreds of thousands to millions. It is widely
accepted that genomic sequences are highly
similar for evolutionarily close organisms, but not
for evolutionarily distant organisms. Therefore, genomic sequences
have been widely used for building phylogenetic trees [1-3].
Building phylogenetic trees from genomic sequences
does have a number of issues. Genomic sequences
are often long; therefore, comparing
genomic sequences across species to
build phylogenetic trees is computationally expensive.
In addition, living organisms in a small niche
frequently exchange genetic material with each other, a
process known as horizontal gene transfer, which makes it harder to
infer evolutionary relationships from genomic
sequences alone. Furthermore, current genomic sequence
similarity measures cannot fully reveal evolutionary
relationships across species. Thus, it is necessary to use
other data and methods to reveal the true relationships [4].
In parallel with high-throughput genome sequencing
technologies, COX data have also been generated in the past
decade. Here COX denotes cyclooxygenase, the enzyme whose
data are used in this biological study. The COX data
from organisms are very informative since they can reveal
internal inflammation mechanisms. Theoretically,
evolutionarily distant species should have different
inflammation activities and patterns, while closely related
species should have similar patterns. Therefore, it is
desirable to use COX data for phylogenetic exploration, or
to complement gene-based phylogenetic exploration to
some degree.
COX data have been processed, and the
corresponding experimental values have been made available to the
scientific community. We perform several
operations on the COX data using cluster analysis. The first operation filters
the data values, which helps eliminate unwanted genes
from the dataset. The values are then adjusted within the
cluster analysis by applying a log transform and by
selecting the center genes, normalize genes, center arrays and
normalize arrays options.
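
The adjustment steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the tool used in this work: it assumes the data are a matrix with one row per gene and one column per array, that the log transform is base 2, that centering subtracts the mean, and that normalizing scales each row or column to a sum of squares of 1. The function names and the range-based filter criterion are hypothetical.

```python
import math

def filter_rows(matrix, min_range):
    """Keep only genes (rows) whose max - min is at least min_range.
    The range criterion is an assumed example of a filter."""
    return [row for row in matrix if max(row) - min(row) >= min_range]

def log_transform(matrix):
    """Replace every (positive) value x by log2(x)."""
    return [[math.log2(x) for x in row] for row in matrix]

def center_rows(matrix):
    """Subtract each row's mean so that every row averages to zero."""
    out = []
    for row in matrix:
        mean = sum(row) / len(row)
        out.append([x - mean for x in row])
    return out

def normalize_rows(matrix):
    """Scale each row to a sum of squares of 1; all-zero rows are kept as-is."""
    out = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out

def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]

def adjust(matrix):
    """Log transform, then center/normalize genes, then center/normalize arrays."""
    m = log_transform(matrix)
    m = normalize_rows(center_rows(m))  # gene-wise (rows)
    # Array-wise steps reuse the row functions on the transposed matrix.
    return transpose(normalize_rows(center_rows(transpose(m))))
```

For example, `adjust(filter_rows(values, 1.0))` would first drop genes whose values barely vary and then adjust the remaining rows and columns. Because the array-wise step runs last, it can slightly disturb the gene-wise centering; whether the steps should be iterated is a design choice this sketch leaves open.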
The COX experimental values can be represented
as directed or undirected graphs. The nodes in the graphs can
either be represented as values that are linked by the
enzymes, or be represented as enzymes linked by their values.
Consequently, the information contained in the graphs
can reveal evolutionary relationships across
species [5].
II. METHODOLOGY
In this paper, we aim to reveal phylogenetic distances
across species using the experimental values in the graphs, rather than
sequence information. We use the data of COX
experimental values. In the relation network, enzymes and
genes are represented as nodes, while the substrate and
product compounds are represented as edges. The related
structural information from the graphs is used for
computing phylogenetic distances [6].
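
To make the graph representation concrete, the following Python sketch stores each network as a set of labelled edges and compares two networks. The text does not state which distance measure is used, so the Jaccard distance on edge sets shown here is an assumption chosen for illustration, and the node and compound names are hypothetical placeholders rather than values from the COX dataset.

```python
def jaccard_distance(edges_a, edges_b):
    """Return 1 - |A & B| / |A | B| for two sets of labelled edges.

    Each edge is a tuple (source_node, target_node, compound), where the
    nodes stand for enzymes or genes and the compound labels the edge.
    """
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0  # two empty networks are treated as identical
    return 1.0 - len(a & b) / len(union)

# Hypothetical networks for two species (placeholder names only).
species_x = {("enzyme_1", "enzyme_2", "compound_a"),
             ("enzyme_2", "enzyme_3", "compound_b")}
species_y = {("enzyme_1", "enzyme_2", "compound_a"),
             ("enzyme_2", "enzyme_4", "compound_c")}

distance_xy = jaccard_distance(species_x, species_y)
```

Here the two networks share one of three distinct edges, so the distance is 2/3. Pairwise distances of this kind could then be passed to a clustering step to build a tree.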
A. Description of COX data set:
Cyclooxygenase (COX) is the enzyme that catalyzes the
oxidation and subsequent reduction of arachidonic acid to
form Prostaglandin G2 (PGG2) and Prostaglandin H2 (PGH2). We
collected the processed dataset for COX-1 and COX-2
experimental values of different genes from the NCBI. The