doi: 10.1111/cea.12569 Clinical & Experimental Allergy, 45, 1259–1261
RESEARCH LETTER
© 2015 John Wiley & Sons Ltd
DAAB: a manually curated database of allergy and asthma biomarkers
G. Sircar¹,*, B. Saha¹,*, T. Jana²,*, A. Dasgupta³, S. Gupta Bhattacharya¹ and S. Saha²
¹Division of Plant Biology, Bose Institute, Kolkata, India,
²Bioinformatics Center, Bose Institute, Kolkata, India and
³Department of Medicine, BR Singh Hospital and Centre for
Medical Education and Research, Kolkata, India
Allergy and asthma have reached pandemic dimensions,
and the search for good biomarkers for these diseases is
of considerable clinical interest [1]. The absence of
accurate biomarkers for allergy has made diagnosis
and phenotyping of these diseases difficult [2].
Currently, the mainstay of allergy treatment is
symptomatic rather than targeted, relying on
anti-inflammatory and steroidal drugs, which themselves
cause adverse side effects [3]. Furthermore, certain
non-allergic disorders such as neutrophilic asthma,
chronic obstructive pulmonary disease (COPD) and skin
irritation are often misdiagnosed as allergic diseases
[4, 5]. Therefore, allergy phenotype-specific biomarkers
for accurate diagnosis, early prediction and targeted
drug therapy are of prime importance [6]. A biomarker
database is available for infectious diseases [7], but
not for allergy and asthma. We therefore saw an urgent
need to compile the published research on these diseases,
so that their potential biomarkers are available on a
single platform. The Database of Allergy and
Asthma Biomarkers (DAAB) is a manually curated
repository of biomarkers of different types of allergic
diseases. We define biomarkers as active genes/proteins
that are statistically significant in differential
expression profiling and considerably modulated in
allergy and asthma. About 2154
entries have been compiled by text mining of PubMed
abstracts followed by detailed manual curation, of
which 1022 entries are from genomics, 419 from
proteomics, 16 from epigenetics and 210 from
other low-throughput studies. DAAB contains information
on biomarker accession numbers (NCBI and UniProt),
along with experimental approaches (techniques, OMICS),
disease phenotype and tissue sample types. In addition,
it provides links to PubMed for references, to the
Gene Expression Omnibus (GEO) and Proteomics
IDEntifications (PRIDE) databases, which archive the
microarray and mass spectrometry data sets, respectively,
and to DrugBank for drug targets and monoclonal
antibodies, where available, for potential therapy or
further downstream validation. The users can query through
a user-friendly search page and browse the data
alphabetically by biomarker gene symbol. The
data can also be downloaded in flat-file format. DAAB is
freely accessible and contains 1200 unique biomarkers.
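As a rough illustration of how a downloaded flat file might be browsed
by gene-symbol initial and searched by keyword, consider the minimal
Python sketch below. It is not the DAAB implementation: the
tab-delimited layout, the column names (gene_symbol, omics, phenotype,
tissue) and the sample rows are all assumptions made for illustration,
and the symbols A_EXAMPLE1 and A_EXAMPLE2 are placeholders, not DAAB
records.

```python
import csv
import io

# Hypothetical tab-delimited excerpt. The real DAAB flat-file layout and
# column names are not described in the text; everything here is assumed.
# Rows are illustrative placeholders, not actual DAAB records.
SAMPLE = (
    "gene_symbol\tomics\tphenotype\ttissue\n"
    "Muc5ac\tGenomics\tasthma\tlung\n"
    "A_EXAMPLE1\tGenomics\tasthma\tblood\n"
    "A_EXAMPLE2\tProteomics\tallergic rhinitis\tserum\n"
)


def load_entries(text):
    """Parse tab-delimited text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def browse_by_letter(entries, letter):
    """Return sorted unique gene symbols starting with the given letter."""
    prefix = letter.upper()
    return sorted(
        {e["gene_symbol"] for e in entries
         if e["gene_symbol"].upper().startswith(prefix)}
    )


def keyword_search(entries, keyword, field="All"):
    """Case-insensitive substring search in one field, or in all fields."""
    needle = keyword.lower()
    hits = []
    for e in entries:
        values = e.values() if field == "All" else [e[field]]
        if any(needle in v.lower() for v in values):
            hits.append(e)
    return hits


entries = load_entries(SAMPLE)
print(browse_by_letter(entries, "A"))
print(len(keyword_search(entries, "muc5ac")))
```

Passing field="omics" restricts the search to a single column, loosely
mirroring the idea of filtering a keyword search by gene symbol, tissue
sample or OMICS approach.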
The data in DAAB have been organized so that
users can retrieve and analyse them in three ways,
as shown in Fig. 1. First, it allows users to browse.
The database can be browsed using an alphabetical
list of gene symbols. For example, on selecting ‘A’ (see
Fig. 2a), a list of 168 biomarkers whose gene
symbols start with ‘A’ appears as output (see
Fig. 2b). In addition, the data can be browsed by
four experimental approaches: Genomics,
Proteomics, Epigenetics and Others. For example,
clicking on ‘Genomics’ returns a list of 1252 biomarkers
obtained from various genomic platforms.
Second, it allows search
by keyword. The keyword search can be restricted to a
gene symbol, the tissue sample used in an experiment or
the OMICS approach used, by selecting from a drop-down
menu. Alternatively, any keyword can be searched
by selecting the ‘All’ option from this menu. This search
helps a user to find any previously reported experiment
related to a particular keyword, as well as its relevance
to any allergic disease. For example, searching with the
gene symbol ‘Muc5ac’ as a keyword within ‘All’ data (see
Fig. 2c) returns 10 records (see Fig. 2d), which indicates
that the ‘Muc5ac’ gene is reported ten times, in ten
different experiments cited in DAAB. Third, it provides a BLAST
option to rapidly compare the input query against the
allergy and asthma biomarkers. To perform a BLAST
analysis, users need to provide the amino acid sequence
of a protein in FASTA format as input to search within
BLASTdb, as shown in Fig. 2e, where the amino acid
sequence of human Raf-1 proto-oncogene serine/threonine
kinase is given as input. The output of the BLAST
search then appears with the best hit, the corresponding
BLAST score and an E-value describing the
significance of the hit. The BLAST output page is
Correspondence:
Swati Gupta Bhattacharya, Division of Plant Biology (Main Campus), Bose
Institute, 93/1 Acharya Prafulla Chandra Road, Kolkata-700009 West Bengal,
India.
E-mail: swati@jcbose.ac.in
and
Sudipto Saha, Bio-informatics Center, Bose Institute (Centenary Building),
P 1/12, C. I. T. Road, Scheme, VIIM, Kolkata, 700054 West Bengal, India.
E-mails: ssaha4@gmail.com, ssaha4@jcbose.ac.in
*Equal contribution.