SOFTWARE Open Access
WebGIVI: a web-based gene enrichment
analysis and visualization tool
Liang Sun1,6, Yongnan Zhu2,3, A. S. M. Ashique Mahmood4, Catalina O. Tudor4, Jia Ren5, K. Vijay-Shanker4,
Jian Chen2 and Carl J. Schmidt1*
Abstract
Background: A major challenge of high throughput transcriptome studies is presenting the data to researchers in
an interpretable format. In many cases, the outputs of such studies are gene lists which are then examined for
enriched biological concepts. One approach to help the researcher interpret large gene datasets is to associate
genes and informative terms (iTerm) that are obtained from the biomedical literature using the eGIFT text-mining
system. However, examining large lists of iTerm and gene pairs is a daunting task.
Results: We have developed WebGIVI, an interactive web-based visualization tool (http://raven.anr.udel.edu/webgivi/)
to explore gene:iTerm pairs. WebGIVI was built with the Cytoscape and Data-Driven Documents (D3) JavaScript libraries
and can be used to relate genes to iTerms and then visualize the resulting gene:iTerm pairs. WebGIVI accepts a gene list,
which it uses to retrieve the gene symbols and the corresponding iTerm list. This list can then be submitted to visualize
the gene:iTerm pairs using two distinct methods: a Concept Map or a Cytoscape Network Map. WebGIVI also supports
uploading and visualizing any two-column, tab-separated data.
Conclusions: WebGIVI provides an interactive and integrated network graph of gene and iTerms that allows filtering,
sorting, and grouping, which can aid biologists in developing hypotheses based on the input gene lists. In addition,
WebGIVI can visualize hundreds of nodes and generate a high-resolution image of the kind that most research
publications require. The source code can be freely downloaded at https://github.com/sunliang3361/WebGIVI. The WebGIVI
tutorial is available at http://raven.anr.udel.edu/webgivi/tutorial.php.
Keywords: Visualization, eGIFT, Gene iTerm, Gene enrichment, Web development
Background
High-throughput technologies provide biologists with
large lists of genes or proteins when they compare
expression data between two biological states (e.g., normal
tissue vs. cancer tissue). Mapping enriched genes to
known biological processes and pathways is a common
strategy for understanding the biology that underlies the
differences between the two states. Approaches include
GO enrichment analysis with tools such as DAVID [1, 2],
GOEAST [3] and GOrilla [4], and pathway analysis with
resources such as KEGG [5] and Reactome [6].
eGIFT
eGIFT [7] uses a text-mining method to identify informative
terms (iTerms) for individual genes. iTerms are not
limited to gene ontology (GO) terms; they also capture
more detailed biological knowledge. Consequently,
eGIFT provides a finer-grained interpretation of gene
lists than GO analysis. Currently, eGIFT reports its gene
analysis results as a ranked list of iTerms and their
associated genes in tabular format. A graphic
representation of these gene and iTerm relations would
allow biologists to better interpret their input gene lists
or gene-iTerm pair lists, because such a representation
often captures the biological concepts enriched in the input data.
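To illustrate how a tabular list of gene and iTerm pairs maps onto a graph, the following minimal Python sketch reads two-column, tab-separated text and builds the two adjacency maps of a bipartite gene-iTerm graph. It is not taken from eGIFT or WebGIVI: the function names (parse_pairs, build_bipartite, rank_iterms) and the placeholder identifiers in the usage example are assumptions made for illustration. The ordering by number of connected genes is also only a stand-in and does not reproduce eGIFT's own iTerm ranking.

```python
from collections import defaultdict


def parse_pairs(text):
    """Parse two-column, tab-separated lines into (gene, iterm) tuples.

    Blank lines and lines without a tab are skipped.
    """
    pairs = []
    for line in text.splitlines():
        if "\t" not in line:
            continue
        gene, iterm = line.split("\t", 1)
        gene, iterm = gene.strip(), iterm.strip()
        if gene and iterm:
            pairs.append((gene, iterm))
    return pairs


def build_bipartite(pairs):
    """Return (gene -> set of iTerms, iTerm -> set of genes)."""
    gene_to_iterms = defaultdict(set)
    iterm_to_genes = defaultdict(set)
    for gene, iterm in pairs:
        gene_to_iterms[gene].add(iterm)
        iterm_to_genes[iterm].add(gene)
    return gene_to_iterms, iterm_to_genes


def rank_iterms(iterm_to_genes):
    """Order iTerms by number of connected genes (descending), then by name.

    This is an illustrative ordering only, not eGIFT's ranking.
    """
    return sorted(iterm_to_genes.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    # Placeholder identifiers, not real genes or iTerms.
    example = "GENE_A\tterm_x\nGENE_B\tterm_x\nGENE_A\tterm_y\n"
    genes, iterms = build_bipartite(parse_pairs(example))
    for iterm, linked in rank_iterms(iterms):
        print(iterm, sorted(linked))
```

In this representation, an iTerm node connected to many genes from the input list is a candidate for a biological concept shared by those genes, which is the kind of pattern a graphic view makes easier to spot than a table.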
Visualization tool
An effective visualization of large data sets can provide
biologists with a means to discover hidden relationships in
complex data sets. Currently, several different visualization
* Correspondence: schmidtc@udel.edu
1Department of Animal and Food Sciences, University of Delaware, Newark, DE, USA
Full list of author information is available at the end of the article
© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Sun et al. BMC Bioinformatics (2017) 18:237
DOI 10.1186/s12859-017-1664-2