Research Article
A Comprehensive Curation Shows the Dynamic
Evolutionary Patterns of Prokaryotic CRISPRs
Guoqin Mai,1 Ruiquan Ge,1,2 Guoquan Sun,1 Qinghan Meng,1,2 and Fengfeng Zhou3,4
1 Shenzhen Institutes of Advanced Technology and Key Lab for Health Informatics, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
2 Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
3 College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China
4 Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
Correspondence should be addressed to Fengfeng Zhou; fengfengzhou@gmail.com
Received 24 January 2016; Revised 24 March 2016; Accepted 28 March 2016
Academic Editor: Hongwei Wang
Copyright © 2016 Guoqin Mai et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Motivation. Clustered regularly interspaced short palindromic repeat (CRISPR) is a genetic element with active regulation roles
for foreign invasive genes in the prokaryotic genomes and has been engineered to work with the CRISPR-associated sequence
(Cas) gene Cas9 as one of the modern genome editing technologies. Due to inconsistent definitions, the existing CRISPR detection
programs seem to have missed some weak CRISPR signals. Results. This study manually curates all the currently annotated CRISPR
elements in the prokaryotic genomes and proposes 95 updates to the annotations. A new definition is proposed to cover all
the CRISPRs. The comprehensive comparison of CRISPR numbers on the taxonomic levels of both domains and genus shows
high variations for closely related species even in the same genus. The detailed investigation of how CRISPRs are evolutionarily
manipulated in the 8 completely sequenced species in the genus Thermoanaerobacter demonstrates that transposons act as a
frequent tool for splitting long CRISPRs into shorter ones along a long evolutionary history.
1. Introduction
A CRISPR is an array of repeat copies (DR, direct repeat)
connected by fixed-length linker sequences [1]. The linker
sequences are called spacers and are usually acquired from
the genetic elements invading the host microbial cells [2].
A CRISPR may be activated by its neighboring CRISPR-associated
(Cas) genes, and the spacers will be processed into
RNA molecules. The RNA form of the spacers will repress the
activities of foreign elements with reverse-complementary
regions that reinvade the host cells [1–3]. Although CRISPRs
are only detected in microbial genomes in nature, the system
has been engineered as one of the major genome editing
technologies for both animal and plant genomes [4, 5]. It
is therefore essential to study the evolutionary dynamics of CRISPRs.
Only a few computational tools have been released to automatically
detect CRISPRs in a given genome, and they use different default
parameter settings to define a CRISPR. PILER-CR
[6] screens for a repeat array using a local genomic self-alignment
and has O(n^3) complexity in both time and memory space,
where n is the genome length. PILER-CR
requires the DR length to be between 20 and 40 bp. CRT [7]
starts by scanning for local repetitive k-mers, where a k-mer is a
nucleotide sequence of length k. Because it scans locally,
CRT runs in linear time and within linear memory
space. Its default setting for DR lengths is between 21 and
37 bp. The latest tool, CRISPRFinder [8], uses an existing tool,
Vmatch, to find the DR array in a given genome and discards
tandem repeats as false positives. CRISPRFinder assumes
a slightly wider DR length range, between 20 and 47 bp.
A comprehensive database DbCRISPR was also published to
provide the CRISPR annotations for 2,762 microbial genomes
[9].
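To make the effect of the differing defaults concrete, the following is a minimal Python sketch, not the implementation of CRT or of any other tool named above. It records the default DR length ranges quoted in this section and shows a simplified k-mer scan in the spirit of the local scanning idea. The function names `tools_accepting` and `find_repeat_seeds`, and the parameters `k`, `min_gap`, and `max_gap`, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Default DR length ranges in bp, as quoted in the text above.
DR_LENGTH_RANGES = {
    "PILER-CR": (20, 40),
    "CRT": (21, 37),
    "CRISPRFinder": (20, 47),
}


def tools_accepting(dr_length):
    """Return the tools whose default DR length range includes dr_length."""
    return [
        name
        for name, (low, high) in DR_LENGTH_RANGES.items()
        if low <= dr_length <= high
    ]


def find_repeat_seeds(genome, k=8, min_gap=20, max_gap=60):
    """Illustrative k-mer scan (an assumption, not CRT's actual algorithm).

    Index every k-mer in one pass over the genome, then keep those k-mers
    that recur at a distance between min_gap and max_gap, which is roughly
    the spacing expected for one repeat plus one spacer.
    """
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)

    seeds = {}
    for kmer, pos in positions.items():
        # Consecutive occurrences of the same k-mer at a plausible spacing.
        if any(min_gap <= b - a <= max_gap for a, b in zip(pos, pos[1:])):
            seeds[kmer] = pos
    return seeds


if __name__ == "__main__":
    # Toy sequence: a 24 bp repeat separated by two 30 bp spacers.
    dr = "ACGTACCGGTTAGGCTAACCGTTA"
    genome = dr + "A" * 30 + dr + "C" * 30 + dr
    print(find_repeat_seeds(genome)[dr[:8]])
    print(tools_accepting(45))
```

Under these quoted defaults, a repeat of 45 bp falls only within the CRISPRFinder range, so an array with such repeats could be reported by one tool and missed by the other two unless their parameters are changed.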
Due to the different default settings of existing tools for
a CRISPR structure, we hypothesize that a comprehensive
manual curation may refine the current CRISPR annotations
Hindawi Publishing Corporation
BioMed Research International
Volume 2016, Article ID 7237053, 7 pages
http://dx.doi.org/10.1155/2016/7237053