RESEARCH ARTICLE
Analysis of the nucleotide content of Escherichia coli promoter
sequences related to the alternative sigma factors
Gabriel Dall'Alba¹ | Pedro Lenz Casa¹ | Daniel Luis Notari² | Andre Gustavo Adami² | Sergio Echeverrigaray¹ | Scheila de Avila e Silva²
¹ Department of Life Sciences, Universidade de Caxias do Sul, Caxias do Sul, Rio Grande do Sul, Brazil
² Department of Exact Sciences, Universidade de Caxias do Sul, Caxias do Sul, Rio Grande do Sul, Brazil
Correspondence
Scheila de Avila e Silva. Department of Exact
Sciences, Universidade de Caxias do Sul, Rua
Francisco Getúlio Vargas, 1130, Petrópolis,
Caxias do Sul, Rio Grande do Sul, CEP
95070560, Brazil.
Email: sasilva6@ucs.br
Funding information
Universidade de Caxias do Sul (UCS)
Abstract
Promoters are DNA sequences located upstream of the transcription start site of
genes. In bacteria, the RNA polymerase enzyme requires additional subunits, called
sigma factors (σ), to begin specific gene transcription under distinct environmental
conditions. Promoter prediction still poses many challenges due to the
characteristics of these sequences. In this paper, the nucleotide content of
Escherichia coli promoter sequences, related to five alternative σ factors, was
analyzed by a machine learning technique in order to provide profiles according to
the σ factor that recognizes them. For this, a clustering technique was applied,
since it is a viable method for finding hidden patterns in a data set. As a result,
20 groups of sequences were formed, and, aided by the Weblogo tool, it was possible
to determine sequence profiles. The patterns found should be considered when
implementing computational prediction tools. In addition, evidence was found of an
overlap between the functions of the genes regulated by different σ factors,
suggesting that DNA structural properties are also essential parameters for further
studies.
KEYWORDS
bacterial transcription, bioinformatics, clustering technique, promoters, sigma factor
1 | INTRODUCTION
In prokaryotes, the specificity of gene expression is regulated by
protein subunits known as sigma factors (σ). They are responsible
for guiding the catalytic core RNA polymerase (RNAP) to specific promoter
sequences located upstream of the transcription start site (TSS) of a
gene. The constant exchange of the σ factors that bind to the RNAP
results in the transcription of different groups of genes, each one
with different expression patterns. Bacteria such as Escherichia coli
and related Gamma-proteobacteria maintain constant expression of
genes recognized by the major σ factor, also known as σ⁷⁰ or the
“housekeeping” factor. Additionally, there are alternative σ factors
associated with specific and programmed responses. Each
σ-dependent promoter sequence presents different conserved
motifs, which distinguish the promoters from one another.¹ In total, six
distinct alternative σ factors are known in E. coli, namely σ¹⁹, σ²⁴, σ²⁸,
σ³², σ³⁸, and σ⁵⁴.
Briefly, σ²⁴ and σ³² (encoded by rpoE and rpoH, respectively) are
known for regulating heat shock response genes. A sudden heat
shock, if not rapidly countered, results in protein unfolding which,
in turn, may lead to the cell's death.² Therefore, a rapid mobilization
of the σ factors that lead to the production of heat shock proteins is
required in order to tackle such rapid environmental changes. The σ²⁸
(product of the fliA gene) mainly regulates genes related to flagellar
synthesis and cell motility. Additionally, pathogenicity and virulence
can also be linked to this σ factor.³,⁴
The σ³⁸ is a product of the rpoS gene and regulates genes that
participate in the general stress response of bacteria. Landini et al
(2014)⁵ point out that σ³⁸ is not particularly essential for growth
(in either presence or absence) but is responsible for very sensible
Received: 8 August 2018 Revised: 23 October 2018 Accepted: 24 October 2018
DOI: 10.1002/jmr.2770
J Mol Recognit. 2018;e2770.
https://doi.org/10.1002/jmr.2770
© 2018 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/jmr 1 of 7