Commentary
Classification Problems of Repetitive DNA Sequences
Eva Šatović-Vukšić * and Miroslav Plohl
Citation: Šatović-Vukšić, E.; Plohl, M. Classification Problems of Repetitive DNA Sequences. DNA 2021, 1, 84–90. https://doi.org/10.3390/dna1020009
Academic Editor: Darren Griffin
Received: 10 August 2021
Accepted: 11 October 2021
Published: 2 November 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Division of Molecular Biology, Ruđer Bošković Institute, 10000 Zagreb, Croatia; plohl@irb.hr
* Correspondence: esatovic@irb.hr
Abstract: Repetitive DNA sequences, namely satellite DNAs (satDNAs) and transposable elements (TEs),
are essential components of the genome landscape, with many different roles in genome function
and evolution. Despite significant advances in sequencing technologies and bioinformatics tools,
detection and classification of repetitive sequences can still be an obstacle to the analysis of genomic
repeats. Here, we summarize how specificities in repetitive DNA organizational patterns can lead
to an inability to classify (and study) a significant fraction of bivalve mollusk repetitive sequences.
We suggest that the main reasons for this inability are: the predominant association of satDNA
arrays with Helitron/Helentron TEs; the existence of many complex loci; and the unusual, highly
scattered organization of short satDNA arrays or single monomers across the whole genome. The
specificities of bivalve genomes confirm the need for introducing diverse organisms as models in
order to understand all aspects of repetitive DNA biology. It is expected that further development
of sequencing techniques and synergy among different bioinformatics tools and databases will
enable quick and unambiguous characterization and classification of repetitive DNA sequences in
assembled genomes.
Keywords: repetitive DNA classification; satellite DNA; transposable element; Helitron/Helentron;
bivalves; genome assemblies
1. Introduction
Despite the exponentially growing number of genome sequencing projects spanning
all taxa, genomic regions largely composed of repetitive DNA sequences still present
substantial technical issues in the assembly of genomes [1]. Repetitive DNAs consist mainly
of satellite DNAs (satDNAs), formed by sequences repeated in tandem, and
of mobile elements, interspersed throughout the genome [2]. According to the established
classical view, satDNAs are associated with constitutive heterochromatin, which
is commonly located at pericentromeric and subtelomeric chromosomal domains and at
interstitial loci of the chromosomal arms. They build long arrays of monomers repeated in
tandem, composed of hundreds to thousands of highly similar repeat units [3]. However,
more recent work has shown that satDNA sequences can also
be located outside of heterochromatin, where they can be found in different organizational
forms: for example, as monomers or monomer fragments, in arrays of diverse length, or incorporated
into mobile elements [4–10]. In addition, multiple lines of evidence show that satDNAs
and mobile elements are often tightly interconnected. For example, tandem repeats can
be created from mobile elements or their segments, or satDNAs can expand from short
internal arrays carried by mobile elements (reviewed in [11]).
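The organizational forms described above (long tandem arrays versus isolated monomers) can be illustrated with a minimal Python sketch. It is not a method from this article or from any repeat-detection tool: the function name `tandem_runs`, the toy monomer, and the toy sequence are hypothetical, and the sketch assumes exact monomer copies, whereas real satDNA monomers diverge in sequence and require similarity-based detection.

```python
import re


def tandem_runs(sequence, monomer):
    """Return (start, copies) for each maximal run of exact tandem monomer copies.

    Assumption: copies are identical to the monomer; diverged or truncated
    monomers, which are common in real genomes, are not detected.
    """
    pattern = re.compile(f"(?:{re.escape(monomer)})+")
    return [
        (match.start(), (match.end() - match.start()) // len(monomer))
        for match in pattern.finditer(sequence)
    ]


# Hypothetical toy data: a three-copy array followed by a single scattered monomer.
monomer = "ACGTT"
sequence = "GG" + monomer * 3 + "CCATG" + monomer + "TT"

runs = tandem_runs(sequence, monomer)
for start, copies in runs:
    form = "tandem array" if copies > 1 else "single monomer"
    print(f"position {start}: {copies} copy/copies ({form})")
```

In this toy case, a run with more than one copy stands in for a tandem array and a run of one copy for a scattered single monomer; the cutoff between the two is an illustrative choice, not a biological definition.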
Sequencing problems arise in attempts to reconstruct repetitive genomic segments,
and, consequently, these regions are still regularly omitted or misassembled in the
available genomic data [12]. Ongoing improvements in sequencing technologies (e.g.,
long-read PacBio and Nanopore sequencing) are opening the possibility of obtaining insights
into these missing fractions of assembled genomes [13]. At the same time, a number of
programs and software tools aimed at advancing repeat detection and characterization are being
developed and/or upgraded (reviewed in [14]), substantially changing our knowledge on