cells
Review
Strategies to Increase Prediction Accuracy in Genomic Selection
of Complex Traits in Alfalfa (Medicago sativa L.)
Cesar A. Medina 1, Harpreet Kaur 2, Ian Ray 2 and Long-Xi Yu 1,*
Citation: Medina, C.A.; Kaur, H.; Ray, I.; Yu, L.-X. Strategies to Increase
Prediction Accuracy in Genomic Selection of Complex Traits in Alfalfa
(Medicago sativa L.). Cells 2021, 10, 3372. https://doi.org/10.3390/cells10123372
Academic Editors: Francesco Mercati
and Francesco Carimi
Received: 19 October 2021
Accepted: 24 November 2021
Published: 30 November 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and
conditions of the Creative Commons Attribution (CC BY) license
(https://creativecommons.org/licenses/by/4.0/).
1 United States Department of Agriculture-Agricultural Research Service, Plant Germplasm Introduction and
Testing Research, Prosser, WA 99350, USA; cesar.medinaculma@wsu.edu
2 Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM 88003, USA;
harpr123@nmsu.edu (H.K.); iaray@nmsu.edu (I.R.)
* Correspondence: longxi.yu@usda.gov
Abstract: Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex
and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an
alternative approach in which genome-wide markers are used to determine the genomic estimated
breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous
studies reported low to moderate prediction accuracies (<70%) for complex
traits, such as yield and abiotic stress resistance. Prediction accuracy needs to increase
before GS can be employed in breeding programs. In this paper we reviewed different statistical models
and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical
data on alfalfa yield under salt stress to investigate approaches that use DNA marker
importance values derived from machine learning models, and marker-trait association scores
from genome-wide association studies (GWAS) based on different GWASpoly models, in weighted GBLUP
analyses. This approach increased prediction accuracies from 50% to more than 80% for alfalfa
yield under salt stress. Finally, we extended the weighted GBLUP approach to potato, analyzed
13 phenotypic traits, and obtained similar results. This is the first report on alfalfa to use variable
importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping
to select superior alfalfa lines based on their GEBVs.
Keywords: genomic selection; WGBLUP; Medicago sativa
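The weighted GBLUP idea summarized in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names `weighted_grm` and `gblup`, the VanRaden-style scaling by ploidy, and the fixed heritability `h2` are assumptions made for illustration. Marker weights could come from machine learning importance values or GWAS scores, as the abstract describes; here they are simply passed in as a vector.

```python
import numpy as np

def weighted_grm(M, w, ploidy=4):
    """Weighted additive relationship matrix from allele dosages.

    M: (n, m) array of dosages in 0..ploidy (assumed coding).
    w: (m,) non-negative marker weights (hypothetical input, e.g.
       importance values or GWAS scores).
    """
    M = np.asarray(M, dtype=float)
    w = np.asarray(w, dtype=float)
    p = M.mean(axis=0) / ploidy            # allele frequency per marker
    Z = M - ploidy * p                     # center dosages by marker
    w = w * len(w) / w.sum()               # rescale weights to mean 1
    scale = ploidy * np.sum(w * p * (1.0 - p))
    return (Z * w) @ Z.T / scale           # Z diag(w) Z' / scale

def gblup(y, G, train, h2=0.5):
    """GEBVs for all individuals from phenotypes of the training set.

    h2 is an assumed heritability; in practice variance components
    would be estimated (e.g. by REML) rather than fixed.
    """
    y = np.asarray(y, dtype=float)
    lam = (1.0 - h2) / h2                  # residual / genetic variance
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[:, train] @ alpha
```

With equal weights, `weighted_grm` reduces to an unweighted relationship matrix, so ordinary GBLUP is the special case in which no marker is favored over another.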
1. Introduction
Alfalfa (Medicago sativa L.) is an autotetraploid (2n = 4x = 32) perennial forage crop
with a genome size of 800–1000 Mb [1]. Alfalfa breeding is complicated by its
high heterozygosity, polysomic inheritance, and out-crossing nature, which hinder the
creation of inbred lines. Alfalfa breeding goals target improvement of forage yield, quality,
and tolerance to biotic and abiotic stresses. This process requires the selection of perennial
plants that can maintain biomass productivity and quality over several years. Therefore,
traits must be evaluated over multiple harvests each year for several years. Consequently,
genetic gain is slower than in annual crops. In addition, alfalfa breeding programs
have largely focused on recurrent phenotypic selection (PS) in field environments to
improve quantitative traits of interest. However, this approach is constrained by breeding
population size, genotype × environment interactions, or low heritability of the trait, thus
hindering the development of superior varieties.
One promising alternative to recurrent PS is indirect selection based on the use of
molecular markers generated, for example, via genotyping by sequencing (GBS) [2]. Markers
closely linked to quantitative trait loci (QTL) can then be used for marker-assisted
selection (MAS) in breeding programs. Initially, QTLs are detected through genetic mapping
or genome-wide association studies (GWAS), where marker-trait associations that
exceed specific thresholds are declared statistically significant (Figure 1a). However, MAS