Cells

Review

Strategies to Increase Prediction Accuracy in Genomic Selection of Complex Traits in Alfalfa (Medicago sativa L.)

Cesar A. Medina 1, Harpreet Kaur 2, Ian Ray 2 and Long-Xi Yu 1,*

1 United States Department of Agriculture-Agricultural Research Service, Plant Germplasm Introduction and Testing Research, Prosser, WA 99350, USA; cesar.medinaculma@wsu.edu
2 Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM 88003, USA; harpr123@nmsu.edu (H.K.); iaray@nmsu.edu (I.R.)
* Correspondence: longxi.yu@usda.gov

Citation: Medina, C.A.; Kaur, H.; Ray, I.; Yu, L.-X. Strategies to Increase Prediction Accuracy in Genomic Selection of Complex Traits in Alfalfa (Medicago sativa L.). Cells 2021, 10, 3372. https://doi.org/10.3390/cells10123372

Academic Editors: Francesco Mercati and Francesco Carimi

Received: 19 October 2021
Accepted: 24 November 2021
Published: 30 November 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract: Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous studies obtained low to moderate prediction accuracies (<70%) for complex traits such as yield and abiotic stress resistance. Prediction accuracy needs to increase before GS can be employed in breeding programs.
In this paper, we review different statistical models and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical data on alfalfa yield under salt stress to investigate weighted GBLUP analyses in which markers are weighted either by DNA marker importance values derived from machine learning models or by marker-trait association scores from genome-wide association studies (GWAS) based on different GWASpoly models. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. Finally, we extended the weighted GBLUP approach to potato, analyzed 13 phenotypic traits, and obtained similar results. This is the first report on alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.

Keywords: genomic selection; WGBLUP; Medicago sativa

1. Introduction

Alfalfa (Medicago sativa L.) is an autotetraploid (2n = 4x = 32) perennial forage crop with a genome size of 800–1000 Mb [1]. Alfalfa breeding is complicated by the crop's high heterozygosity, polysomic inheritance, and out-crossing nature, which hinder the creation of inbred lines. Alfalfa breeding goals target improvement of forage yield, quality, and tolerance to biotic and abiotic stresses. This process requires the selection of perennial plants that can maintain biomass productivity and quality over several years. Therefore, traits must be evaluated over multiple harvests each year for several years. Consequently, genetic gain is slower than in annual crops. In addition, alfalfa breeding programs have largely focused on recurrent phenotypic selection (PS) in field environments to improve quantitative traits of interest. However, this approach is constrained by breeding population size, genotype × environment interactions, and low heritability of the trait, all of which hinder the development of superior varieties.
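The link between long evaluation cycles and slow genetic gain can be illustrated with the standard breeder's equation, in which expected gain per year is the product of selection intensity, selection accuracy, and additive genetic standard deviation, divided by the length of the breeding cycle. The following is a minimal sketch only; the function name and all numeric values are hypothetical illustrations and are not taken from this review or from alfalfa trial data.

```python
def genetic_gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year from the breeder's equation.

    intensity   -- standardized selection intensity (i)
    accuracy    -- correlation between selection criterion and true breeding value (r)
    sigma_a     -- additive genetic standard deviation of the trait
    cycle_years -- length of one breeding cycle in years (L)
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * sigma_a / cycle_years


# Hypothetical values chosen only to show the effect of cycle length.
long_cycle = genetic_gain_per_year(intensity=1.4, accuracy=0.7, sigma_a=1.0, cycle_years=5)
short_cycle = genetic_gain_per_year(intensity=1.4, accuracy=0.5, sigma_a=1.0, cycle_years=1)

print(f"gain per year, 5-year cycle: {long_cycle:.3f}")
print(f"gain per year, 1-year cycle: {short_cycle:.3f}")
```

Under these assumed numbers, the shorter cycle yields more gain per year even with a lower selection accuracy, which is why both cycle length and accuracy matter when comparing selection strategies.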
One promising alternative to recurrent PS is indirect selection based on the use of molecular markers generated, for example, via genotyping by sequencing (GBS) [2]. Markers closely linked to quantitative trait loci (QTL) can then be used for marker-assisted selection (MAS) in breeding programs. Initially, QTLs are detected through genetic mapping or genome-wide association studies (GWAS), where marker-trait associations that exceed specific thresholds are declared statistically significant (Figure 1a). However, MAS