Regularized quantile regression applied to genome-enabled prediction of quantitative traits.
Genomic selection (GS) is a variant of marker-assisted selection in which genetic markers covering the whole genome are used to predict the genetic merit of individuals for breeding. GS increases the accuracy of breeding value (BV) prediction. Although a variety of statistical models have been proposed to estimate BV in GS, few have addressed the statistical challenges posed by non-normal phenotypic distributions, e.g., skewed distributions. Traditional GS models estimate changes in the mean of the phenotype distribution, i.e., the regression function is defined for the expected value of the trait conditional on the markers, E(Y|X). We proposed an approach based on regularized quantile regression (RQR) for GS to improve the estimation of marker effects and of the resulting genomic estimated BV (GEBV). The RQR model is based on conditional quantiles, Q(Y|X), which makes it possible to fit models to any portion of the trait's probability distribution. RQR can therefore select the quantile function that "best" represents the relationship between the dependent and independent variables. Data were simulated for 1000 individuals. The simulated genome comprised 1500 markers, most with small effects and only a few with sizable effects. We evaluated three scenarios in which the phenotype distribution was symmetrical, positively skewed, or negatively skewed. Analyses were performed using Bayesian LASSO (BLASSO) and RQR at three quantiles (0.25, 0.50, and 0.75). RQR was effective for estimating GEBV: in every scenario evaluated, at least one of the fitted quantile models outperformed BLASSO. The gains relative to BLASSO were 86.28% and 55.70% for the positively and negatively skewed distributions, respectively.
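To make the contrast between E(Y|X) and Q(Y|X) concrete, the following is a minimal sketch of the objective that a regularized quantile regression could minimize. It is illustrative only and is not the implementation used in the study. The check (pinball) loss is the standard loss of quantile regression; the L1 (LASSO-type) penalty is an assumption, since the abstract does not state the penalty form, and the names `check_loss`, `rqr_objective`, `best_constant`, and `lam` are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss of quantile regression for a single residual."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def rqr_objective(y, X, beta, intercept, tau, lam):
    """Check-loss fit plus an L1 penalty on marker effects (assumed penalty form)."""
    fit = 0.0
    for y_i, x_i in zip(y, X):
        pred = intercept + sum(b * x for b, x in zip(beta, x_i))
        fit += check_loss(y_i - pred, tau)
    # The intercept is left unpenalized; only marker effects are shrunk.
    penalty = lam * sum(abs(b) for b in beta)
    return fit + penalty


def best_constant(y, tau):
    """Observed value that minimizes the total check loss of a marker-free model."""
    return min(sorted(y), key=lambda c: sum(check_loss(v - c, tau) for v in y))


# Toy, positively skewed phenotype (made-up numbers, not data from the study).
skewed = [1.0, 2.0, 3.0, 4.0, 50.0]
quantile_fits = {tau: best_constant(skewed, tau) for tau in (0.25, 0.50, 0.75)}
# quantile_fits == {0.25: 2.0, 0.5: 3.0, 0.75: 4.0}
```

In this toy sample the mean is 12.0, pulled upward by the single extreme value, whereas the three quantile fits stay within the bulk of the observations. This is the intuition for why a quantile-based model may describe a skewed trait better than a model for the conditional mean.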