Using artificial neural networks to select upright cowpea (Vigna unguiculata) genotypes with high productivity and phenotypic stability
Abstract
Cowpea (Vigna unguiculata) is grown in three Brazilian regions: the Midwest, North, and Northeast, and is consumed by people on low incomes. It is important to investigate the genotype x environment (GE) interaction to provide accurate recommendations for farmers. The aim of this study was to identify cowpea genotypes with high adaptability and phenotypic stability for growing in the Brazilian Cerrado, and to compare the use of artificial neural networks with the Eberhart and Russell (1966) method. Six trials with upright cowpea genotypes were conducted in 2005 and 2006 in the States of Mato Grosso do Sul and Mato Grosso. The data were subjected to adaptability and stability analysis by the Eberhart and Russell (1966) method and artificial neural networks. The genotypes MNC99-537F-4 and EVX91-2E-2 provided grain yields above the overall environment means, and exhibited high stability according to both methods. Genotype IT93K-93-10 was the most suitable for unfavorable environments. There was a high correlation between the results of both methods in terms of classifying the genotypes by their adaptability and stability. Therefore, this new approach would be effective in quantifying the GE interaction in upright cowpea breeding programs.
INTRODUCTION
Cowpea [Vigna unguiculata (L.) Walp.] is one of the most important and strategic food sources in tropical and subtropical regions of the world (Torres et al., 2015a). Brazil is the world's third-largest producer of this crop, which is grown in the Midwest, North, and Northeast and is consumed by people on low incomes (Oliveira et al., 2013). However, Almeida et al. (2012) reported that a supply deficit often occurs in these regions, because the average Brazilian yield is extremely low (300 kg/ha). One way of increasing yield is to identify high-yielding genotypes that are suited to Brazilian soil and climatic conditions (Santos et al., 2014a).
Crop production depends on genetic and environmental factors, in addition to interactions between them, which when significant, result in differential genotype behavior in different environmental conditions (Cruz et al., 2012). Therefore, when quantifying the magnitude of the genotype x environment interaction (GE), we should identify stable genotypes with wide adaptation capacities that can be grown in a range of environments, i.e., genotypes adapted to unfavorable environments that are suitable for small farmers using low-tech equipment, and genotypes responsive to improved environments that are suitable for high-tech equipment.
Previous studies have attempted to select cowpea genotypes with both a wide adaptability and a high phenotypic stability in different Brazilian regions (Santos et al., 2014a,b). Several statistical methods have been used, including additive main effect and multiplicative interaction (Santos et al., 2015), a Bayesian approach (Teodoro et al., 2015a,b; Barroso et al., 2016), restricted maximum likelihood/best linear unbiased prediction (Torres et al., 2015b, 2016), and the Eberhart and Russell (1966) method, which is based on linear regression (Almeida et al., 2012; Barros et al., 2013; Nunes et al., 2014). These studies have assisted in the introduction and improvement of cowpea cultivars in several tropical regions, such as the Brazilian Cerrado (Teodoro et al., 2015a,b).
The Eberhart and Russell (1966) method is widely used in genetic assessments of stability and adaptability because it is easy to apply and its results are easy to interpret. However, when the number of environments assessed in a breeding program is low (usually fewer than six), the method is inconsistent, because it can result in a failure to reject the null hypothesis. To solve this problem, Nascimento et al. (2013) used artificial neural networks (ANNs) in combination with the Eberhart and Russell (1966) method to classify alfalfa genotypes. Following this approach, we simulated genotypes belonging to the phenotypic adaptability and stability classes defined by Eberhart and Russell (1966), which were subsequently used in the training and validation of ANNs.
ANNs are computational models that simulate a neural network and can quickly process large amounts of data and recognize patterns through self-learning (Haykin, 2009). After training the ANNs, we evaluated the genotypes for phenotypic stability and adaptability. This assessment was based not only on the genotypes studied but also on a large collection of genotypes simulated according to predefined classes (Nascimento et al., 2013). The aims of this study were to identify cowpea genotypes with high phenotypic adaptability and stability for growing in the Brazilian Cerrado and to compare the use of ANNs with the Eberhart and Russell (1966) method.
MATERIAL AND METHODS
Six trials were conducted in 2005 and 2006 in the municipalities of Aquidauana, Chapadão do Sul, and Dourados in the State of Mato Grosso do Sul and the municipality of Primavera do Leste, Mato Grosso (Table 1). The experiment had a randomized block design with 20 treatments (genotypes) and four replicates. The experimental unit consisted of four 5.0-m long rows that were spaced 0.5 m apart, with 0.25 m between plants within each row. In each experimental unit, grain yield was evaluated in the two central rows, corrected to 13% moisture, and extrapolated to kg/ha.

Table 1. Environment (E), agricultural year (AY), site, latitude, longitude, altitude, Köppen’s classification, and sowing date of cowpea (Vigna unguiculata) genotypes in the State of Mato Grosso do Sul, Brazil.

| E | AY | Site | Latitude | Longitude | Altitude | Köppen’s classification | Sowing date |
|---|---|---|---|---|---|---|---|
| 1 | 2005 | Aquidauana | 22°01'S | 54°05'W | 430 m | Aw | March 21, 2005 |
| 2 | 2005 | Chapadão do Sul | 18°05'S | 52°04'W | 790 m | Aw | March 14, 2005 |
| 3 | 2005 | Dourados | 20°03'S | 55°05'W | 147 m | Cwa | April 7, 2005 |
| 4 | 2006 | Aquidauana | 22°01'S | 54°05'W | 430 m | Aw | March 2, 2006 |
| 5 | 2006 | Dourados | 20°03'S | 55°05'W | 147 m | Cwa | February 27, 2006 |
| 6 | 2006 | Primavera do Leste | 15°33'S | 54°17'W | 636 m | Aw | March 15, 2006 |
The data were subjected to individual analyses of variance (ANOVAs) for each environment, with the genotype effect fixed and the other effects random (Cruz et al., 2012), according to the following model:
(Equation 1)
where Yij is the value of the ith genotype in the jth block (i = 1, ..., g and j = 1, ..., b, with g and b being the number of genotypes and blocks, respectively); µ is the overall mean; Bj is the effect of the jth block; Gi is the effect of the ith genotype; and εij is the random error. A joint analysis of the trials was then performed, with the genotype effect fixed and the other effects random, according to the following model:
(Equation 2)
where Yijk is the value of the ith genotype in the jth block in the kth environment (k = 1, ..., e, with e being the number of environments); µ is the overall mean; Bj(k) is the effect of the jth block in the kth environment; Gi is the effect of the ith genotype; GE(ik) is the effect of the GE interaction; and εijk is the random error. Subsequently, the data were subjected to adaptability and stability analysis by the Eberhart and Russell (1966) method and ANNs (Nascimento et al., 2013).
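For reference, the two ANOVA models can be written out from the symbol definitions above. The forms below are a reconstruction from those definitions rather than a transcription. In the joint model, Y and ε carry all three subscripts, and the environment main effect E_k is an assumption, added so that the model matches the sources of variation listed in the joint analysis.

```latex
% Individual analysis, one model per environment
Y_{ij} = \mu + B_j + G_i + \varepsilon_{ij}

% Joint analysis across environments (E_k is an assumed term, see text)
Y_{ijk} = \mu + B_{j(k)} + G_i + E_k + GE_{ik} + \varepsilon_{ijk}
```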
The method proposed by Eberhart and Russell (1966) is based on linear regression analysis, which measures the response of each genotype to environmental variation. For an experiment with g genotypes, e environments, and r replicates, we define the following statistical model:
(Equation 3)
where Yij is the mean of genotype i in environment j; β0i is the intercept of the ith genotype; β1i is the regression coefficient that measures the response of the ith genotype to variation in environment j; and Ij is the environmental index, defined by the following equation:
(Equation 4)
and Ψij are random errors, each of which can be decomposed as follows:
(Equation 5)
where
(Equation 6)
and:
(Equation 7)
where MSDi is the mean square of deviations of genotype i and MSR is the residual mean square. The hypotheses of interest included H0: β1i = 1 versus H1: β1i ≠ 1.
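The regression quantities described above can be illustrated with a minimal sketch. It assumes the usual Eberhart and Russell formulation: the environmental index is the environment mean minus the grand mean, β1i is the least-squares slope of genotype means on that index, and the deviation variance is (MSDi − MSR)/r, with MSDi scaled to plot level by multiplying by r. The function name, the toy yield matrix, and the residual mean square below are hypothetical and are not data from this study.

```python
def eberhart_russell(means, ms_residual, reps):
    """Regression parameters per genotype; means[i][j] is genotype i in environment j."""
    g, e = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (g * e)
    # Environmental index: environment mean minus grand mean (sums to zero).
    index = [sum(means[i][j] for i in range(g)) / g - grand for j in range(e)]
    ss_index = sum(v * v for v in index)
    results = []
    for row in means:
        b0 = sum(row) / e  # intercept equals the genotype mean because the index is centred
        b1 = sum(y * v for y, v in zip(row, index)) / ss_index
        # Squared deviations from the fitted line, on e - 2 degrees of freedom.
        ss_dev = sum((y - b0 - b1 * v) ** 2 for y, v in zip(row, index))
        msd = reps * ss_dev / (e - 2)
        results.append({"b0": b0, "b1": b1, "var_dev": (msd - ms_residual) / reps})
    return index, results


# Hypothetical genotype means (kg/ha) in four environments, for illustration only.
toy_means = [
    [1200.0, 900.0, 950.0, 250.0],
    [1100.0, 950.0, 900.0, 300.0],
    [1000.0, 800.0, 1000.0, 200.0],
]
index, fits = eberhart_russell(toy_means, ms_residual=30000.0, reps=4)
for name, fit in zip(["G1", "G2", "G3"], fits):
    print(name, round(fit["b0"], 2), round(fit["b1"], 3), round(fit["var_dev"], 1))
```

Under this formulation, a slope near 1 indicates general adaptability, a slope above 1 indicates responsiveness to favorable environments, a slope below 1 indicates adaptation to unfavorable environments, and a deviation variance that does not differ from zero indicates high predictability.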
To evaluate the adaptability and stability of genotypes using ANNs, two datasets are required: a training set and a testing set. To obtain these sets according to the classes defined, 1500 genotypes were simulated according to statistical model 1 and evaluated in seven environments. The parameter values used to obtain the genotypes of classes 1, 2, and 3, each consisting of 500 genotypes, are given in Table 2.

Table 2. Genotype classes according to the Eberhart and Russell (1966) method and their respective parametric values according to Nascimento et al. (2013).

| Class | Practical classification | Parametric value |
|---|---|---|
| 1 | General adaptability and low predictability | |
| 2 | Specific adaptability to favorable environments and low predictability | |
| 3 | Specific adaptability to unfavorable environments and low predictability | |
| 4 | General adaptability and high predictability | |
| 5 | Specific adaptability to favorable environments and high predictability | |
| 6 | Specific adaptability to unfavorable environments and high predictability | |
The ANNs used in this study, back-propagation networks with a hidden layer, are described by Nascimento et al. (2013). After the ANNs had been trained and tested, with a maximum error of 2% for the testing set, the cowpea dataset was submitted to the ANNs for classification based on adaptability and stability; for comparison, the same classification was performed by the Eberhart and Russell (1966) method. The ANNs were implemented in R (R Development Core Team, 2011), and the Genes software (Cruz, 2013) was used for the Eberhart and Russell (1966) method.
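The simulate, train, and test workflow can be sketched as follows. This is a minimal illustration in Python, not the R implementation used in the study. The environmental indices, slopes, noise level, network size, and learning rate are all hypothetical choices, and only three adaptability classes are simulated. Inputs are centred and scaled yields across seven environments, which is also an assumption of this sketch.

```python
import math
import random

random.seed(42)

# Hypothetical settings, not the parametric values used in the study.
ENV_INDEX = [-450.0, -300.0, -150.0, 0.0, 150.0, 300.0, 450.0]  # seven environments
SLOPES = [1.0, 1.5, 0.5]  # class 0: general, 1: favorable, 2: unfavorable


def simulate(label, noise_sd=40.0):
    """One simulated genotype: centred, scaled yields across environments plus its class."""
    b0 = random.uniform(500.0, 900.0)
    y = [b0 + SLOPES[label] * idx + random.gauss(0.0, noise_sd) for idx in ENV_INDEX]
    mean = sum(y) / len(y)
    return [(v - mean) / 450.0 for v in y], label


def forward(x, net):
    """Return hidden activations (tanh) and class probabilities (softmax)."""
    w1, b1, w2, b2 = net
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    z = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w2, b2)]
    top = max(z)
    ez = [math.exp(v - top) for v in z]
    total = sum(ez)
    return h, [v / total for v in ez]


def train(data, hidden=4, epochs=60, lr=0.05):
    """Stochastic gradient descent with back-propagation on cross-entropy loss."""
    n_in, n_out = len(data[0][0]), len(SLOPES)
    w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [[random.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_out)]
    b1, b2 = [0.0] * hidden, [0.0] * n_out
    net = (w1, b1, w2, b2)
    data = list(data)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        random.shuffle(data)
        for x, label in data:
            h, p = forward(x, net)
            # Output-layer error, then error propagated back to the hidden layer.
            dz = [p[k] - (1.0 if k == label else 0.0) for k in range(n_out)]
            dh = [sum(dz[k] * w2[k][j] for k in range(n_out)) * (1.0 - h[j] ** 2)
                  for j in range(hidden)]
            for k in range(n_out):
                b2[k] -= lr * dz[k]
                for j in range(hidden):
                    w2[k][j] -= lr * dz[k] * h[j]
            for j in range(hidden):
                b1[j] -= lr * dh[j]
                for i in range(n_in):
                    w1[j][i] -= lr * dh[j] * x[i]
    return net


def predict(x, net):
    probs = forward(x, net)[1]
    return probs.index(max(probs))


def mean_loss(data, net):
    return -sum(math.log(forward(x, net)[1][label]) for x, label in data) / len(data)


train_set = [simulate(c) for c in range(3) for _ in range(30)]
test_set = [simulate(c) for c in range(3) for _ in range(10)]
net = train(train_set)
test_loss = mean_loss(test_set, net)
test_accuracy = sum(predict(x, net) == label for x, label in test_set) / len(test_set)
print(f"test loss = {test_loss:.3f}, test accuracy = {test_accuracy:.2f}")
```

Once a network of this kind reaches an acceptable error on the held-out testing set, the observed genotype means can be passed through `predict` in the same way as the simulated ones.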
RESULTS AND DISCUSSION
The individual ANOVAs revealed a significant block effect in all of the environments (Table 3), demonstrating that a randomized block design should be used in this type of experiment in order to control this source of heterogeneity. There were significant differences between the genotypes in all of the trials. The coefficients of variation obtained by the individual ANOVAs ranged between 22.32 and 34.08%, which were similar to those reported in other studies on cowpea (Rocha et al., 2007; Almeida et al., 2012; Santos et al., 2014a,b; Torres et al., 2015a,b).

Table 3. Summary of individual analyses of variance for grain yield (kg/ha) of 20 upright cowpea (Vigna unguiculata) genotypes in six environments (E) in the State of Mato Grosso do Sul, Brazil.

| SV | d.f. | E1+ | E2 | E3 | E4 | E5 | E6 |
|---|---|---|---|---|---|---|---|
| Block | 3 | 584,978.33* | 160,801.38* | 171,117.54* | 7,255.28* | 133,215.19* | 401,399.92* |
| Genotype | 19 | 181,162.89* | 141,462.97* | 603,747.18* | 44,836.59* | 39,498.11* | 70,157.38* |
| Error | 57 | 66,525.70 | 49,454.98 | 45,592.55 | 5,559.47 | 5,127.79 | 17,996.46 |
| Mean | - | 1,155.25 | 910.62 | 924.79 | 218.74 | 210.53 | 554.89 |
| CV (%) | - | 22.32 | 24.42 | 23.08 | 34.08 | 34.01 | 24.17 |

Entries in the Block, Genotype, and Error rows are mean squares. *Significant at the 5% probability level according to an F-test; SV, source of variation; d.f., degrees of freedom; CV, coefficient of variation; +environments described in Table 1.
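The coefficients of variation in Table 3 follow from the error mean square and the environment mean, using the standard definition CV = 100 × √(error mean square) / mean. The short check below recomputes them from the tabulated values; the variable names are illustrative.

```python
import math

# Error mean squares and environment means (kg/ha) transcribed from Table 3.
error_ms = {"E1": 66525.70, "E2": 49454.98, "E3": 45592.55,
            "E4": 5559.47, "E5": 5127.79, "E6": 17996.46}
env_mean = {"E1": 1155.25, "E2": 910.62, "E3": 924.79,
            "E4": 218.74, "E5": 210.53, "E6": 554.89}
# CV values as reported in Table 3, used only for comparison.
reported_cv = {"E1": 22.32, "E2": 24.42, "E3": 23.08,
               "E4": 34.08, "E5": 34.01, "E6": 24.17}

# CV (%) = 100 * residual standard deviation / mean.
cv = {env: 100.0 * math.sqrt(error_ms[env]) / env_mean[env] for env in error_ms}

for env in error_ms:
    print(f"{env}: computed CV = {cv[env]:.2f}%, reported CV = {reported_cv[env]:.2f}%")
```

The recomputed values agree with the reported ones to within rounding, and the two environments with the lowest mean yields (E4 and E5) have the highest coefficients of variation.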
A summary of the joint ANOVA results is presented in Table 4. The genotype effect was not significant (P > 0.05), suggesting an absence of genetic variability among the genotypes. However, Cruz et al. (2012) reported that when the genotype effect is significant in individual ANOVAs but not in a joint ANOVA, the genetic variability present is masked by the magnitude of the GE interaction effect.

Table 4. Summary of a joint analysis of variance for grain yield (kg/ha) of 20 upright cowpea (Vigna unguiculata) genotypes in six environments (E) in the State of Mato Grosso do Sul, Brazil.

| Source of variation | Degrees of freedom | Mean square |
|---|---|---|
| Blocks/Environment | 18 | 4,376,303.00 |
| Genotype (G) | 19 | 6,232,784.45ns |
| Environment (E) | 5 | 62,874,783.73* |
| GE+ | 66 | 14,303,653.23* |
| Error+ | 221 | 10,844,647.31 |
| Mean | - | 662.47 |
| Coefficient of variation (%) | - | 33.43 |

*Significant at the 1% probability level according to an F-test; ns, not significant; +values adjusted according to the Cochran (1954) method.
Table 5 shows the mean grain yield and phenotypic adaptability and stability of the genotypes using the Eberhart and Russell (1966) method and ANNs. Genotypes MNC99-537F-4 and EVX91-2E-2 had higher grain yields than the overall average for the environments, were classified by both methods as adapted to favorable environments, and were highly stable according to both methods of analysis. Therefore, these genotypes are the most suitable for favorable environments and can be used by farmers who use high-tech equipment and procedures, because they can respond to environmental improvements in terms of fertilization and irrigation, among other practices. Low-tech farmers should grow the IT93K-93-10 genotype, which, despite not having a higher grain yield than the overall average, was classified as adapted to unfavorable environments and was highly predictable according to both methods of analysis. Our results suggest that this genotype should maintain its production level under different environmental conditions.

Table 5. Mean grain yield and classification of 20 upright cowpea (Vigna unguiculata) genotypes based on phenotypic adaptability and stability by the Eberhart and Russell (1966) method (E&R) and artificial neural networks (ANN) in six environments in Mato Grosso do Sul, Brazil.

| Genotype | Mean (kg/ha) | E&R adaptability | E&R stability | ANN adaptability | ANN stability |
|---|---|---|---|---|---|
| MNC99-537F-1 | 725.58 | Overall | Low | Overall | High |
| MNC99-537F-4 | 891.92 | Favorable | High | Favorable | High |
| MNC99-541-F5 | 716.75 | Overall | High | Overall | High |
| MNC99-541-F8 | 651.01 | Favorable | High | Overall | High |
| IT93K-93-10 | 514.18 | Unfavorable | High | Unfavorable | High |
| Pretinho | 433.20 | Overall | High | Overall | High |
| Fradinho-2 | 638.64 | Overall | High | Overall | High |
| MNC99-519D-1-1-5 | 671.86 | Overall | Low | Overall | High |
| MNC00-544D-10-1-2-2 | 602.69 | Overall | High | Overall | High |
| MNC00-544D-14-1-2-2 | 722.08 | Overall | High | Overall | High |
| MNC00-553D-8-1-2-2 | 641.91 | Overall | Low | Overall | High |
| MNC00-553D-8-1-2-3 | 650.44 | Overall | High | Overall | High |
| MNC00-561G-6 | 690.61 | Favorable | High | Overall | High |
| EVX63-10E | 682.57 | Overall | High | Overall | High |
| MNC99542F-5 | 882.23 | Overall | High | Overall | High |
| EVX91-2E-2 | 722.23 | Favorable | High | Favorable | High |
| MNC99-557F-2 | 494.64 | Overall | Low | Overall | High |
| BRS Guariba | 667.20 | Overall | High | Overall | High |
| Patativa | 753.34 | Overall | High | Overall | High |
| Vita-7 | 496.39 | Unfavorable | Low | Unfavorable | High |

Agreement between methods: adaptability, 90%; stability, 75%.
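The agreement percentages reported with Table 5 can be reproduced by counting the genotypes to which both methods assign the same class. The sketch below transcribes the classifications from Table 5; the tuple layout and variable names are illustrative.

```python
# (genotype, E&R adaptability, E&R stability, ANN adaptability, ANN stability), from Table 5.
rows = [
    ("MNC99-537F-1", "Overall", "Low", "Overall", "High"),
    ("MNC99-537F-4", "Favorable", "High", "Favorable", "High"),
    ("MNC99-541-F5", "Overall", "High", "Overall", "High"),
    ("MNC99-541-F8", "Favorable", "High", "Overall", "High"),
    ("IT93K-93-10", "Unfavorable", "High", "Unfavorable", "High"),
    ("Pretinho", "Overall", "High", "Overall", "High"),
    ("Fradinho-2", "Overall", "High", "Overall", "High"),
    ("MNC99-519D-1-1-5", "Overall", "Low", "Overall", "High"),
    ("MNC00-544D-10-1-2-2", "Overall", "High", "Overall", "High"),
    ("MNC00-544D-14-1-2-2", "Overall", "High", "Overall", "High"),
    ("MNC00-553D-8-1-2-2", "Overall", "Low", "Overall", "High"),
    ("MNC00-553D-8-1-2-3", "Overall", "High", "Overall", "High"),
    ("MNC00-561G-6", "Favorable", "High", "Overall", "High"),
    ("EVX63-10E", "Overall", "High", "Overall", "High"),
    ("MNC99542F-5", "Overall", "High", "Overall", "High"),
    ("EVX91-2E-2", "Favorable", "High", "Favorable", "High"),
    ("MNC99-557F-2", "Overall", "Low", "Overall", "High"),
    ("BRS Guariba", "Overall", "High", "Overall", "High"),
    ("Patativa", "Overall", "High", "Overall", "High"),
    ("Vita-7", "Unfavorable", "Low", "Unfavorable", "High"),
]

# Percentage of genotypes given the same class by both methods.
adaptability_agreement = 100 * sum(r[1] == r[3] for r in rows) / len(rows)
stability_agreement = 100 * sum(r[2] == r[4] for r in rows) / len(rows)

# Genotypes on which the two methods disagree.
adaptability_mismatch = [r[0] for r in rows if r[1] != r[3]]
stability_mismatch = [r[0] for r in rows if r[2] != r[4]]

print(f"Adaptability agreement: {adaptability_agreement:.0f}%")
print(f"Stability agreement: {stability_agreement:.0f}%")
print("Adaptability mismatches:", adaptability_mismatch)
print("Stability mismatches:", stability_mismatch)
```

All five stability disagreements are genotypes rated as low stability by the Eberhart and Russell (1966) method and high stability by the ANNs, and both adaptability disagreements are genotypes rated as favorable by the Eberhart and Russell (1966) method and overall by the ANNs.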
There was 90% agreement between the Eberhart and Russell (1966) method and ANNs in terms of the phenotypic adaptability of the genotypes (Table 5), and 75% agreement in terms of the phenotypic stability. The stability value was lower than the adaptability value, probably because ANN stability is based on the Finlay and Wilkinson (1963) method, which differs from the Eberhart and Russell (1966) method by considering stability, invariance, and non-predictability. Strong agreement between the traditional Eberhart and Russell (1966) method and ANNs has also been reported in studies that evaluated the GE interaction in genotypes of alfalfa (Nascimento et al., 2013), semi-prostrate cowpea (Teodoro et al., 2015a), and common bean (Correa et al., 2016). This new approach is an effective method of quantifying the adaptability and stability of different genotypes in upright cowpea breeding programs. The main advantage of ANNs over the Eberhart and Russell (1966) method is that, because of their non-linear structure (Haykin, 2009), they can capture the most complex features of a dataset without requiring detailed information about the process to be modeled, because they are self-learning (Nascimento et al., 2013).