Assessment of phenotypic diversity of rice (Oryza sativa L.) genotypes by multivariate analysis

Determining the genetic variation among rice genotypes is an efficient route to superior productivity. This research aimed to estimate the potential variation between rice genotypes, identify the contribution of each trait to the total variation, and classify superior genotypes. The experiment was performed at the Rice Research and Training Center, Sakha, Kafr El-Sheikh, Egypt. Twenty-two rice genotypes were analyzed using seven agronomic traits. Multivariate approaches, including principal component analysis and cluster analysis, were used. The results showed that PC1 and PC2 together represented 66.1% of the variation between the studied genotypes; PC1 alone accounted for 48%, driven mainly by variation in grain yield per plant, followed by its component traits, i.e., number of panicles per plant, number of filled grains per panicle, and 1000-grain weight. The three Egyptian rice genotypes Giza 181, Giza 178, and Giza 177 were the best genotypes for grain yield. Cluster analysis revealed that most genotypes that originated from one source (except for the Indian variety IET1444) or belonged to one classification clustered together. Multivariate analytical approaches are well-suited instruments for providing information on variation in agronomic characters. Consequently, the results of the current study should be taken into account when developing new rice varieties.


INTRODUCTION
Rice (Oryza sativa L.) is a crop of the Gramineae family. It is the daily meal of over two-thirds of the world's population and provides approximately 20% of the calories consumed by humans (Wogu et al., 2010). World rice production amounts to 782 million tons from a total area of approximately 167 million hectares (FAO, 2018). Rice is cultivated commercially in several countries around the world, and as a commercial crop it is particularly important in Asia (90.7%), followed by America (5.2%), Africa (3.4%), Europe (0.6%) and Oceania (0.1%) (FAO, 2018). China, India, Indonesia, Bangladesh, Vietnam, Thailand, Myanmar, the Philippines, Brazil and Japan were the top ten rice producers worldwide in 2018 (FAO, 2018). In Egypt, rice production is 4280 thousand metric tons from an area of 0.445 million ha, with a productivity of 9.6 tons/ha. Many challenges to achieving food security worldwide remain, so there is a need to develop improved high-yielding varieties. Adequate knowledge of genetic variation among genotypes is a preliminary step in breeding programs for the selection and production of new varieties (Kumbhar et al., 2015; Ahmed et al., 2016). Diversity among rice genotypes is essential for the transmission of genetic information (Nambara & Nonogaki, 2012; Martínez-Andújar et al., 2012). A recent method, the biplot technique, gives breeders a complete visual representation of all aspects of the variables by producing a biplot that simultaneously represents both mean performance and stability (Yan & Holland, 2010; Gedif & Yigzaw, 2014; Yan & Frégeau-Reid, 2018; Bányai et al., 2020). Multivariate analysis provides an adequate measure of the degree of difference between genotypes. Principal component analysis (PCA) and cluster analysis are both multivariate methods used to assess variation (Maji & Shaibu, 2012; Tiwari et al., 2020). PCA is used to study diversity and to determine the contribution of specific characteristics to it.
Cluster analysis is used to classify genotypes into groups according to genetic or agronomic traits (Shabir et al., 2013). Nachimuthu et al. (2014) used PCA to assess variation between rice genotypes derived from various countries. They studied many characteristics, such as number of filled grains per panicle, grain yield per plant, panicle length and plant height. They found a high coefficient of variation for the number of filled grains per panicle and grain yield per plant and, using PCA, identified the traits that had an impact on variation. In another study, the variability of rice genotypes was assessed by PCA, and the results showed that the number of grains per panicle affected the diversity of the genotypes studied. Sanni et al. (2012) used cluster analysis to evaluate the variability of 434 rice genotypes for 10 agronomic traits under upland conditions. Their results showed that seven groups were derived from the ten agronomic and botanical traits, and they stated that certain clusters are valuable for determining parental genotypes in breeding programs. The objectives of this research were to (i) estimate the possible diversity between the rice genotypes studied by multivariate analysis, (ii) define the contribution of each studied trait to the total variation under irrigated conditions and (iii) determine the best genotypes for the future breeding of new high-yielding cultivars under irrigated conditions.

Plant Materials
At the Rice Research and Training Center (RRTC) experimental farm, Sakha Agricultural Research Station, Kafr El-Sheikh Governorate, Egypt, twenty-two rice (Oryza sativa L.) genotypes originating from different countries were evaluated for various quantitative traits (Table 1).

Field experiments
The research was conducted at the Rice Research and Training Center (RRTC) experimental farm, Sakha, Kafr El-Sheikh, Egypt, in two successive rice-growing seasons, 2017 and 2018. The seeds of the twenty-two rice genotypes were grown in the nursery and transplanted in four rows 30 days after sowing. The experiment was arranged in a randomized complete block design (RCBD) with three replicates. Each replicate consisted of three rows, 5 m long and 20 cm apart, each comprising 25 hills with a single plant per hill. The recommended cultural practices for ordinary rice fields in the region were applied. A dose of N:P:K (10:18:18) was applied during land preparation at a rate of 80 kg/ha. Urea was applied at a rate of 80 kg/ha in two doses, the first at tillering and the second at booting. Hand-weeding was done regularly to minimize weed infestation.

Data Collection
Seven quantitative traits were determined using the rice (O. sativa) descriptor methods (IRRI, 1980). The variables included in the descriptive and multivariate analyses were phenological (number of days to heading for each plot) and agro-quantitative (plant height (cm), panicle length (cm), number of panicles per plant, number of filled grains per panicle, 1000-grain weight (g) and grain yield per plant (g)). The data were calculated from ten randomly selected guarded plants in each plot and recorded on a single-plant basis for each genotype.

Statistical Analysis
Data were averaged over the two years, and the resulting means were used in the analyses. The twenty-two genotypes were divided into groups using the unweighted pair group method with arithmetic mean (UPGMA, i.e., average linkage), and the studied traits were analyzed using principal component analysis (PCA). The datasets were first tested for normality (Anderson & Darling, 1952) and then subjected to analysis of variance using an appropriate model. PCA was used to determine the degree of trait variance between genotypes following Everitt & Dunn (1992), while biplot analysis was used to select the best-performing genotype. PCA, as one of the factor analysis methods, reduces a large number of correlated variables to a smaller set of variables called factors or components (Cattell, 1965). The group array, the amount of variance expressed by the combined variables, was estimated from the highest correlation coefficient in each array, as suggested by Seiller & Stafford (1985). The multivariate analysis was carried out using Genstat version 12.0 software. The UPGMA analysis was performed using the Numerical Taxonomy and Multivariate Analysis System (NTSYS version 2.1) software package (Rohlf, 1998).

(Table 2). Diverse frequency distribution patterns for the characters studied emerged when the scree histogram was applied (Figure 1).
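As an illustration of the PCA step described in this section, the following minimal Python sketch standardizes a small trait table, builds its correlation matrix and extracts the first principal component by power iteration. The trait values are hypothetical placeholders, not data from this study, and the function names are illustrative only. The study itself used Genstat; power iteration is used here simply to keep the sketch free of external libraries.

```python
import math

def standardize(rows):
    """Scale each trait (column) to mean 0 and sample standard deviation 1."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(c) / n for c in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(rows):
    """Pearson correlation matrix of the traits (columns)."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient
    return eigenvalue, v

# Hypothetical genotype means (NOT from the study):
# grain yield per plant (g), panicles per plant, days to heading.
traits = [
    [45.0, 22.0, 95.0],
    [40.0, 20.0, 100.0],
    [35.0, 18.0, 110.0],
    [30.0, 17.0, 120.0],
    [28.0, 15.0, 125.0],
    [25.0, 14.0, 130.0],
]

corr = correlation_matrix(traits)
eigenvalue, loadings = first_component(corr)
share = 100 * eigenvalue / len(corr)  # percent of total variance on PC1
print(f"PC1 eigenvalue = {eigenvalue:.3f} ({share:.1f}% of variance)")
print("PC1 loadings:", [round(x, 3) for x in loadings])
```

In this toy table the two yield traits load on PC1 with one sign and days to heading with the opposite sign, which is the kind of pattern the PCA results are read for.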

Principal Component Analysis
PCA identifies the contribution of each character to the variation among genotypes, which makes it a valuable selection tool for rice improvement. The PCA results demonstrated the genetic diversity of the rice collection. Eigenvalues measure the importance and contribution of each component to the total variance, while each eigenvector coefficient indicates the degree of contribution of the original variable associated with that principal component. Seven principal components were extracted for the seven characteristics, but only the first two (PC1 and PC2) were retained to describe the variation in the studied characteristics, as their eigenvalues were greater than 1.0 (3.3575 and 1.2647, respectively) (Figure 2); the other components were rejected because their eigenvalues were less than one. Under irrigated conditions, these two principal components together represented 66.1% of the total variance, with PC1 accounting for the largest share (48%), followed by PC2 (18.1%) (Figure 2). The remaining components were disregarded (Table 3).
The first principal component (PC1) was related to five characteristics: grain yield per plant (GYP, r = 0.504), number of panicles per plant (NPP, r = 0.478), number of filled grains per panicle (NFGP, r = 0.415), 1000-grain weight (TGW, r = 0.401) and days to heading (DH, r = -0.321) (P<0.01) (Figure 3). Meanwhile, the second principal component (PC2) correlated with only two traits: plant height (PH, r = -0.693) and panicle length (PL, r = 0.499) (Table 4) (Figure 4).
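The retention rule and the variance shares reported in this section can be checked with a few lines of arithmetic. The sketch below assumes that the PCA was run on the correlation matrix of the seven traits, so that the total variance equals the number of traits; the text does not state this explicitly, but the reported percentages are consistent with it. Only the two eigenvalues given in the text are used.

```python
# Eigenvalues reported in the text for the first two components.
eigenvalues = {"PC1": 3.3575, "PC2": 1.2647}

# Assumption: PCA on the correlation matrix of seven standardized traits,
# so the eigenvalues of all seven components sum to 7.
n_traits = 7

# Retain components whose eigenvalue exceeds 1.0 (eigenvalue-greater-than-one rule).
retained = {pc: ev for pc, ev in eigenvalues.items() if ev > 1.0}

# Share of total variance carried by each retained component, in percent.
share = {pc: 100 * ev / n_traits for pc, ev in retained.items()}
cumulative = sum(share.values())

for pc, pct in share.items():
    print(f"{pc}: {pct:.1f}% of total variance")
print(f"Cumulative: {cumulative:.1f}%")
```

The shares come out at about 48.0% and 18.1%. The unrounded cumulative share is about 66.0%, while adding the two rounded percentages gives the 66.1% quoted in the text.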

Identifying Superior Rice Genotypes Under Irrigated Conditions
Consequently, the traits in PC1 had the greatest influence on the total variation of the genotypes studied, followed by the traits in PC2 (Table 5). This revealed varying yield output among the rice genotypes. To visualize the relationships between genotypes and traits, vectors were drawn from the origin of the plot, so biplots of this kind could be useful in breeding programs for identifying superior rice genotypes. Figure 5 presents the joint distribution of genotypes and traits on the PCA biplot. It shows that the number of panicles per plant is the trait most closely correlated with grain yield per plant, since the angle between them is small, followed by 1000-grain weight and then number of filled grains per panicle. All traits were positively correlated with PC1, as their scores were greater than zero, except for days to heading and plant height. The distribution of genotypes over the four quadrants of the biplot revealed that Giza 178 and Giza 181 were the most representative of the variation, as their PC1 scores were positive and higher than those of the other varieties, followed by Giza 177. These three varieties were associated with grain yield per plant and the number of panicles per plant (Figure 5).
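The reading of angles between trait vectors on the biplot can be made concrete with a small calculation. In the sketch below, the PC1 coordinates are the PC1 values reported for GYP, NPP and DH in the previous section, but the PC2 coordinates are hypothetical placeholders, because the text does not report them. The angles it prints are therefore illustrative and are not those of Figure 5.

```python
import math

def angle_degrees(u, v):
    """Angle between two 2-D biplot vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    cosine = dot / (math.hypot(*u) * math.hypot(*v))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.degrees(math.acos(cosine))

# (PC1, PC2) coordinates of trait vectors.
# PC1 values are from the text; PC2 values are HYPOTHETICAL placeholders.
vectors = {
    "GYP": (0.504, 0.10),
    "NPP": (0.478, 0.15),
    "DH": (-0.321, 0.30),
}

gyp_npp = angle_degrees(vectors["GYP"], vectors["NPP"])
gyp_dh = angle_degrees(vectors["GYP"], vectors["DH"])
print(f"GYP vs NPP: {gyp_npp:.1f} degrees")
print(f"GYP vs DH: {gyp_dh:.1f} degrees")
```

An angle well below 90 degrees indicates a positive association between two traits in the plane of the biplot, an angle near 90 degrees indicates little association, and an angle above 90 degrees indicates a negative association.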

Cluster Analysis
Cluster analysis results according to dissimilarity (Table 6) showed five main cluster groups, four of which contained subclusters (Figure 6).
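For readers unfamiliar with UPGMA, the following minimal Python sketch shows how average-linkage clustering merges genotypes step by step. It assumes Euclidean distance on standardized trait values, which may differ from the dissimilarity measure behind Table 6, and the four genotypes and their coordinates are hypothetical. The study itself used the NTSYS package.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two trait vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_distance(a, b, points):
    """UPGMA distance: mean of all pairwise distances between two clusters."""
    pairs = [(i, j) for i in a for j in b]
    return sum(euclidean(points[i], points[j]) for i, j in pairs) / len(pairs)

def upgma(points):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    clusters = [(name,) for name in points]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(pair[0], pair[1], points))
        merges.append((a, b, cluster_distance(a, b, points)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical standardized trait values for four genotypes (NOT study data).
genotypes = {
    "G1": (0.0, 0.0),
    "G2": (0.0, 1.0),
    "G3": (6.0, 0.0),
    "G4": (6.0, 2.0),
}

merges = upgma(genotypes)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.3f}")
```

Cutting the resulting merge history at a chosen distance yields the cluster groups; genotypes that merge early are the most similar.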

DISCUSSION
The variation between the genotypes studied was mainly due to the high variance in grain yield per plant. The reported values for grain yield per plant and plant height were 23.19 and 18.82, respectively. In this study we followed the criterion of Clifford & Stephenson (1975), as refined by Guei et al. (2005), who noted that the first three principal components (PC1, PC2, PC3) are usually the most effective in representing patterns of variability between genotypes and that the traits correlated with them are the most useful for distinguishing genotypes. According to this criterion, the first two principal components were retained, which means that the characteristics associated with them will play a major role in providing wide variation for rice improvement. This agrees with Guei et al. (2005), who confirmed that the traits in PC1 and PC2 reflect variation in rice genotypes. Days to heading loaded on the same principal component as yield and its component traits, while plant height loaded on another, meaning that days to heading had a greater effect on the variation than plant height. This is in agreement with Rashid et al. (2008), who found that days to heading and grain yield are located in PC1, and with Ogunbayo et al. (2005), who reported that plant height was found in PC2. Grain yield per plant and the yield traits (1000-grain weight, number of filled grains per panicle and number of panicles per plant) affected PC1 positively, in contrast to days to heading, which affected the same component negatively. This indicates that grain yield per plant and the yield traits increased with earlier heading. Consequently, improving a given yield trait should also improve the other yield traits grouped in the same component, as long as they have the same positive effect. In PC2, plant height had a negative effect while panicle length had a positive effect, suggesting that, along this component, shorter plants tended to have longer panicles.
These findings are consistent with other studies (Caldo ...).

Both the principal component and the cluster methods are effective tools for providing information on variability in agronomic traits. Using multiple trait profiles evaluated under the target environment may help improve selection in variable environments. It is therefore recommended that the findings of this study be considered when developing new rice varieties.