Dissection of genetic diversity present in eggplant populations using simple sequence repeat markers

Eggplant (Solanum melongena L.) is the third most important solanaceous vegetable and one of the most diversified species, grown across a wide geographical area of the world. A study was conducted to assess the genetic diversity among fifty-four selected eggplant genotypes (categorized into five sub-populations) using twenty-three SSR markers. The Analysis of Molecular Variance among the five sub-populations revealed 90.66% variation within populations and 9.34% variation among populations. The SSR marker analysis provided locus-wise information such as mean Observed Heterozygosity (0.216), mean Expected Heterozygosity (0.496), Shannon's Information Index (0.879), mean number of different alleles (3.209), mean number of effective alleles (2.535) and Fixation Index (0.649). Further, phylogenetic analysis clearly categorized genetically distinct individuals into five clusters, of which cluster 1 (C1) was the most diversified, and the wild relatives were grouped into cluster 5 (C5). The results can be used in eggplant breeding and germplasm conservation.


INTRODUCTION
Eggplant (Solanum melongena L.) is one of the important solanaceous vegetables grown in tropical and subtropical regions of the world, including India. It is a warm-season herbaceous perennial that is grown as an annual crop for commercial purposes and is popularly called brinjal in India. Eggplant is rich in carbohydrates, proteins, fat, dietary fibre, vitamins and minerals (Somawathi et al., 2014).
The assessment of genetic features is a vital step for breeders seeking to produce a new genetic line or to improve an existing one. As India is a major centre of diversity for eggplant, the occurrence of variation for various traits is high. Hence, there is a need to evaluate and characterize the eggplant base material, which is distributed across multiple areas. Earlier, genetic evaluation was mainly based on morphological (Faizan et al., 2021a), physiological (Faizan et al., 2021b) and biochemical traits (isozymes and chromatography) (Weijun, 1992; Isshiki & Fujieda, 1994). Presently, molecular markers have massive potential to reveal genetic diversity by identifying polymorphisms. Molecular diversity analysis is a powerful tool for genome characterization, genotype identification and studying the evolutionary pattern of crop plants, and its results can be utilized in further breeding programmes. Several studies on molecular genetic diversity in eggplant have been reported using different types of molecular markers, viz., RAPD (Hu & Quiros, 1991; Tiwari et al., 2009; Ali et al., 2011), AFLP (Liao et al., 2009), SCARs (Liao et al., 2009), ISSR (Tiwari et al., 2009) and SSR (Jellan et al., 2016; Mikaela et al., 2017).
In recent years, SSR molecular markers have become more popular because of their co-dominant inheritance, high abundance, enormous extent of allelic diversity, ease of assessing SSR size variation through PCR and high reproducibility (Stagel et al., 2008). The development of SSR markers derived from an SSR-enriched genomic library of eggplant was reported by Nunome et al. (2003, 2009).

The goal of the current investigation was to characterize eggplant genotypes and to study the relationships among genotypes belonging to different sub-groups using SSR molecular markers. Evaluation of genetic diversity is imperative for breeding purposes, and the use of molecular markers helps to speed up the evaluation process.

Plant Material
Fifty-four eggplant genotypes were obtained from different sources and categorised into sub-groups based on their type, viz., (a) local cultivars (9 genotypes), (b) commercial hybrids (2 genotypes), (c) released varieties (15 genotypes), (d) advanced breeding lines (24 genotypes) and (e) wild relatives/related species of eggplant (4 species) (Table 1).

DNA extraction
DNA was extracted from fresh, fully opened leaf tissue of each genotype using the CTAB method (Doyle & Doyle, 1987). DNA quality was assessed on a 0.8% agarose gel stained with ethidium bromide, and DNA quantity was measured using a NanoDrop® ND-1000 spectrophotometer.

SSR Analysis
Twenty-three SSR markers were selected to evaluate the genetic diversity present among the eggplant germplasm. The selection of eggplant SSRs was based on their high polymorphism information content and the quality scores reported by Nunome et al. (2009), Jellan et al. (2016) and Mikaela et al. (2017) (Table 2).
The polymerase chain reaction (PCR) mixture contained >80 ng DNA, 5 pmol of each primer, Ampliqon® PCR master mix and nuclease-free water in a total volume of 10 µl.
PCR amplification was carried out using the Eppendorf® PCR System. The amplification conditions involved an initial step of 3 min at 94℃, followed by 35 cycles of 1 min at 94℃, 1 min at 55-69℃ and 1 min at 72℃, a final extension at 72℃ for 10 min and a final hold at 4℃. The PCR products were separated by electrophoresis on a 2.5% agarose gel, and the PCR profile image was captured using a gel documentation system (Syngene Pvt. Ltd., USA).
The PCR profile was scored using the software GeneTool (Syngene Pvt. Ltd., USA). A 100 bp ladder was run alongside the DNA samples on each gel as a standard molecular weight size marker. The bands were scored based on the molecular weight (size) of the DNA fragments. The analysis was repeated at least twice to confirm the reproducibility of the results, and the variation observed for each band was around 5 to 10 bp.

Molecular Analysis
The molecular analysis was done using GenAlEx V.6.0 software for PCA and for calculation of the number of different alleles (Na), number of effective alleles (Ne), Shannon's Information Index (I), Observed Heterozygosity (Ho), Expected Heterozygosity (He), number of migrants (Nm), Polymorphic Information Content (PIC) value, Fixation Index (F) and F-statistics. The analysis of molecular variance was performed using Arlequin V.3.0 (Excoffier & Lischer, 2010). A Neighbour-Joining (NJ) tree (Figure 3) was constructed across the samples to analyze the genetic relationships among the individuals and populations using MEGA V.5.0 software.

Analysis of Molecular Variance (AMOVA)
Analysis of Molecular Variance (AMOVA), with a fixation index value of 0.649, indicated significant differences among the individuals used in the present study. Variation among the sub-populations accounted for 9.34 per cent of the total variation, and variation within sub-populations for 90.66 per cent (Table 3).

Statistics of SSR Genetic Marker
Twenty-three SSR markers were used for genotyping, of which twenty-two were polymorphic. The PIC values had a mean of 0.858 and ranged from 0.0 for marker SM104 to 0.968 for EM147. The number of different alleles (Na) among the SSR loci ranged between 1.00 and 7.44 (SM114), with a mean of 3.209. The effective number of alleles (Ne) was highest for SM114 (5.643), which also had the highest Shannon's Information Index (I) of 1.825. The Observed Heterozygosity (Ho) ranged from 0.00 to 0.992 (SM128), whereas the Expected Heterozygosity (He) ranged from 0.00 to 0.818 (SM114). The fixation index for the twenty-three SSR loci ranged from -0.362 (SM128) to 1.000 (Table 4).
Among the sub-populations, POP4 recorded the highest number of different alleles (4.17), followed by POP3 (3.91) and POP1 (3.61), while the lowest number was recorded in POP2 (1.826). The Ne value was highest for POP4 (3.02) and lowest for POP2 (1.78). The Expected Heterozygosity (He) values for the sub-populations ranged between 0.332 (POP2) and 0.577 (POP3 and POP4), whereas the Observed Heterozygosity (Ho) values ranged between 0.185 (POP5) and 0.238 (POP3). The percentage of polymorphic loci was highest for POP1, POP3 and POP4 (95.65%) and lowest for POP2 (60.87%), with a mean percentage polymorphism of 85.22 (Table 5).
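As an illustration of how the locus-level parameters reported above are commonly defined, the following is a minimal sketch using standard textbook formulas for a single co-dominant locus. It is not the routine of the software used in this study, and the function name and the genotype values (fragment sizes in bp) are hypothetical.

```python
import math
from collections import Counter


def locus_stats(genotypes):
    """Summary statistics for one co-dominant locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    Returns a dict with Na, Ne, I, Ho, He, F and PIC (textbook definitions).
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]  # allele frequencies p_i

    sum_p2 = sum(p * p for p in freqs)
    na = len(freqs)                                  # number of different alleles
    ne = 1.0 / sum_p2                                # effective number of alleles
    shannon = -sum(p * math.log(p) for p in freqs)   # Shannon's index (natural log)
    ho = sum(1 for a, b in genotypes if a != b) / n  # observed heterozygosity
    he = 1.0 - sum_p2                                # expected heterozygosity
    f = 1.0 - ho / he if he > 0 else float("nan")    # fixation index
    # PIC after Botstein et al. (1980): 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2
    pic = he - sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(na)
        for j in range(i + 1, na)
    )
    return {"Na": na, "Ne": ne, "I": shannon, "Ho": ho, "He": he, "F": f, "PIC": pic}


if __name__ == "__main__":
    # Hypothetical genotypes scored as fragment sizes (bp) at one SSR locus.
    example = [(180, 180), (180, 190), (190, 190), (180, 190)]
    for name, value in locus_stats(example).items():
        print(f"{name}: {value:.3f}")
```

Averaging such per-locus values over loci, or computing them within each sub-population, gives summaries of the kind shown in Tables 4 and 5. A monomorphic locus gives He = 0, so the fixation index is undefined for it in this sketch.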

F Co-efficient and Genetic Distance
The major components of F-statistics are FIS, FIT and FST, where 'I' represents individuals, 'S' represents the sub-population and 'T' represents the total population. The mean FIS value across all loci and populations was 0.684, the mean FIT value was 0.721 and the mean FST value was 0.238. The pair-wise population FST values ranged from a minimum of 0.021 between POP1 and POP3 to a maximum of 0.348 between POP2 and POP5. The pair-wise FST values indicated the presence of significant genetic differentiation among the sub-populations of the eggplant working collection (Table 6). For the individual SSR markers, the FST values ranged between 0.00 (SM104) and 0.473 (SM127), indicating the presence of significant genetic differentiation among the eggplant accessions at the individual SSR loci studied (Table 4). Pair-wise genetic distance across sub-populations ranged from 0.053 (POP1 and POP3) to 1.03 (POP2 and POP5) (Table 7).
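The relationship among the three F-statistics can be sketched from heterozygosities using Wright's standard definitions. This is a minimal illustration only: the function names and input values are hypothetical, and the island-model estimate Nm = (1/FST - 1)/4 is assumed here as the usual way the number of migrants is derived from FST.

```python
def f_statistics(h_i, h_s, h_t):
    """Wright's F-statistics from heterozygosities.

    h_i: mean observed heterozygosity of individuals within sub-populations
    h_s: mean expected heterozygosity within sub-populations
    h_t: expected heterozygosity in the total population
    """
    f_is = (h_s - h_i) / h_s  # deficit of heterozygotes within sub-populations
    f_st = (h_t - h_s) / h_t  # differentiation among sub-populations
    f_it = (h_t - h_i) / h_t  # overall deficit of heterozygotes
    return f_is, f_st, f_it


def number_of_migrants(f_st):
    """Island-model estimate of migrants per generation (assumed formula)."""
    return (1.0 / f_st - 1.0) / 4.0


if __name__ == "__main__":
    # Hypothetical heterozygosities, chosen only for illustration.
    f_is, f_st, f_it = f_statistics(h_i=0.2, h_s=0.5, h_t=0.625)
    print(f"FIS={f_is:.3f} FST={f_st:.3f} FIT={f_it:.3f}")
    # For a single locus the three values satisfy (1 - FIT) = (1 - FIS)(1 - FST).
    print(f"Nm={number_of_migrants(f_st):.3f}")
```

Means taken across loci, as reported in this study, need not satisfy the identity exactly, because the average of per-locus ratios differs from the ratio of averages.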

Principal Coordinate Analysis
The Principal Coordinate Analysis (PCA) was done to understand the relationships between the sampled accessions based on genetic distance. The relative position of the fifty-four genotypes is illustrated in Figure 2. No distinct grouping of eggplant genotypes was observed, except that the genotypes of POP5 (wild/related species of eggplant) were situated in quadrant C. Whereas, in Figure 1 it was observed that the sub-population POP5

Phylogenetic Analysis
The phylogenetic analysis of the 54 eggplant accessions across five sub-populations resulted in the formation of five major clusters (Figure 4). The accessions from the POP4, POP3 and POP1 sub-populations were dispersed across all five major clusters, possibly due to a high rate of gene flow or genetic drift. Cluster 1 included individuals of POP1, POP2, POP3 and POP4. Clusters 2, 3 and 4 each consisted of individuals of POP1, POP3 and POP4, whereas cluster 5 consisted of individuals of POP3, POP4 and POP5. The accessions of POP5, which consisted of wild species of eggplant, formed a separate sub-cluster under major cluster 5.

DISCUSSION
Eggplant is an important solanaceous vegetable with rich genetic diversity for various traits, including drought tolerance (Jellan et al., 2016). This was evident from the results of the genetic diversity analysis of 54 accessions of eggplant in the present study. The mean Nm value (1.026) indicated the overall number of migrants and possible divergence among the microsatellite loci used in the working collection of eggplant. The maximum expected and observed heterozygosity among sub-populations is also one of the parameters indicating that the sub-populations were genetically diverse. These parameters provide the basis for the occurrence of vast genetic divergence in eggplant for drought tolerance. Similarly, the significant extent and organisation of genetic diversity was indicated by the allelic parameters (Na, Nm, Ne, He and Ho) found in eggplant (Tumbilen et al., 2011; Ge et al., 2013; Vilanova et al., 2014).

[Table: Shannon's Information Index (I), Observed Heterozygosity (Ho), Expected Heterozygosity (He), number of migrants (Nm), Polymorphic Information Content (PIC) value, Fixation Index (F) and F-statistics across populations of eggplant]

The percent polymorphism observed across the 23 SSR markers for the five sub-populations represents the amount of diversity present within and among populations. The PIC values obtained in the present study ranged from 0.0 for marker SM104 to 0.968 for EM147, with a mean PIC value of 0.85 (Figure 5). The PIC value has been shown to be influenced by the number of variants per locus as well as by the relative distribution of the alleles (Botstein et al., 1980). The wide range of PIC values indicates that the markers were quite informative, showing high replicability and reliability, covering the entire genome, being evenly distributed on the chromosomes and having high potential for multiplexing and high-throughput genotyping. By comparison, mean PIC values of 0.574 and 0.50 were obtained by Hurtado et al. (2012) and Vilanova et al. (2014) while assessing 52 and 19 accessions, respectively.
The significant mean values of FIS (0.684), FIT (0.721) and FST (0.238) for the five sub-populations of eggplant indicated the level of clustering across the sub-populations and clearly revealed the existence of distinct genetic clustering at each hierarchical level of the population, i.e., individual, sub-population and total population. In quantitative genetics, F-statistics describe the statistically expected level of heterozygosity in a population; more specifically, the expected degree of reduction in heterozygosity when compared to the Hardy-Weinberg expectation.
Pair-wise genetic distance across sub-populations indicated greater genetic differentiation among the populations than among the individuals within a population, which may have occurred due to genetic contamination of the sub-populations. The genetic distance of eggplant populations was described by Behera et al. (2005) and Gramazio et al. (2017), where a genetic distance between sub-populations greater than the pair-wise FST value indicated that different populations of eggplant were more genetically diverse than individuals within a population.
In eggplant genetic diversity analysis, PCA uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables (principal components). The Principal Coordinate Analysis in the present study separated the five sub-populations of eggplant as well as the 54 individuals within them, possibly due to a higher degree of diversity between populations than within a particular population (Figure 2), in line with Peakall and Smouse (2012). Similar PCA results were obtained by Hurtado et al. (2012), in which eggplant accessions of different origins were admixed, while those from China and Sri Lanka were mostly distributed in different areas of the plot.
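The ordination described above can be sketched as a classical Principal Coordinate Analysis of a pairwise distance matrix: the squared distances are double-centred and the leading eigenvectors give the plot coordinates. This is a minimal sketch assuming NumPy is available; the function name and the distance values are hypothetical and are not taken from Table 7.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical PCoA (metric multidimensional scaling) of a distance matrix.

    dist: symmetric (n x n) array of pairwise distances.
    Returns (coords, explained): coordinates on the first n_axes axes and the
    share of positive eigenvalue mass captured by each axis.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ (d ** 2) @ centring    # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(b)         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    kept = np.clip(eigvals[:n_axes], 0.0, None)  # drop tiny negative noise
    coords = eigvecs[:, :n_axes] * np.sqrt(kept)
    explained = kept / eigvals[eigvals > 0].sum()
    return coords, explained


if __name__ == "__main__":
    # Hypothetical pairwise distances among four groups (illustration only).
    d = np.array([
        [0.0, 0.3, 0.2, 0.9],
        [0.3, 0.0, 0.4, 1.0],
        [0.2, 0.4, 0.0, 0.8],
        [0.9, 1.0, 0.8, 0.0],
    ])
    coords, explained = pcoa(d)
    print(np.round(coords, 3))
    print(np.round(explained, 3))
```

Plotting the first two columns of the coordinates gives a scatter of the kind shown in Figure 2. The sign of each axis is arbitrary, so only relative positions are meaningful.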
Phylogenetic analysis clearly distinguishes a genetically distinct individual or a population from the rest (Yang et al., 2015). In Solanum species, analysis of the phylogeny of populations is an important step in understanding the relationships among the populations and their evolution.
Based on the Neighbour-Joining tree, two major clusters were formed for the five eggplant sub-populations on the basis of genetic similarity: Cluster 1 included POP1, POP2, POP3 and POP4, whereas Cluster 2 (C2) consisted of sub-population POP5, showing a clear dissimilarity between the sub-populations of the two clusters. Further, the accessions from the POP4, POP3 and POP1 sub-populations were dispersed across all major clusters (Figure 4), possibly due to a high rate of gene flow or genetic drift. The accessions of POP5, which consisted of wild species of eggplant, formed a separate sub-cluster under major Cluster 5. The admixture of individuals from the five sub-populations of eggplant was possibly due to a strong effect of gene flow or genetic drift. Similar genetically admixed clusters were obtained by Demir et al. (2010) and Jellan et al. (2016), each during the phylogenetic assessment of twenty eggplant genotypes.
The high degree of genetic variation within populations could be due to a high rate of gene flow. Further, the contribution of allelic variation to the difference between populations is smaller than its contribution to the variation within populations. Eggplant bears flowers with different style lengths (heterostyly), which leads to cross-pollination in nature and may, as expected, result in genetic contamination of eggplant accessions.

CONCLUSION
Eggplant, or brinjal, is a widely accepted and commercially grown solanaceous vegetable after tomato, potato and chilli. Due to the high incidence of abiotic as well as biotic stresses, the productivity of the crop is decreasing day by day. To enhance the yield potential of the crop, breeders need to study the genetic variation and diversity present in the gene pool.
There is a need to broaden the genetic base for the development of eggplant hybrids adapted to adverse conditions for Indian seed markets. The analytical results obtained here can be utilised for grouping individuals based on genetic similarities and dissimilarities. Such information aids in the selection of diverse parents for obtaining superior allele combinations in hybrid or varietal development programmes.

Disclosure Statement
No potential conflict of interest was reported by the authors.