Rhododendron longipedicellatum is an endangered species endemic to southeastern Yunnan, China. Assessment of genetic variation is critical for protecting endangered species. Therefore, we used EST-SSR markers to analyze the genetic characteristics of R. longipedicellatum. The results revealed high genetic diversity at the species level (He = 0.559, NA = 9.529) and within populations (He = 0.507, NA = 5.910) and moderate genetic differentiation between populations (FST = 0.083). In addition, more genetic variation existed within populations (91.25%) compared with variation among populations (8.75%). The STRUCTURE analysis showed that 150 individuals from five existing populations could best be divided into two genetic groups. At the population level, the neighbor-joining (NJ) tree and principal coordinates analysis (PCoA) analyses also divided them into two groups. In addition, Bottleneck analyses using the Two-Phase Model (TPM) and Stepwise Mutation Model (SMM) as well as the Garza-Williamson Index revealed widespread signatures of bottleneck events. These results provide vital information for scientifically formulating conservation strategies for the endangered R. longipedicellatum.
Introduction
Genetic diversity is a key component of biodiversity. The study of genetic diversity and population structure of species is the theoretical basis for scientific and rational utilization of germplasm resources and improvement of genetic characteristics, as well as the prerequisite for effective conservation measures (Frankham, 2003; Mable, 2019). With the rapid development and continuous improvement of techniques in molecular biology, molecular markers have been widely used for population genetic diversity analysis, breeding programs, and evolutionary research (Ma et al., 2021; Yadav et al., 2019). Among the molecular marker types, simple sequence repeat (SSR) markers, including genomic SSR (gSSR) and expressed sequence tag SSR (EST-SSR), are highly regarded because of their rich content distribution, high polymorphism, ease of use, and repeatability (Liu et al., 2019; Wei et al., 2011). Compared with gSSR, EST-SSR have a higher level of transferability to related species, are more evolutionarily conserved, have lower development costs, and closer relationships with genes of known functions (Singh et al., 2013; Zhang et al., 2017).
Rhododendron is the largest genus in Ericaceae, with the widest distribution in the Northern Hemisphere, consisting of eight subgenera and more than 1025 species. China has about 581 species from six subgenera, of which 421 species are endemic to China (Cai et al., 2016; Ma et al., 2017). China is the most important modern distribution and evolution center of Rhododendron species, but the loss and destruction of wild resources has become a serious threat in recent years, and many endemic and rare endangered species are extinct or on the verge of extinction (Ma et al., 2014). Although much research has been conducted on the genetic structure and population dynamics of Rhododendron, few studies have been conducted on endangered species whose habitats are seriously threatened. Thus, conservation research on endemic and endangered Rhododendron species is urgently needed.
Rhododendron longipedicellatum is endemic to Southeast Yunnan, China (Cai et al., 2016). Its beautiful tree shape, bright pure yellow flowers, and unusual flowering time give this species a high value for development and utilization in landscaping. Since 2014, our study group has found only five wild populations with less than 1000 plants remaining in limestone habitats in Malipo County, Yunnan Province at an altitude of approximately 1183–1316 m. Due to the proximity to human settlements and high ornamental value, additional habitat loss and removal of wild plants is a relatively serious threat to conserving these remaining populations. Furthermore, within the narrow geographic distribution, the number of populations and individual plants is small, few true seedlings are present within each population, and natural regeneration is weak. At present, the wild, undeveloped habitat available to sustain populations such as ZWL and XL is continuously decreasing ( Supplementary Figure S1). According to the IUCN Red List Categories and Criteria (IUCN, 2001), R. longipedicellatum is already extremely endangered (CR B1ab [i, iii, v]) and requires urgent protection. Thus, this study used 17 previously developed high-polymorphism EST-SSR markers (Li, Liu, Li, Wan, et al., 2018) to study the genetic characteristics of 150 individuals from five residual populations of R. longipedicellatum, attempting to assess the following: (1) genetic diversity and structure; (2) gene flow between populations; (3) correlation between genetic distance and geographical distance; and (4) bottleneck effects. Findings are expected to illustrate the possible genetic consequences of residual small populations in the limestone habitat so as to provide a reference for effective conservation and germplasm resource management of R. longipedicellatum.
Methods
Experimental Materials
Young leaves of 150 healthy mature plants (30 samples from each population) were collected from five residual populations. Locality and voucher information for samples are provided in Supplementary Table S1 and Figure 1. Samples were collected at a distance of at least 2.5 m to avoid collecting clone-mates. The collected leaf tissues were quickly dried with silica gel and stored at −20°C until further use.
DNA Extraction and EST-SSR Genotyping
A modified cetyltrimethylammonium bromide (CTAB) method was used to extract genomic DNA (Doyle & Doyle, 1987). In brief, gel-dried fragments of leaves (0.1 g) were placed in liquid nitrogen, quickly ground to a powder, and then mixed with 1 mL of CTAB buffer (100 mm Tris-HCl, pH 8.0, 20 mm EDTA, 2% CTAB) with 0.1% β-mercaptoethanol added immediately prior to use. The mixture was incubated at 65°C for 60 min and centrifuged for 2 min at 13,000 rpm, after which 600 µL of the supernatant was transferred to a 1.5 mL microcentrifuge tube. The supernatant was mixed twice with phenol/chloroform/isoamyl alcohol (25:24:1) and mixed for 10 min, followed by centrifugation for 10 min at 12,000 rpm. DNA was precipitated from the aqueous phase by adding 0.1 mL of 3 m sodium acetate and 2.5 mL of ethanol. The DNA precipitate was washed twice with 70% ethanol and dissolved in water. DNA quantity was assessed using 0.8% agarose gel electrophoresis ( Supplementary Figure S2).
All samples were genotyped using 17 EST-SSR markers previously developed for this species ( Supplementary Table S1) (Li, Liu, Li, Wan, et al., 2018). For all markers, PCR amplification (three replicates of each sample) was performed on an ABI 3730 Thermal Cycler (Thermo Fisher Scientific, Waltham, Massachusetts, USA). PCR amplifications were performed in a final volume of 10 µL, containing 1 µL (10–30 ng) of template DNA, 5 µL of 0.7× Multiplex PCR Master Mix (QIAGEN, Hilden, Germany), 0.5 µL (10 pM) of each primer, and 3 µL of RNase-free water. The PCR program was as follows: 5 min at 94°C; 30 cycles of 30 s at 94°C, annealing at 58°C for 30 s, and 30 s at 72°C; and a final extension step at 72°C for 5 min. The PCR products were pooled and purified using the Wizard SV Gel and PCR Clean-Up System (Promega), following the manufacturer’s instructions. The 5′ end of each forward primer for the 17 markers was tagged with one of three fluorescent dyes (FAM, HEX, or ROX), and multiplex PCR amplifications were performed for the individuals of R. longipedicellatum using the PCR conditions described above. The fluorescence signal detected by capillary electrophoresis was interpreted using the GeneMapper v4.1. All individual-specific bands corresponding to each primer were read out to establish the original data matrix. Data editing and format conversion were carried out in Convert v1.31 and Genalex v6.2.
Data Analysis
The frequency of null alleles was estimated for each locus in each population using INEST 2.2 (Chybicki & Burczyk, 2009). Genepop v4.2 was used to calculate Linkage disequilibrium (LD) for all locus pairs and Hardy-Weinberg equilibrium (HWE) in each population, with deviations from expected proportions evaluated with a Bonferroni corrected p-value threshold of 0.05 (Rice, 1989). The genetic diversity indices were calculated in PopGene v1.31 (Yeh et al., 1999), including the average number of alleles (NA), the number of effective alleles (NE), Shannon information index (I), observed heterozygosity (Ho), expected heterozygosity (He), inbreeding coefficient (FIS).
We adopted the Markov Chain Monte Carlo (MCMC) method and used STRUCTURE v2.3.4 (Pritchard et al., 2000) to evaluate genetic structure and shared ancestry across the five putative population groups. The analysis was performed using the admixture model of ancestry, which assumes independent origins of each allele, with operating parameter settings “Length of Burnin period” and “Number of MCMC Reps after Burnin” in the running length set to 30,000 and 100,000, respectively. The K value range was set at 1–20, and 20 independent runs were conducted for each K value. The most likely number of clusters was evaluated using both ΔK and the log-likelihood value with STRUCTURE HARVESTER v0.6.94 (Earl & VonHoldt, 2012; Evanno et al., 2005). Following this, cluster analysis based on shared allele distance (DAS) was performed using Population v1.2.30 (Langella, 2007) with the neighbor-joining (NJ) method with 1000 bootstraps. The clustering tree was edited using the FigTree v1.4.0. Based on Nei’s genetic distance (Nei, 1973), principal coordinates analysis (PCoA) of populations was conducted with ape package v5.4 (Paradis & Schliep, 2019). The genetic distance and geographic distance matrices between populations were computed using PopGene v1.31 and Geographic Distance Matrix Generator v1.2.2, respectively, and the Mantel test was carried out with PASSaGE v2 (999 permutations were calculated).
Analysis of molecular variance (AMOVA) was carried out to estimate the partitioning of genetic variation between and within the populations using Arlequin v3.5 (Excoffier & Lischer, 2010). The genetic differentiation between populations was evaluated by calculating the genetic differentiation index (FST; Weir & Cockerham, 1984) using FSTAT v.2.9.3 (Goudet, 2001). Gene flow (Nm) was calculated by the formula Nm = ([1/Fst] − 1)/4 (Slatkin, 1985). The Garza-Williamson index (G-W) was calculated using Arlequin v3.5 (Excoffier & Lischer, 2010). The data were assessed for a population genetic bottleneck signature using the Sign test and Wilcoxon test under the Stepwise Mutation Model (SMM) and Two-Phase Model (TPM) in Bottleneck v1.2 (Cornuet & Luikart, 1997; Piry et al., 1999). The variance for TPM was set to 30, and the proportion of SMM in TPM was 70%, with 1000 iterations. The remaining relevant parameters were set as the software default values.
Results
Genetic Diversity
Using INEST, the frequency of null alleles was estimated to be lower than the threshold of 0.05 at each of the 17 markers across populations (Table 1) (Dakin & Avise, 2004; Y. Li et al., 2019). No consistent genotypic disequilibrium was found between any two loci across all populations; therefore, all loci were used for further analyses. Significant deviation from HWE was only detected in one of the five populations (p < 0.05, after Bonferroni correction; Table 1). At the population level, the five R. longipedicellatum populations showed high genetic diversity (Table 1). The average number of alleles per locus was high (9.529) across all populations, while individual population values ranged from 4.647 to 6.177. The average number of alleles was higher in the pooled population data set because unique alleles were present in some of the populations. Shannon’s information index as a measure of gene diversity was similar across populations, ranging from 1.207 to 0.860. In general, the population XCW contained the greatest level of genetic diversity, and WBL the least.
Table 1.
Genetic Diversity Indices for Five Populations of R. longipedicellatum.
Genetic Structure
The results of the AMOVA analysis of the five populations are presented in Table 2. Of the total genetic variation, 91.25% occurred within populations, and only 8.75% occurred between populations. The genetic differentiation value of 0.083 was within the range of moderate differentiation. The STRUCTURE analysis showed that ΔK reached its maximum value (ΔK = 310.98) at K = 2 ( Supplementary Figure S3), suggesting that the 150 individuals from the five populations most likely belong to two principal genetic clusters (Figure 2(A)). Cluster one mainly included individuals from the WBL, WJL, and XCW populations, whereas cluster two mainly included individuals from the XL and ZWL populations. The STRUCTURE results at K = 3 were consistent with those of the NJ tree (Figure 2(B)), and there were mixed individuals among the populations, indicating some gene flow or differentiation.
Table 2.
Analysis of Molecular Variance (AMOVA) based on 17 EST-SSR Markers for Populations of R. longipedicellatum.
Based on the DAS genetic distance, NJ trees of populations and individuals were constructed, as shown in Figure 3. The genetic relationship between the WJL and XCW populations was close, clustering as a clade, and the XL and ZWL populations clustered into another clade. The WBL population formed a separate clade and belonged to a large branch along with the WJL and XCW populations (Figure 3(A)). The NJ tree of the 150 individuals (Figure 3(B)) revealed some cross-mixing among the three groups and among the populations. The PCoA analysis of the five populations is shown in Figure 4. Approximately 66.01% of the variation occurred between principal coordinates one and two, and their contribution rates were 42.47% and 23.54%, respectively. The results were similar to the NJ tree, and the five populations were roughly divided into two groups. The genetic distance and geographical distance among the five populations were positively correlated (Mantel test, r = 0.77, t = 2.95, p < 0.01). This result is consistent with the NJ tree and two-dimensional PCoA map.
Bottleneck Effect Analysis
The results of the bottleneck analysis for the five residual populations based on the SMM and TPM are shown in Table 3. Assuming that all of the EST-SSR loci conformed to the SMM model, the Sign test and Wilcoxon test detection methods showed that all five populations experienced heterozygote deficit or excess and a certain degree of genetic bottleneck. Based on the TPM model, five populations all experienced heterozygote excess, and the ZWL population showed a significant deviation from mutation-drift equilibrium in Wilcoxon test (p < 0.01, after Bonferroni correction). The G-W values of the five populations ranged from 0.22 to 0.26 with an overall value of 0.26 (Table 3), which was lower than the critical value of 0.68 (Garza & Williamson, 2001) and consistent with a population decline in all five populations. At the species level, the results of the different detection methods consistently indicated a genetic bottleneck.
Table 3.
Bottleneck Analysis for Five Populations of R. longipedicellatum.
Discussion
Genetic Diversity
The evolutionary potential of a species and its ability to withstand adverse environments depend on both the genetic diversity of the species and its population genetic structure (Stebbins, 1950). Different levels of genetic variation and genetic structure of populations can be attributed to multiple factors, such as breeding systems, biological characteristics, evolutionary history, and natural selection (Hamrick & Godt, 1996; Oyundelger et al., 2021; Zhou et al., 2020). The results of this EST-SSR study showed that R. longipedicellatum has high genetic diversity at the species level (He = 0.559, NA = 9.529), and the mean genetic diversity of the five populations investigated (He = 0.507, NA = 5.910) was higher than that of Rhododendron oldhamii (He = 0.284, ranged 0.240–0.336) (Hsieh et al., 2013) and the Rhododendron pseudochrysanthum complex (average He = 0.424) (Chen et al., 2014), but lower than the genetic diversity of Rhododendron jinggangshanicum (He = 0.642) (M. Li et al., 2015) and Rhododendron decorum (He = 0.758) (Wang et al., 2013) based on gSSR markers. Although gSSR markers vary more, possibly due to a higher mutation rate, the variation observed in EST-SSR markers may be more functionally significant and represent responses to selection more effectively (Ellis & Burke, 2007; Kane & Rieseberg, 2007).
Despite the restricted distribution and small population sizes of R. longipedicellatum, this species has retained high levels of genetic diversity, possibly due to the following reasons. The extensive temporal and spatial heterogeneity of the limestone habitat in which these plants occur may contribute to their genetic diversity. The limestone habitat includes variation in light, temperature, rock type, degree of differentiation, fissure soil, karst water, humus layer thickness, and the species composition and density of the vegetation. Some R. longipedicellatum plants occurring in the limestone habitat are attached to tree trunks, providing a stable yet polymorphic microenvironment that reduces the impact of natural and human-induced disturbance. Second, outcrossing plants tend to have higher genetic variation than selfing plants (Huang et al., 2021; Nybom, 2004). The cross-pollination reproductive system of R. longipedicellatum reported by Li, Liu, Li, Ma, et al. (2018) may, therefore, also support the relatively high genetic diversity in this species. The high genetic variability of R. longipedicellatum may also be attributed to abundant diversity in the ancestral population (Zhang et al., 2021) that may have existed as a large, continuous, and widespread species, although little is known about the origin and evolutionary history of R. longipedicellatum. The current restricted distribution and small population sizes of the species result from habitat deterioration and loss, as well as direct human interference. R. longipedicellatum is a perennial woody plant whose biological characteristics may also play a role in the maintenance of genetic diversity. In the five populations of R. longipedicellatum studied, few or no seedlings were observed and the plants usually propagated vegetatively, which tends to maintain genetic diversity.
Genetic Structure
According to AMOVA, 91.25% of the total genetic variation occurred within populations, and only 8.75% was inter-population variation, which is consistent with the observation that most of the genetic variation in woody plants occurs within populations (Liu et al., 2012; X. Liu et al., 2020). In this study, the overall FST was 0.083, indicating moderate differentiation among populations (Jakobsson et al., 2013) due to drift or possibly divergent selection among patchily-distributed populations caused by human disturbance or habitat heterogeneity (Liu et al., 2019; Willi et al., 2006). The STRUCTURE analysis divided 150 individuals from the five existing populations into two genetic populations, and the NJ tree and PCoA analyses also divided the populations into two groups. This clustering of populations may be due to local environmental conditions, which often shape patterns of genetic structure, especially in heterogeneous and fragmented habitats (Young et al., 1996; Zhu et al., 2016). Both at the species and the population levels, FIS was less than 0 and the cross-pollination rate was greater than 1, indicating that outbreeding was present in the population, contrary to the high inbreeding coefficient expected in a small population of plants occurring in a unique habitat and subjected to an increasingly shrinking geographical distribution. The obvious homozygote deficiency or heterozygote excess may be associated with population regeneration in some plants (Stoeckel et al., 2010). In addition, the increased heterozygosity may be due to the small number of populations with few individual plants in each population or to the presence of a high allelic richness in the parent plants (Huang et al., 2021; Pudovkin et al., 1996).
The magnitude of gene flow between populations has a significant influence on the genetic structure of populations and is a key factor that predicts the effect of changes in the environment, such as human interference and population isolation (Wu et al., 2017). In this study, the estimated number of migrants per generation among populations of R. longipedicellatum was >1, indicating that gene flow was high enough to overcome differentiation among populations by genetic drift (Wright, 1949). The diffusion and transmission of pollen and seeds are the two major forms of gene exchange between populations. Similar to the majority of Rhododendron species (Ng & Corlett, 2000; Yang et al., 2020), our field surveys found that the seed transmission capability of R. longipedicellatum was very limited, seedlings in the populations were rare, and the pollen dispersal depended mainly on male Bombus braccatus (Li, Liu, Li, Ma, et al., 2018), with a propagation distance of 3–10 km (Ng & Corlett, 2000). All of the study populations of R. longipedicellatum were geographically close, and even the relatively distant populations, WBL and ZWL, were only about 9 km apart. Thus, pollen flow by pollinators is expected.
Population Bottlenecks
Bottleneck v1.2 identified genetic bottlenecks in R. longipedicellatum using different models and detection methods. The results showed that at the species level, under the assumptions of the SMM and TPM models, R. longipedicellatum significantly deviated from mutation-drift equilibrium, and at the population level, the ZWL population also deviated from equilibrium (p < 0.01, after Bonferroni correction). Recent damage caused by human activities, such as grazing and indiscriminate felling, has increased, especially in the ZWL population. The M-ratio test reflected a loss of allelic diversity in all five populations, consistent with population contraction and supporting the Bottleneck results. As bottlenecks tend to cause reduction the quality of the species gene pool and genetic variability of populations and consequently, their fitness. The high genetic diversity observed in R. longipedicellatum in this study may be due to the occurrence of the genetic bottleneck being relatively recent.
Implications for Conservation
These results indicate that genetic diversity is still present, to a certain degree, in R. longipedicellatum. To preserve this species, an effective conservation measure could be to strengthen the habitat protection of R. longipedicellatum, strictly forbid indiscriminate felling, and protect it from man-made interference. All five populations contained a high level of genetic diversity, with the XCW population having the richest genetic diversity and the highest number of effective alleles. The connectivity and mixing between XCW and other populations were higher than those between XCW and other populations, which indicated that XCW is of great significance in maintaining genetic polymorphism. This population is geographically located in the middle of the other populations and thus beneficial for facilitating dispersal and connecting the other populations as a corridor. Therefore, we suggest that additional conservation efforts be focused on the XCW population. In addition, protection efforts can be enhanced by appealing to the related local authorities, setting up conservation warning signs, and publicizing the value of, and conservation strategy for, this endangered species to local residents.
Acknowledgments
We would like to thank all the funding for their support. We would also like to thank Editage ( www.editage.com) for English language editing. Finally, we are particularly grateful to the anonymous referees for their time and their valuable comments.
Declaration of Conflicting Interests The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
FundingThe author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Fundamental Research Funds for the Central Non-profit Research Institution of the Chinese Academy of Forestry (CAFYBB2019ZB007); the Biodiversity Survey and Assessment Project of the Ministry of Ecology and Environment, China (2019HJ2096001006); the Ten Thousand Talent Program of Yunnan Province (YNWR-QNBJ-2019-010; YNWR-QNBJ-2018-174) and the Young and middle-aged academic and technical leaders in Yunnan Province reserve talents (2018HB066).
Supplemental materialSupplemental material for this article is available online.