The tree genus Sinowilsonia Hemsl. is a member of the Hamamelidaceae family and comprises only one species, S. henryi Hemsl. This species is narrowly distributed in the mountains of central China at an elevation of 600–1400 m (Zhang et al., 2003). Currently, the natural habitats of this species are severely deteriorated and fragmented, with population sizes ranging from as few as five individuals to approximately 50 flowering plants (Zhou et al., 2014). Thus, S. henryi has been listed as an endangered plant species in the China Plant Red Data Book (Fu and Jin, 1992).
Knowledge of genetic diversity and genetic structure of extant populations is essential to the formulation of effective conservation and management strategies for threatened species (Frankham et al., 2002). Due to their codominance, hypervariability, and reliable scorability, microsatellite markers have been widely used in population genetic studies (Selkoe and Toonen, 2006). However, microsatellite markers for S. henryi are currently not available. High-throughput RNA sequencing (RNA-Seq) is one of the most useful next-generation sequencing techniques for identifying microsatellites. In the current study, we developed and characterized 21 expressed sequence tag-simple sequence repeat (EST-SSR) markers for S. henryi using RNA-Seq.
METHODS AND RESULTS
Total RNA was isolated from young leaves using a cetyltrimethylammonium bromide (CTAB) procedure (Chang et al., 1993). The poly(A)+ RNA (mRNA) was purified with the RNA Clean-up Kit (Invitrogen, Carlsbad, California, USA) according to the manufacturer's instructions. The purified RNA was subsequently fragmented into small pieces (200 bp) using fragmentation buffer. The cleaved RNA fragments were then used for first-strand cDNA synthesis using reverse transcriptase (Invitrogen) with random hexamer primers. Second-strand cDNA was synthesized using RNase H and DNA polymerase I (Tiangen, Beijing, China). Illumina paired-end sequencing adapters were then ligated to the ends of the 3′-adenylated cDNA fragments. The cDNA library was sequenced by Shanghai Haiyu Biotechnology Co. Ltd. on the Illumina HiSeq 2000 instrument (Illumina, San Diego, California, USA). Before assembly, raw reads were filtered to remove reads containing adapter sequences, low-quality reads (more than 20% of nucleotides with Q-value ≤ 10), and reads containing poly N (>10% ambiguous base calls). Transcriptome assembly was performed using the Trinity package (version 2013-02-25) with the default parameters (Grabherr et al., 2011).
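The two read-level thresholds described above can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the pipeline used by the sequencing provider: the function name `passes_filter` and its parameter names are hypothetical, quality scores are assumed to be already decoded to integer Phred values, and adapter removal is not shown.

```python
def passes_filter(seq, quals, max_low_q_frac=0.20, low_q=10, max_n_frac=0.10):
    """Return True if a read survives the two thresholds given in the text.

    seq   -- read sequence as a string
    quals -- per-base Phred quality scores (ints), same length as seq
    (Hypothetical helper; names and defaults are illustrative assumptions.)
    """
    if not seq or len(seq) != len(quals):
        return False
    n = len(seq)
    # Fraction of bases with Q-value <= 10; reads above 20% are discarded.
    low_q_frac = sum(q <= low_q for q in quals) / n
    # Fraction of ambiguous base calls; reads above 10% are discarded.
    n_frac = seq.upper().count("N") / n
    return low_q_frac <= max_low_q_frac and n_frac <= max_n_frac
```

For paired-end data, a pair would typically be kept only if both mates pass, although the text does not specify how pairs were handled.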
A total of 28.7 million 300-bp, clean, paired-end reads were obtained. All clean reads are available from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (BioProject accession no. PRJNA394173). De novo assembly of clean reads resulted in 64,694 unique sequences with an average length of 601 bp and an N50 length of 999 bp. The MIcroSAtellite identification tool (MISA; Thiel et al., 2003) was used to screen for the presence of microsatellites. The minimum repeat numbers used to identify microsatellites were seven for di-, five for tri- and tetra-, four for penta-, and three for hexanucleotide repeats. Subsequently, SSR primers were designed with a minimum GC content of 40% and an expected product size ranging from 100 to 280 bp using Primer3 (Rozen and Skaletsky, 1999).
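To make the repeat-number thresholds concrete, the following Python sketch scans a sequence for perfect SSRs using the minimum repeat counts stated above. It is a simplified stand-in written with regular expressions, not MISA itself: the names `MIN_REPEATS` and `find_ssrs` are hypothetical, and compound or interrupted microsatellites are not handled.

```python
import re

# Minimum repeat counts per motif length, as stated in the text.
MIN_REPEATS = {2: 7, 3: 5, 4: 5, 5: 4, 6: 3}

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper for illustration only; this is not the MISA tool.
    """
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AAA" or "AGAG"), so each run is reported once.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)
```

For example, `find_ssrs("TTC" + "AG" * 8 + "CCA")` reports a single (AG)8 repeat starting at position 3, whereas a run of only six AG units falls below the dinucleotide threshold and is not reported.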
A total of 8892 SSRs containing repeats from di- to hexanucleotides were identified from 64,694 unique sequences. Dinucleotides were the most abundant repeat type (5232), followed by trinucleotides (2198), hexanucleotides (1035), pentanucleotides (259), and tetranucleotides (168). The most abundant dinucleotide repeat was (AG/CT)n (3646), followed by (AT/AT)n (1192), (AC/GT)n (384), and (CG/CG)n (11). Among the trinucleotide repeat motifs, the most frequent SSR motif was AAG/CTT (667), followed by AAT/ATT (314), AGC/CTG (301), and ATC/ATG (252) (Table 1). Of the 8892 identified SSRs, 2941 (33%) were suitable for designing locus-specific primers (Appendix S1).
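Motif labels such as AG/CT group together all motifs that describe the same repeat regardless of strand or starting phase (for example, AG, GA, CT, and TC). A minimal Python sketch of this grouping is given below; the function names are hypothetical and the labeling rule (smallest rotation of the motif paired with the smallest rotation of its reverse complement) is an assumption chosen to reproduce the labels used in the text.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(motif):
    """Reverse complement of a DNA motif."""
    return motif.translate(_COMPLEMENT)[::-1]

def rotations(motif):
    """All cyclic rotations of a motif (e.g. AAG -> AAG, AGA, GAA)."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]

def motif_class(motif):
    """Strand- and phase-independent label such as 'AG/CT' (illustrative)."""
    motif = motif.upper()
    fwd = min(rotations(motif))
    rev = min(rotations(reverse_complement(motif)))
    return "/".join(sorted((fwd, rev)))

def tally_classes(motifs):
    """Count detected motifs by their class label."""
    return Counter(motif_class(m) for m in motifs)
```

Under this rule, `motif_class("GA")` and `motif_class("TC")` both give `"AG/CT"`, and `motif_class("GAA")` gives `"AAG/CTT"`.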
Table 1.
Frequency of repeat motifs in nonredundant Sinowilsonia henryi ESTs.
SSR loci with a minimum of 10 repeats for dinucleotides and seven for trinucleotides were selected for amplification. A total of 121 primer pairs were selected and used for further characterization. Eight individuals of S. henryi from Wuhan Botanical Garden, China, were sampled to initially assess microsatellite polymorphism. Genomic DNA was isolated using the CTAB method (Doyle and Doyle, 1987). PCRs were performed in a 10-µL reaction mixture (final volume) containing approximately 50 ng of genomic DNA, 0.2 µM each of forward and reverse primer, 10 mM Tris-HCl (pH 8.4), 50 mM (NH4)2SO4, 1.5 mM MgCl2, 0.2 mM dNTPs, and 1 unit Taq polymerase (Fermentas, Vilnius, Lithuania). The PCR cycling program included 5 min of initial denaturation at 94°C; followed by 35 cycles of 50 s at 94°C, 50 s at 56–60°C depending on the primer pair (Table 2), and 1 min at 72°C; followed by a final 10-min extension step at 72°C. The PCR products were separated on a high-resolution 6% denaturing polyacrylamide gel and visualized by silver staining. A 25-bp marker ladder (Promega Corporation, Madison, Wisconsin, USA) was used to identify the alleles.
Of the 121 primer pairs tested, 21 successfully amplified the target fragments; of these, 13 loci were polymorphic (SH01–SH13), while eight were monomorphic (SH14–SH21; Table 2). The level of genetic variability was estimated by genotyping 72 individuals of S. henryi from three wild populations (Appendix 1). For each locus, the number of alleles (A), observed heterozygosity (Ho), and expected heterozygosity (He) were estimated using the program GENEPOP version 3.4 (Raymond and Rousset, 1995). Null alleles were detected at three loci (SH03, SH04, and SH07) using the program CERVUS 2.0 (Marshall et al., 1998). In the SNJ population, A ranged from one to three, He ranged from 0 to 0.60, and Ho ranged from 0 to 1.00. In the FS population, A ranged from one to three, He ranged from 0 to 0.66, and Ho ranged from 0 to 0.80. In the WD population, A ranged from one to four, He ranged from 0 to 0.63, and Ho ranged from 0 to 0.63. Three loci deviated from Hardy–Weinberg equilibrium after correction for multiple tests (Table 3). The observed departures from Hardy–Weinberg equilibrium may be due to null alleles. Significant linkage disequilibrium was observed in 10 pairs of loci before correction for multiple tests (P < 0.05). However, no loci were observed to be in linkage disequilibrium after correction for multiple tests (P < 0.0006). The sequences containing microsatellites were BLASTed against the NCBI nonredundant protein database using BLASTX with a threshold of E-value < 2.00E-5. Ten loci showed significant similarities to known proteins in the NCBI nonredundant protein database (Table 2).
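The per-locus statistics reported above can be illustrated with a short Python sketch. It is a textbook-style calculation under stated assumptions, not a reimplementation of GENEPOP or CERVUS: the function names are hypothetical, genotypes are assumed to be diploid allele-size pairs, and the estimator actually used by the software may differ (for example, by a small-sample correction, which is offered here only as an option). The text does not name the multiple-test correction that was applied; the last function assumes a Bonferroni correction over all pairs of the 13 polymorphic loci, which gives 0.05/78 ≈ 0.0006 and is consistent with the reported threshold.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Ho: fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(genotypes, unbiased=False):
    """He = 1 - sum(p_i^2), where p_i are the allele frequencies.

    With unbiased=True, the 2N / (2N - 1) small-sample correction is applied.
    """
    counts = Counter(allele for g in genotypes for allele in g)
    n_copies = sum(counts.values())  # 2N gene copies in a diploid sample
    he = 1.0 - sum((c / n_copies) ** 2 for c in counts.values())
    if unbiased:
        he *= n_copies / (n_copies - 1)
    return he

def bonferroni_pairwise_threshold(n_loci, alpha=0.05):
    """Significance threshold for all pairwise tests (assumed Bonferroni)."""
    n_pairs = n_loci * (n_loci - 1) // 2
    return alpha / n_pairs

# Hypothetical genotypes (allele sizes in bp) at one locus.
example = [(150, 150), (150, 156), (156, 156), (150, 156)]
ho = observed_heterozygosity(example)
he = expected_heterozygosity(example)
```

In this hypothetical example, two of the four individuals are heterozygous and both alleles have frequency 0.5, so Ho and He both equal 0.5. A locus with a null allele would tend to show Ho below He, because heterozygotes carrying the null allele are scored as homozygotes.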
CONCLUSIONS
In the current study, a total of 2941 primer pairs were successfully designed based on transcriptome sequences. Of these, 121 primer pairs were tested for amplification and polymorphism, and 13 revealed microsatellite polymorphism. To the best of our knowledge, this is the first study to develop microsatellites for S. henryi. These EST-derived SSRs could provide valuable tools for studying genetic diversity and assessing the mating system of S. henryi. In addition, because EST-derived SSRs may be associated with functional genes, the remaining 2820 untested SSRs and the 21 loci developed in the current study may be useful for examining adaptive variation using genome scan methods.
Table 2.
Characterization of 21 EST-SSR primers developed in Sinowilsonia henryi.
Table 3.
Genetic diversity of 13 SSR loci in three populations of Sinowilsonia henryi.
ACKNOWLEDGMENTS
The authors thank Xiao-Peng Li, Qi-Gang Ye, and Ping Tang for their assistance and advice. This work was supported by the Natural Scientific Foundation of China (grant no. 31400476).