To date, many different strategies for obtaining microsatellite DNA loci or simple sequence repeat (SSR) libraries have been described. The basic approach typically involves digestion, hybridization, cloning, and sequencing, with several variations (Zane et al., 2002; Kalia et al., 2011). The most recent trend in SSR library development is the use of next-generation sequencing (NGS) or pyrosequencing (454 Life Sciences, Roche Diagnostics, Indianapolis, Indiana, USA) technologies (Zalapa et al., 2012). Although pyrosequencing offers the potential to rapidly sequence whole genomes, it is costly (e.g., price >$5000 in Zalapa et al. [2012]). A real reduction in cost is possible with the new Illumina technology (Synthesis Bridge PCR; e.g., Illumina, San Diego, California, USA) (Zalapa et al., 2012), but the best improvements in cost will be possible with the advent of new pH-change sequencing, such as Ion Proton technology (Life Technologies, Paisley, Renfrewshire, United Kingdom). A recent review (Glenn, 2011) summarizes the major characteristics of each commercially available platform to enable direct comparisons. Although these new technologies offer greater cost savings in the long run because they yield many more SSR loci, SSR libraries with a moderate number of loci, which can be produced using standard procedures, can also be useful. However, the majority of these standard protocols are not cost competitive, and their low efficiency or lack of optimization can restrict their efficacy (Squirrell et al., 2003).
In this study, we optimized an inexpensive protocol that greatly reduces the cost and number of steps while using first-generation sequencing (FGS; Sanger sequencing) technology. To establish this optimized protocol, we drew from classic enrichment protocols (e.g., Kandpal et al., 1994; Edwards et al., 1996; Hamilton et al., 1999; Glenn and Schable, 2005; Techen et al., 2010) as well as the new generation of hybrid enrichment protocols, such as Fast Isolation by AFLP of Sequences Containing repeats (FIASCO) (Zane et al., 2002). We have called this protocol “SSR-patchwork” because it combines the best parts of previous SSR protocols with several improvements that are fundamental to the final yield of the SSR library (see Methods and Results and Appendix S1 (APPS_12-00158_AppendixS1_20Dec2012.pdf)). Unlike other published SSR protocols, this protocol does not require further optimization by the reader, saving considerable time and money.
METHODS AND RESULTS
Three different systems were chosen: two with very different genome sizes and a third with an undetermined genome size. Of the three systems chosen, two are angiosperms, and the third is a red alga. Kochia saxicola Guss. (Amaranthaceae) has an undetermined genome size, Pancratium maritimum L. (Amaryllidaceae) has a genome of approximately 30 000 Mbp (Zonneveld et al., 2005), and Galdieria sulphuraria (Galdieri) Merola (Cyanidiaceae) has a genome of approximately 10 Mbp (Muravenko et al., 2001) (Appendix 1).
Details of the different steps of the protocol are accessible in Appendix S1 (APPS_12-00158_AppendixS1_20Dec2012.pdf). We briefly explain the most important steps here; a relative timeline of the procedure is shown in Fig. 1.
Step 1: DNA extraction and quantification —A total of 2 µg nondegraded DNA from a fresh sample was used for each species (Kochia saxicola, Galdieria sulphuraria, and Pancratium maritimum). The Doyle and Doyle (1990) method was used to produce high-quality genomic DNA, and an RNase step is recommended to improve the cleanliness of the sample. It is imperative to check the concentration and especially the quality of the obtained DNA before proceeding to the other steps. An agarose gel evaluation is sufficient to check the DNA quality (i.e., nondegraded) and concentration using a suitable marker (e.g., Marker II; AppliChem GmbH, Darmstadt, Germany).
Step 2: Restriction enzyme digestion —The EcoRI and MseI enzymes (Invitrogen, Life Technologies, Paisley, Renfrewshire, United Kingdom) were used, as in the amplified fragment length polymorphism (AFLP) procedure (Vos et al., 1995). A successful reaction should yield a smear of fragments ranging from 200 to 1000 bp (Fig. 2).
Step 3: Size selection, gel extraction, and purification —After precipitation of the digested samples, the DNA was loaded onto a 1% agarose gel to separate the DNA fragments. The DNA from 250 to 500 bp was then isolated. Although several kits and protocols for gel purification are available, we propose a simple and economical method that does not require a low melting agarose gel to perform this step (see Appendix S1 (APPS_12-00158_AppendixS1_20Dec2012.pdf)).
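The digestion and size-selection logic of steps 2 and 3 can be illustrated in silico. The following Python sketch is not part of the published protocol; it only assumes the standard recognition sites of the two enzymes (EcoRI, G^AATTC; MseI, T^TAA) and a linear input sequence. The function names (cut_positions, digest, size_select) are hypothetical, and a real gel excision is of course far less sharp than a length filter.

```python
import re

# Recognition sites and top-strand cut offsets: EcoRI G^AATTC, MseI T^TAA.
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}


def cut_positions(seq, enzymes=ENZYMES):
    """Return the sorted top-strand cut coordinates for all enzymes."""
    seq = seq.upper()
    cuts = set()
    for site, offset in enzymes.values():
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(seq, enzymes=ENZYMES):
    """Split a linear sequence at every cut position (complete digestion)."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_bp=250, max_bp=500):
    """Keep fragments in the window excised from the gel (250 to 500 bp here)."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]


if __name__ == "__main__":
    # Toy sequence for illustration only, not real genomic data.
    toy = "A" * 300 + "GAATTC" + "C" * 400 + "TTAA" + "G" * 100
    fragments = digest(toy)
    print([len(f) for f in fragments])
    print([len(f) for f in size_select(fragments)])
```

Run on a reference sequence from a related taxon, a sketch of this kind gives only a rough idea of how much of the digest would fall inside the 250 to 500 bp window; it assumes complete digestion and ignores methylation and star activity.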
Step 4: Adapter preparation and ligation —The EcoRI-adapter and MseI-adapter (Macrogen, Seoul, Korea) were prepared using a touchdown/hold PCR. A ligation was then performed according to the standard protocol recommended by the manufacturer (Invitrogen, Life Technologies).
Step 5: First enrichment —The ligation reaction product was amplified using modified AFLP adapter-specific primers (Macrogen) without a selective terminal base (Pre_EcoI-0 and Pre_MseI-0) (Vos et al., 1995). It is important to note that T4 DNA ligase (Invitrogen, Life Technologies) ligates only one strand of the adapter to the digested DNA fragment, while the other strand is held to the first adapter strand by base pairing. Thus, the first step of the PCR is a hold at 72°C, which allows the Taq DNA polymerase to fill in the strand that was not ligated. Several tests can be performed to optimize the ligation pattern (Fig. 3). The best amplification in our study was achieved at 26 cycles using 2.5 or 5 µL of the ligation template.
Step 6: Preparation of the biotinylated oligo-repeat and hybridization — Several microsatellite motifs can be used (e.g., TG12, GA12, GAG10, CAA10, or AAGT8). Here, for example, the GA12 motif repeat was employed. For the hybridization reaction, 500 ng of oligo-repeat biotinylated probe was used (Macrogen) for 250 ng of enrichment. The reaction was performed entirely by PCR. It is possible to perform this step with a mixture of microsatellite motifs that have the same melting temperature to increase the variety of the hybridization products.
Based on the biotin binding results, it is preferable for the biotin to be in the 3′ position of the oligo-repeat. The 3′ position is preferable to the 5′ position because the biotinylating reaction occurs more efficiently at this position (i.e., more DNA molecules incorporate the biotin when 3′ biotin is used). The labeled oligos can be purified using either the standard desalting method or an additional purification step (e.g., via high-performance liquid chromatography [HPLC]). If the standard desalting method is used, the solution will contain an excess of free biotin molecules that must be taken into account in the next step (Avidin calculation).
It is very important to perform the hybridization reaction with a PCR program designed to have an initial denaturing step and a gradual touchdown to near the probe's melting temperature (Tm), i.e., Tm − 2°C. It is ideal to calculate the real Tm, taking into consideration the salt (Na+) concentration present in the hybridization buffer. Several Tm calculators are freely available online (e.g., http://www.basic.northwestern.edu/biotools/OligoCalc.html).
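As an illustration of a salt-adjusted Tm estimate, the Python sketch below uses one commonly cited empirical approximation, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) − 675/N. This is an assumption made for illustration, not necessarily the formula used by the authors or by the calculator cited above, and the constants vary between sources. The 0.05 M Na+ default and the function names are likewise hypothetical; the actual Na+ concentration of the hybridization buffer should be substituted.

```python
import math


def salt_adjusted_tm(oligo, na_molar=0.05):
    """Estimate Tm (deg C) with a common empirical salt-adjusted formula.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N
    The formula and the default Na+ concentration are illustrative assumptions.
    """
    oligo = oligo.upper()
    n = len(oligo)
    if n == 0 or na_molar <= 0:
        raise ValueError("need a non-empty oligo and a positive Na+ concentration")
    gc_percent = 100.0 * (oligo.count("G") + oligo.count("C")) / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / n


def hybridization_target(oligo, na_molar=0.05, offset_c=2.0):
    """Final touchdown temperature, taken here as Tm minus 2 deg C."""
    return salt_adjusted_tm(oligo, na_molar) - offset_c


if __name__ == "__main__":
    probe = "GA" * 12  # the GA12 oligo-repeat used as the example probe
    print(round(salt_adjusted_tm(probe), 1))
    print(round(hybridization_target(probe), 1))
```

Because different calculators implement different formulas, the estimate is best treated as a starting point for the touchdown program rather than an exact value.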
Step 7: Preparation and Vectrex Avidin D capture —Vectrex Avidin D (Vector Laboratories, Burlingame, California, USA) was employed to capture the hybridization mixture. For a biotinylated oligo-repeat purified by desalting, ∼40 µL of Vectrex Avidin D is required, i.e., about twice the quantity recommended by the manufacturer (binding capacity = 25 ng biotin/µL Vectrex Avidin D).
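The Avidin calculation can be sketched as simple arithmetic. The Python example below reflects one reading of the figures given in the text, namely that the manufacturer's quantity corresponds to the 500 ng of biotinylated probe (step 6) divided by the stated binding capacity of 25 ng/µL, and that this volume is doubled for desalted oligos to allow for free biotin. That interpretation, the function name, and the parameter names are assumptions; the detailed protocol in Appendix S1 takes precedence.

```python
def avidin_volume_ul(probe_ng, capacity_ng_per_ul=25.0, excess_factor=2.0):
    """Volume of Vectrex Avidin D (in microliters) for a given amount of probe.

    probe_ng: biotinylated oligo-repeat used in the hybridization (ng).
    capacity_ng_per_ul: binding capacity quoted in the text (25 ng per microliter).
    excess_factor: 2.0 for desalted oligos, which carry free biotin; a value of
        1.0 for HPLC-purified oligos is an assumption, not stated in the text.
    """
    if probe_ng < 0 or capacity_ng_per_ul <= 0 or excess_factor <= 0:
        raise ValueError("amounts must be non-negative and factors positive")
    return excess_factor * probe_ng / capacity_ng_per_ul


if __name__ == "__main__":
    # 500 ng of desalted probe: 500 / 25 = 20 microliters, doubled to 40.
    print(avidin_volume_ul(500))
```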
Step 8: Second enrichment and cloning —Triplicate PCR reactions were performed to amplify the selected DNA fragments to a final concentration of approximately 10 ng/µL or more. After DNA purification and concentration, the amplified template was cloned according to the manufacturer's protocol. Several cloning kits are available. We tested both the Clone Jet PCR cloning kit (Fermentas, Thermo Fisher Scientific, Waltham, Massachusetts, USA) and the PMosBlue blunt-ended cloning kit (GE Healthcare Europe GmbH, Vienna, Austria), which provided comparable results; the only difference was that the second kit includes competent cells.
Step 9: Colony screening and sequencing —The amplified colonies were sequenced directly by the modified Sanger method using BigDye Terminator Cycle Sequencing Kit version 3.1 (Applied Biosystems, Life Technologies, Paisley, Renfrewshire, United Kingdom) and a 3130 Genetic Analyzer (Applied Biosystems, Life Technologies). A modified protocol for the optimization of sequencing reagents is reported in Appendix S1 (APPS_12-00158_AppendixS1_20Dec2012.pdf).
Microsatellites were defined as having a minimum of six repeat units for dinucleotides and five repeat units for tri-, tetra-, penta-, and hexanucleotides. The frequency of microsatellites in the sequenced colonies was approximately 20–32% in P. maritimum, 42–55% in K. saxicola, and 58–71% in G. sulphuraria. Approximately 80–84% of the microsatellites obtained are usable for primer design. The types and frequencies of the repeats obtained using the GA12 motif repeat in the hybridization reaction are reported in Table 1. According to these results, the efficiency of this method differed among the species studied.
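The repeat thresholds above (at least six units for dinucleotides and five for tri- to hexanucleotides) can be applied to clone sequences with a short script. The Python sketch below is a minimal illustration under those thresholds, not the software used in this study. The function names are hypothetical; only perfect repeats on the given strand are detected, and motifs that are themselves multiples of a shorter unit (e.g., GAGA) are skipped so that each repeat is reported once under its shortest motif.

```python
import re

# Minimum number of repeat units per motif length, as defined in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_multiple_of_shorter_unit(motif):
    """True if the motif is a shorter unit repeated (e.g., GAGA = GA x 2)."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in a sequence."""
    seq = seq.upper()
    hits = []
    for k, min_n in min_repeats.items():
        # One k-mer followed by at least (min_n - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_multiple_of_shorter_unit(motif):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


def ssr_frequency(clones):
    """Fraction of clone sequences that contain at least one SSR."""
    if not clones:
        return 0.0
    return sum(1 for c in clones if find_ssrs(c)) / len(clones)


if __name__ == "__main__":
    # Toy clone for illustration only; the TG run has five units and is ignored.
    clone = "CCTAGC" + "GA" * 12 + "TTCGCT" + "CAA" * 5 + "GTC" + "TG" * 5 + "ACC"
    print(find_ssrs(clone))
    print(ssr_frequency([clone, "ACGTACGTAC"]))
```

Imperfect or compound repeats, which are common in real clones, would need a more permissive search than this sketch provides.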
DISCUSSION
An essential prerequisite for developing an SSR library protocol is knowledge of the type of genome being studied; there are many differences between animal and plant genomes as well as among plant species. Plants have a lower proportion of SSR sequences than vertebrates and a higher proportion of SSR sequences than both fungi and invertebrates (Toth et al., 2000; Morgante et al., 2002). The variation in plant SSR frequencies is correlated with variation in the amounts of single/low-copy DNA and of repetitive DNA (e.g., retrotransposons), which is widely represented. Unlike animals, plants show extreme variation in genome size, and their genomes are generally larger, mainly because of the large amounts of repetitive DNA (San Miguel et al., 1998; Morgante et al., 2002).
TABLE 1.
Information regarding the microsatellites obtained from the GA-enriched library through the sequencing of 130 colonies in Pancratium maritimum and 60 colonies in Kochia saxicola and Galdieria sulphuraria.
In this study, we describe a simple and optimized protocol to expedite the production of highly enriched SSR libraries using small fragments of genomic DNA from plants with different genome sizes. Data from the current literature enabled the improvement of this procedure (see Introduction). We improved various steps, such as restriction enzyme digestion, the hybridization reaction, and cloning efficiency, and made other smaller modifications with the aim of obtaining a better efficiency:cost ratio. A comparison with two previously published FGS-enrichment protocols is shown in Fig. 4. The two SSR protocols selected for comparison are that of Edwards et al. (1996), which was developed for plants (barley, maize, rhododendron, sunflower, sugar beet, wheat, and willow), and FIASCO (Zane et al., 2002), which has been used for animals (rock sparrow, gilt head bream, American anglerfish, horned krill, and red coral). Both procedures reported high yields (>50%), although we were unable to reproduce them without colony hybridization screening, especially for the Edwards et al. (1996) protocol (<2%). In addition, the selective hybridization step of the Edwards et al. (1996) protocol was very time-consuming (this was also confirmed by Zane et al., 2002).
The most crucial improvements of our protocol compared with Edwards et al. (1996) and Zane et al. (2002) are discussed below. First, the choice of restriction enzymes was the most crucial step influencing the final yield. Very frequently, the enzymes indicated in the protocols can be changed in the event of inefficient digestion. We tested the enzyme used in the Edwards protocol (RsaI, a four-base cutter) using our three genome templates without good results (i.e., inefficient or partial digestion), as also reported by Fischer and Bachmann (1998) and King et al. (2008). To reduce the time required for protocol setup, a good strategy is to use enzymes known to be good cutters on a variety of templates, such as those employed in AFLP, i.e., EcoRI (a six-base cutter) and MseI (a four-base cutter) (Vos et al., 1995). The FIASCO procedure uses only MseI. In our SSR-patchwork protocol, we preferred to use the classic and well-tested pair EcoRI + MseI. Furthermore, the obtained genome fragments (MseI-MseI, EcoRI-MseI, and EcoRI-EcoRI) were predominantly in the appropriate size range for the next steps (i.e., amplification and cloning).
Another key step of the SSR-patchwork protocol was the selection of small digested DNA fragments (250–500 bp) through size selection, gel extraction, and purification (Fig. 2), which is ideal for successful cloning using an inexpensive kit. In contrast, the FIASCO protocol recommends a very expensive cloning kit because it produces larger DNA fragments (200–1000 bp).
Neither Edwards et al. (1996) nor FIASCO reports details about the importance of annealing temperature in the hybridization reaction. In fact, as discussed above, optimization of the selective hybridization step was very time-consuming, according to Edwards et al. (1996). In that protocol, the hybridization temperature was very low (37°C for 24 h), producing a high level of nonspecific signal, while in FIASCO, the DNA is hybridized according to the protocol first published online by Travis Glenn. Unfortunately, that protocol is no longer available online, but the official SSR protocol published by the author (Glenn and Schable, 2005) used a moderate temperature (50°C for 10 min), emphasizing the importance of optimizing the annealing temperature for the probe.
In addition, several new “tricks” are implemented in the SSR-patchwork protocol compared with previous SSR protocols, e.g.: (1) an initial extension step in the first enrichment amplification (step 5) to fill the nicks present in the ligase reaction products (step 4); (2) the use of Vectrex Avidin D (Vector Laboratories) instead of the streptavidin-coated beads used in FIASCO, allowing the use of a normal centrifuge rather than a magnetic field for the capture of the hybridization mixture (step 7); (3) the optimized use of the BigDye Terminator Cycle Sequencing Kit (Applied Biosystems, Life Technologies), which makes the effective cost of Sanger sequencing very low (step 9); and (4) the very detailed, optimized protocol provided (Appendix S1 (APPS_12-00158_AppendixS1_20Dec2012.pdf)), which greatly reduces both the time required for procedure setup and the costs.
Finally, the advantage of the SSR-patchwork protocol over existing techniques in time and cost is illustrated in Fig. 4.
CONCLUSIONS
The SSR-patchwork protocol presented here is simple, fast, and inexpensive; does not require complicated experimental steps (Fig. 4); and is effective for both small and large genomes (Table 1). The general nature of this protocol makes it suitable for microsatellite library construction in a large variety of plant or animal taxa with a variety of genome sizes.
LITERATURE CITED
Notes
[1] Special thanks are given to Dr. T. C. Glenn (University of Georgia, Athens, Georgia, USA) for his advice. The authors are also grateful to Dr. C. Ciniglia (Department of Environmental, Biological and Pharmaceutical Sciences and Technology, University of Naples II, Caserta, Italy) for providing the Galdieria strain, Dr. M. Iovinella (Department of Biology, University of Naples Federico II, Naples, Italy) for technical support about the algal data, L. Paino for sequencing services (Department of Biology, University of Naples Federico II, Naples, Italy), and Dr. J. E. Mickle for the critical evaluation of the manuscript (Department of Botany, North Carolina State University, Raleigh, North Carolina, USA). We would also like to acknowledge the anonymous reviewers for their helpful suggestions, which greatly improved the quality of the paper.
Appendices
APPENDIX 1.
Voucher information for taxa in this study. Voucher information and algal strain codes were deposited at the Department of Biology, University of Naples Federico II (Italy). Information presented: taxon, geographical locality with GPS coordinates, voucher information-herbarium, and/or strain accession code—Algal Strains Collection.