Over the past 15 yr, our understanding of the microevolutionary processes that shape variation within bryophyte populations has been revolutionized by the use of DNA sequence variation. Most of these inferences have been drawn from variation in a small number of loci, principally from the chloroplast and nuclear ribosomal regions (Stech and Quandt, 2010). However, these loci may be difficult to align, they may lack sufficient variation to answer many questions, and they may not reflect the full complexity of the organismal history (McDaniel et al., 2010; Vanderpoorten and Shaw, 2010).
To develop new loci for phylogeographic and population genetic inference in Ceratodon purpureus (Hedw.) Brid., we have generated primers for exon-primed intron-spanning loci, based on an alignment of expressed sequence tags (ESTs) from C. purpureus to the Physcomitrella patens (Hedw.) Bruch & Schimp. genome. The common ancestor of P. patens and C. purpureus represents the common ancestor of nearly all of the arthrodontous mosses, comprising ∼95% of moss species (Cox et al., 2010). Thus, although we designed these primers specifically for use in C. purpureus and its relatives, by choosing conserved priming sites we have maximized the chance that these loci will amplify homologous regions in other bryophyte species.
METHODS AND RESULTS
To develop primers for nuclear loci in C. purpureus, we screened the 1677 ESTs available on GenBank at the time. We first clustered the ESTs into 850 unigenes, and aligned them to the P. patens genome using the software BLAT (Kent, 2002; http://genome.ucsc.edu/goldenPath/help/blatSpec.html). This resulted in 450 aligned unigenes, or 1050 aligned ESTs. Using the software Primer3 (Rozen and Skaletsky, 2000), we designed pairs of primers that were homologous to the C. purpureus sequence and that spanned a single intron in the P. patens genome. We designed a set of primers with their 3′ end at least 25 bp from the beginning of the intron. This resulted in primers for 212 nuclear loci. On the intron-spanning unigenes that failed the primer design process, we also designed a set of primers with their 3′ ends at least 5 bp from the beginning of the intron. This resulted in primers for an additional 33 nuclear loci (all primer details are in Table 1). In some cases, the unigene spanned multiple introns, and we designed separate pairs of primers for each intron. Where possible, we also designed alternate primers for each intron in the complete unigene set.
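The two-tier spacing rule described above (a 25-bp buffer between each primer's 3′ end and the intron, relaxed to 5 bp for unigenes that failed the first pass) can be illustrated with a short sketch. This is not the Primer3 configuration used in the study; the `PrimerPair` class, the `design_tier` function, and the transcript-coordinate convention are hypothetical names and assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PrimerPair:
    """Hypothetical container; all coordinates are 0-based on the spliced transcript."""
    fwd_start: int  # first base of the forward primer
    fwd_len: int
    rev_start: int  # leftmost base of the reverse primer's binding site
    rev_len: int


def junction_gaps(pair: PrimerPair, junction: int) -> tuple:
    """Return exonic bases between each primer's 3' end and the exon-exon junction.

    `junction` is the index of the first transcript base downstream of the intron
    position inferred from the genome alignment (an assumed convention).
    """
    upstream_gap = junction - (pair.fwd_start + pair.fwd_len)
    downstream_gap = pair.rev_start - junction
    return upstream_gap, downstream_gap


def design_tier(pair: PrimerPair, junction: int) -> Optional[int]:
    """Return 25 or 5 for the strictest buffer the pair satisfies, else None."""
    smallest_gap = min(junction_gaps(pair, junction))
    for buffer_bp in (25, 5):  # strict tier first, relaxed tier second
        if smallest_gap >= buffer_bp:
            return buffer_bp
    return None


if __name__ == "__main__":
    strict = PrimerPair(fwd_start=0, fwd_len=20, rev_start=70, rev_len=20)
    print(design_tier(strict, junction=45))
```

A pair that returns `None` flanks the junction too closely (or does not span it at all) and would be rejected under either tier.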
To evaluate the full set of 245 loci, we sequenced each of these gene regions in the female laboratory strain GG1 (collected from Gross Gerunds, Austria, by D. J. Cove), the male laboratory strains WT4 (collected in Wispertal, Austria, by E. Hartmann) and R40 (collected by S.F.M. in Rensselaer County, New York, USA), and an isolate from Otavalo, Ecuador (collected by S.F.M.). Live cultures of all of these individuals are available from the authors. DNA was extracted from 7-d-old protonemal tissue grown under standard conditions (Cove et al., 2009) using the Nucleon PhytoPure Genomic DNA Extraction Kit (Amersham Biosciences, Piscataway, New Jersey, USA) following the manufacturer's instructions. PCR was performed using GoTaq Green Master Mix (Promega Corporation, Madison, Wisconsin, USA) in 16-µL reactions. The cycling conditions were 94°C for 120 s, then 10 cycles of 94°C for 15 s, an annealing temperature of 65°C that decreased one degree each cycle, and 72°C for 60 s, followed by 20 cycles of 94°C for 15 s, 56°C for 30 s, and 72°C for 60 s. The PCR products were cleaned using the QIAquick PCR Purification Kit (QIAGEN Sciences, Germantown, Maryland, USA). Sequencing used BigDye Terminator version 3.1 chemistry and was performed on an ABI 3100 capillary sequencer (Applied Biosystems, Carlsbad, California, USA). Forward and reverse sequence fragments were edited and assembled using Sequencher 4.0 (Gene Codes Corporation, Ann Arbor, Michigan, USA), and all polymorphisms were checked against the chromatograms.
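For readers who want to transcribe the touchdown program into a thermocycler, the following sketch expands the cycling conditions into an explicit step list. It assumes that the first touchdown cycle anneals at 65°C and that each later cycle anneals one degree lower; the duration of the touchdown annealing step is not stated in the text and is therefore left as `None`. The function name `touchdown_program` and the tuple layout are hypothetical.

```python
def touchdown_program(start_anneal=65, touchdown_cycles=10, step=1,
                      final_anneal=56, final_cycles=20):
    """Expand the PCR conditions into (step name, temperature in C, seconds) tuples.

    The touchdown annealing duration is not given in the text, so it is None here.
    """
    program = [("initial denaturation", 94, 120)]
    for cycle in range(touchdown_cycles):
        anneal = start_anneal - step * cycle  # assumed: first cycle at 65, then -1 per cycle
        program += [("denature", 94, 15), ("anneal", anneal, None), ("extend", 72, 60)]
    for _ in range(final_cycles):
        program += [("denature", 94, 15), ("anneal", final_anneal, 30), ("extend", 72, 60)]
    return program


if __name__ == "__main__":
    for name, temp_c, seconds in touchdown_program()[:7]:
        print(name, temp_c, seconds)
```

Under these assumptions the last touchdown cycle anneals at 56°C, the same temperature used in the 20 fixed cycles that follow.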
We generated high-quality sequence data for 218 of the 245 loci. We used the software DnaSP (Librado and Rozas, 2009) to estimate the distribution of the per-site genome-wide nucleotide variation (θ, an estimate of Neµ [where Ne is the effective population size and µ is the per-site nucleotide mutation rate]) in C. purpureus (mean: 0.014, median: 0.008, range: 0.0–0.14; Fig. 1, Table 2). Although these data were generated from a modest sample, this stands as the most complete estimate of this fundamental parameter in any bryophyte, and forms a benchmark for further comparisons. It is possible that this estimate of θ is biased upward by cryptic population structure in our sample, or downward by our small sample size. However, many loci showed no variation among intercontinentally disjunct samples, consistent with previous work (McDaniel and Shaw, 2005), suggesting that the more divergent loci reflect locus-specific rather than genome-wide evolutionary processes. For example, loci at the low end of the distribution may be linked to loci that have experienced a selective sweep (McDaniel and Shaw, 2005), while loci at the high end of the distribution may be linked to the sex chromosomes or to loci associated with local adaptation (McDaniel et al., 2007, 2008). This degree of variation illustrates the among-locus heterogeneity in evolutionary history within this species. While sampling more individuals would quantitatively improve this estimate, the concordance between this and previous estimates suggests that the median value is unlikely to change qualitatively without a much larger sample.
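The text does not state which estimator of θ was computed in DnaSP. As an illustration only, the sketch below implements one widely used estimator, Watterson's per-site θ, which divides the number of segregating sites S by a_n · L, where a_n = Σ_{i=1}^{n−1} 1/i for n sequences and L is the number of sites compared. The function name and the choice to skip columns containing gaps or ambiguity codes are assumptions of this example, not details taken from the study.

```python
def watterson_theta(seqs):
    """Per-site Watterson's theta from equal-length aligned sequences.

    Columns containing anything other than A, C, G, or T are skipped
    (an assumption of this sketch).
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0      # number of usable alignment columns (L)
    segregating = 0   # number of columns with more than one base (S)
    for column in zip(*seqs):
        bases = {base.upper() for base in column}
        if not bases <= set("ACGT"):
            continue
        compared += 1
        if len(bases) > 1:
            segregating += 1
    if compared == 0:
        raise ValueError("no usable sites")

    a_n = sum(1.0 / i for i in range(1, n))  # harmonic correction for sample size
    return segregating / (a_n * compared)


if __name__ == "__main__":
    toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTAT"]
    print(watterson_theta(toy_alignment))
```

With four haploid isolates, as sampled here, a_n = 1 + 1/2 + 1/3 = 11/6, so a locus with two segregating sites in 10 compared sites (the toy alignment above) gives θ = 2 / (11/6 × 10) ≈ 0.109.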
We have identified more than 50 loci with θ ≥ 0.02, a value more than twice the species median. This value is also comparable to that of the most variable nuclear loci used for phylogeographic inference in any bryophyte species to date. Using the PCR and sequencing strategy outlined above, we chose 12 loci to sequence in isolates of C. purpureus from the Sierra Nevada, Spain; Casey Station, Antarctica; and Wollongong, Australia, and in 1–2 isolates of the sister groups to C. purpureus, Trichodon cylindricus (Hedw.) Schimp. and Cheilothela chloropus (Brid.) Broth. (Table 2). The PCR products were nearly the same length in all three species, and produced sequences with unambiguous chromatograms. In all cases, the introns were alignable among the three species, but the species differed at ∼10–20% of the intron sites, suggesting that these loci may be useful for phylogeographic and species-level phylogenetic studies. In the complete panel of loci, we also found 23 introns that were present in the P. patens genome but absent from the C. purpureus genome (Table 2). Using a PCR length variation test, we determined that the intron absence was shared by many species in the Dicranidae (McDaniel and Neubig, unpublished data). These presence/absence polymorphisms may be useful phylogenetic markers (Goffinet et al., 2007). We expect that this panel of primers will be valuable for the bryophyte evolutionary genetics community as a whole.
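The interspecific comparison above reports the proportion of intron sites at which species differ. A minimal sketch of that kind of calculation, the uncorrected pairwise distance (p-distance) between two aligned sequences, is given below. The function name `p_distance` and the decision to ignore positions with gaps or ambiguity codes are assumptions of this example; the text does not specify how the 10–20% figure was computed.

```python
def p_distance(seq_a, seq_b):
    """Proportion of compared sites that differ between two aligned sequences.

    Positions where either sequence has a gap or ambiguity code are ignored
    (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

A value between roughly 0.10 and 0.20 for an intron alignment would correspond to the level of divergence described in the text.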
In this study, we have generated primers for more than 200 loci, based on comparisons between ESTs from C. purpureus and the genome of P. patens. We have used these loci to estimate the genome-wide distribution of nucleotide diversity within C. purpureus. Because the primers were designed to be homologous to exonic regions that are conserved between species that diverged long ago, they may amplify the target region in a wide variety of mosses. We anticipate that these loci will form a valuable addition to the bryophyte molecular ecology toolkit, enabling more detailed phylogeographic and population genetic studies of a variety of focal species.
Funding for this work was provided by a National Institutes of Health National Research Service Award to S.F.M. at Washington University in St. Louis, a Pilot Sequencing Grant from the Washington University Genome Sequencing Center to S.F.M. and R.S.Q., and start-up funds from the University of Florida to S.F.M.