Song Sparrows (Melospiza melodia) provide one of North America's best examples of geographic variation in phenotype, with approximately 26 recognized subspecies. However, researchers have found inconsistent signals when comparing subspecies designations with genetic markers. We examined seven microsatellite loci from 576 Song Sparrows of 23 western North American populations representing 13 recognized subspecies. We assessed the level of concordance between microsatellite genotypes and subspecies. We found that in some, but not all, instances neutral genetic structure corresponded to recognized phenotypic structure. In addition, some populations not currently recognized as subspecies were found to be genetically differentiated from all other populations that are considered to be the same subspecies. We suggest that a combination of phenotypic characters, behavioral traits, and multiple loci be used when assessing geographic variation in birds, and that sampling should be conducted in more than one location within broadly distributed subspecies.
INTRODUCTION
The Song Sparrow (Melospiza melodia) is one of the most geographically variable species in North America, with 26 recognized subspecies (Aldrich 1984, Gibson and Kessel 1997, Patten 2001, Arcese et al. 2002). On the Pacific coast, Song Sparrows are found along a 7000 km stretch of the continent, extending from cool, wet Attu Island in far western Alaska to hot, dry California deserts (Fig. 1). Given these climatic differences, it is perhaps not surprising that 60% of recognized Song Sparrow subspecies (Fig. 1) are described from this coastal region (Arcese et al. 2002). Gabrielson and Lincoln (1951:250), remarking on the degree of variation found in this species along the Pacific slope, stated that "it is probably true that if all the resident Song Sparrows between Kodiak Island and the Imperial Valley in California were suddenly destroyed, there are few observers who would believe that there was any close relationship between the large dusky Aleutian birds and the small pale form about the Salton Sea."
TABLE 1.
Locations, subspecies of Melospiza melodia represented, and sample sizes among 23 populations and 13 subspecies for the 576 Song Sparrows sampled between 1995 and 2002 for genetic analysis (see Fig. 1).
Geographic variation in Song Sparrows has intrigued researchers for decades, and with the advent of molecular techniques, several studies have attempted to use genetic markers to examine this variability. Initially, researchers used mitochondrial (mt) DNA markers to find that Song Sparrows represent a single monophyletic lineage that recently colonized most of its current distribution (Hare and Shields 1992, Zink and Dittmann 1993, Zink and Blackwell 1996, Fry and Zink 1998). However, a general lack of concordance between recognized subspecies and the distribution of mtDNA haplotypes suggests that subspecific differentiation in Song Sparrows arose more rapidly than mtDNA variation could track (Zink and Dittmann 1993). Recently, markers with higher mutation rates (i.e., microsatellites) than mtDNA (Goldstein and Schlötterer 1999, Pereira and Baker 2006) have been used to examine geographic variation in Song Sparrows, with some evidence of concordance between phenotype and genetics in central California (Chan and Arcese 2002, 2003), coastal Alaska and northern British Columbia (Pruett and Winker 2005), and the Salton Sea (Patten et al. 2004).
We report the results of analyses of microsatellite data from 576 Song Sparrows in 23 populations, representing 13 recognized subspecies, to examine relationships between genetic and phenotypic structure across the entire Pacific coast of North America (Fig. 1). Throughout this paper we follow the definition by Patten and Unitt (2002) of a subspecies as a collection of populations in a given geographic range that differ in some fixed way (almost always phenotypically) from other populations but that are not reproductively isolated from one another. For any subspecies whose ranges meet, we expect there to be some level of gene flow; otherwise, we would label the populations as biological species (Mayr and Ashlock 1991). Unlike phylogenetic species concept advocates, who do not recognize subspecies (Zink 2004), we do not seek phylogenetic phenomena such as monophyletic lineages, but rather take a multilocus, population genetics approach to examine the relationships among populations that exhibit recognized phenotypic differentiation.
METHODS
SAMPLING AND MOLECULAR LABORATORY METHODS
Song Sparrows (n = 576; Table 1) were collected or captured and released from 23 populations, and DNA was extracted following methods described by Keller et al. (2001), Chan and Arcese (2002, 2003), Patten et al. (2004), and Pruett and Winker (2005). Seven microsatellite loci were amplified for all individuals using fluorescent dye–labeled primers developed for Song Sparrows (Mme1, Mme2, Mme3, Mme7, Mme12; Jeffery et al. 2001) and for two other bird species (Escμ1, Hanotte et al. 1994; GF2.35, Petren 1998), then genotyped as described in Keller et al. (2001), Chan and Arcese (2002, 2003), Patten et al. (2004), and Pruett and Winker (2005). Because two of the loci (Mme3 and Mme7) are sex linked, we treated females as having one missing allele for these loci in analyses. Song Sparrows with known size fragments (representing scored alleles) were included as controls among studies to ensure that data were concordant. This standardization was achieved by amplifying DNA from individuals that had known allele sizes, including these samples with each run, and scoring alleles based on these known-size fragments. Individuals were sampled only during the breeding season, making it unlikely that nonbreeding transients were sampled.
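The treatment of sex-linked loci described above can be illustrated with a minimal sketch. It assumes a hypothetical encoding in which each genotype is a pair of fragment sizes and the absent second allele of a female is stored as None; the function name and the genotypes are invented for illustration and are not data from this study.

```python
from collections import Counter

# Assumed placeholder for the second allele of a female at a Z-linked locus.
MISSING = None

def allele_frequencies(genotypes):
    """Return allele frequencies at one locus, skipping missing alleles."""
    counts = Counter(a for g in genotypes for a in g if a is not MISSING)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical genotypes at a Z-linked locus (fragment sizes in base pairs).
z_locus = [
    (150, 154),      # male: two Z copies
    (150, 150),      # male
    (154, MISSING),  # female: one Z copy, second allele coded as missing
    (150, MISSING),  # female
]

freqs = allele_frequencies(z_locus)
for allele in sorted(freqs):
    print(allele, round(freqs[allele], 3))
```

Coding the second allele as missing, rather than duplicating the observed one, keeps females from being counted as homozygotes and from contributing two allele copies to the frequency estimates.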
STATISTICAL ANALYSIS
We performed tests for Hardy-Weinberg equilibrium for all individuals and linkage disequilibrium between pairs of loci as implemented in ARLEQUIN (Schneider et al. 2000). We used MICROCHECKER (van Oosterhout et al. 2004) to test for the presence of null alleles using the methods of Chakraborty et al. (1992) and Brookfield (1996), and to determine whether there was evidence of stuttering and large-allele dropout. Genetic differences between populations (FST) and between subspecies were also determined using ARLEQUIN. For the two Z-linked loci (Mme3 and Mme7), we calculated FST separately because sex-linked loci may have different effective population sizes than autosomal loci depending on the sex ratio and variance in family size (Wang 1999). These values were weighted based on the proportion of base pairs found on the sex (1.5%, ~1.5 million base pairs) and autosomal chromosomes in relation to the size of the chicken (Gallus gallus) genome (~1 billion base pairs; International Chicken Genome Sequencing Consortium 2004). Values for autosomal and sex-linked loci were combined for an overall pairwise FST value for each pair of locations. These values were very similar to the unweighted FST estimates. We constructed principal coordinates analysis (PCO) graphs using these pairwise FST estimates (GenAlEx6; Peakall and Smouse 2006). Principal coordinates analysis is a general ordination technique that can incorporate any distance metric (Gaugh 1982), including genetic ones (principal components analysis is a special case of PCO, in which distances are Euclidean). Our ordination graphs are meant only to provide a heuristic display of relationships among sampled populations.
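As an illustration of the weighting step, the sketch below combines an autosomal and a Z-linked FST estimate for one pair of locations. The text does not give the exact formula, so a simple linear weighted average is assumed here; the function name and the input values are hypothetical, and only the 1.5% proportion comes from the text.

```python
# Proportion of the genome assigned to the sex chromosome, as stated in the text.
Z_WEIGHT = 0.015

def combined_fst(fst_autosomal, fst_z, z_weight=Z_WEIGHT):
    """Weighted average of autosomal and Z-linked FST (assumed linear weighting)."""
    if not 0.0 <= z_weight <= 1.0:
        raise ValueError("z_weight must lie between 0 and 1")
    return (1.0 - z_weight) * fst_autosomal + z_weight * fst_z

# Hypothetical pairwise estimates for one pair of locations.
overall = combined_fst(fst_autosomal=0.10, fst_z=0.20)
print(round(overall, 4))
```

Under this assumption, a weight as small as 1.5% keeps the combined value close to the autosomal estimate even when the Z-linked estimate is much larger.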
Analysis of molecular variance (AMOVA) was used to assess genetic and subspecific structure using ARLEQUIN (Schneider et al. 2000). We conducted three analyses, with variation partitioned into three components using a hierarchical model (among subspecies, among populations within subspecies, and within populations) for: 1) the whole dataset, 2) the Alaska locations, and 3) all other locations. Significance of variance components was tested using 10 000 permutations of the data. The average fixation index (FST) over all loci is reported for each analysis.
A Bayesian clustering approach using STRUCTURE Version 2 (Pritchard et al. 2000, Falush et al. 2003) was used to examine how well predefined populations corresponded to genetic clusters (K). In this analysis, individual genotypes are assigned to clusters such that Hardy-Weinberg equilibrium and linkage equilibrium are achieved within each cluster. A Markov Chain Monte Carlo approach is used to determine the K that is most likely given the observed genotypes. We ran STRUCTURE twice for each user-defined K (from 1 to 20 clusters) with an initial burn-in of 10⁵, followed by 10⁶ further iterations on the total dataset. No prior information was used on the population of origin of each individual. We used the admixture model, in which individuals may have mixed ancestry, and the correlated allele frequencies model, which takes into account that closely related populations may have correlated allele frequencies. When the K with the maximum likelihood value was found, the proportion of membership of each predefined population (e.g., Attu and Adak Islands) within each genetic cluster was determined. To ensure that this was the correct number of clusters, we used the method of Evanno et al. (2005). We performed 10 runs for each K (8 to 14 clusters) with an initial burn-in of 50 000 and 50 000 subsequent iterations for each analysis. We found the same number of clusters as with the original method.
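The method of Evanno et al. (2005) selects K from the rate of change in log likelihood between successive values of K. A minimal sketch follows, assuming the commonly used form in which ΔK is the absolute second difference of the mean log likelihoods divided by the standard deviation across replicate runs at K. The function name and the log-likelihood values are hypothetical and are not output from this study.

```python
from statistics import mean, stdev

def delta_k(log_likelihoods):
    """Compute Evanno's delta K for every K that has neighbors K-1 and K+1.

    log_likelihoods maps K to a list of ln P(D|K) values from replicate runs.
    """
    result = {}
    for k, runs in log_likelihoods.items():
        if k - 1 not in log_likelihoods or k + 1 not in log_likelihoods:
            continue  # delta K is undefined at the ends of the tested range
        second_diff = abs(
            mean(log_likelihoods[k + 1])
            - 2 * mean(runs)
            + mean(log_likelihoods[k - 1])
        )
        spread = stdev(runs)
        if spread > 0:
            result[k] = second_diff / spread
    return result

# Hypothetical replicate log likelihoods for K = 2..6 (three runs each).
runs_by_k = {
    2: [-5200.0, -5210.0, -5190.0],
    3: [-5100.0, -5105.0, -5095.0],
    4: [-5000.0, -5002.0, -4998.0],
    5: [-5010.0, -5030.0, -4990.0],
    6: [-5020.0, -5050.0, -4990.0],
}

dk = delta_k(runs_by_k)
best_k = max(dk, key=dk.get)
print(best_k, dk[best_k])
```

In this made-up example the log likelihood rises steadily to K = 4 and then plateaus with more variable runs, so ΔK peaks at K = 4. ΔK cannot be evaluated for the smallest or largest K tested, which is why a range bracketing the candidate value is needed.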
RESULTS
Tests for Hardy-Weinberg equilibrium showed that two loci (Mme1 from Attu Island, and Mme2 from Kodiak Island, Alexander Archipelago, Alaksen, and Reifel) were deficient in heterozygotes for certain locations after adjustments for multiple comparisons. However, we found no evidence for the presence of genotyping artifacts such as null alleles, stuttering, or large-allele dropout at any locus. Thus, all loci were used in our analyses. All loci were in linkage equilibrium.
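To make the notion of a heterozygote deficit concrete, the sketch below compares observed heterozygosity with the heterozygosity expected under Hardy-Weinberg proportions at a single autosomal locus. It is only a descriptive comparison under hypothetical genotypes, not the significance test implemented in ARLEQUIN, and the function name is an assumption made for illustration.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for diploid genotypes."""
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    # Expected heterozygosity under Hardy-Weinberg: 1 minus the sum of
    # squared allele frequencies.
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected

# Hypothetical genotypes (fragment sizes in base pairs) at one locus.
genos = [
    (150, 150), (150, 150), (154, 154), (154, 154),
    (150, 154), (150, 150), (154, 154), (150, 150),
]

ho, he = heterozygosity(genos)
print(round(ho, 3), round(he, 3))
```

An observed value well below the expected value, as in this invented sample, is the pattern referred to as a deficiency of heterozygotes; null alleles are one possible cause, which is why loci showing a deficit were also checked for genotyping artifacts.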
Principal coordinates analysis plots of population relationships indicated that many populations of Song Sparrows are genetically differentiated from other populations we examined, including many populations in Alaska (Attu Island, Adak Island, Alaska Peninsula, Kodiak Island, and Copper River Delta) and two populations in San Francisco Bay (Palo Alto Baylands and Dumbarton Marsh; Fig. 2A). Populations from other locations clustered together. Principal coordinates analysis plots of subspecies showed a similar pattern, with maxima, sanaka, insignis, caurina, and pusillula being the most differentiated from other subspecies (Fig. 2B).
Differentiation was statistically significant (AMOVA, P < 0.01) among subspecies and among populations within subspecies based on subspecies- and population-level variance in relation to total variance for every dataset (Table 2). Moreover, genetic differentiation among subspecies accounted for more of the total variance than that among populations within subspecies, no matter how the data were partitioned (Table 2). Among-subspecies variation was highest for the Alaska grouping, but non-Alaska locations also showed significant subspecific variation. For Alaska, FST (0.19) was higher than that for populations from non-Alaska locations (0.05). Much of the genetic structure in the entire dataset (FST = 0.12) was attributable to birds from Alaska locations.
TABLE 2.
Results of analyses of molecular variance (AMOVA) for 23 populations of western Song Sparrows sampled between 1995 and 2002. Asterisks indicate P < 0.01.
The most likely number of genetic clusters (K) identified by STRUCTURE analysis was 12. Several patterns were apparent from this analysis: 1) some genetic clusters corresponded to recognized subspecies; 2) some clusters corresponded to geographically neighboring populations that are not the same subspecies; and 3) some clusters showed a high proportion of admixture among multiple populations (Table 3). Clusters corresponding to recognized subspecies included cluster 1 for the subspecies maxima, clusters 3 and 7 for morphna, cluster 11 for fallax, and cluster 12 for pusillula (Tables 1, 3). Clusters that corresponded weakly to subspecies designation included cluster 4 for rufina and cluster 8 for maxillaris. Clusters encompassing geographically neighboring locations included cluster 2 for Alaska Peninsula and Kodiak Island, and cluster 10 for Copper River Delta and Hyder (Fig. 1). None of the remaining clusters had a high proportion of any population assigned to it, providing evidence of admixture among populations. In most of these cases, small proportions (<20%) of each population were assigned to the given cluster. Most of these populations were from the San Francisco Bay region (Table 3, Fig. 1).
TABLE 3.
Proportion of membership of individual Song Sparrows from each predefined population in each genetic cluster from STRUCTURE (Version 2; Pritchard et al. 2000, Falush et al. 2003). Populations that are defined as the same subspecies are in the same box and asterisks indicate the cluster that possesses the highest proportion of genetic membership for each location. For example, the highest proportion of genotypes from Attu and Adak Islands are found in cluster 1. Populations are defined in Table 1.
DISCUSSION
There is significant genetic variation among subspecies in western North American Song Sparrows. Most of this variation is found in Alaska, as would be expected given the geographic isolation of these populations; however, genetic structure is also evident among non-Alaska populations. There are three broad patterns in the relationship between Song Sparrow subspecies and genetics within western North America. These include: 1) subspecies that are concordant with our genetic markers, 2) subspecies that are discordant with our genetic markers, and 3) genetically unique populations that have not been described as separate subspecies.
Two subspecies showed a pattern of genetically distinct groups strongly concordant with subspecies designations: pusillula and maxima. Large proportions of the individuals within the subspecies pusillula (>50%) and maxima (>80%) grouped into single genetic clusters (clusters 12 and 1, respectively), and the populations within each subspecies also grouped together in the PCO plot (albeit weakly between the maxima locations).
There was also some concordance between phenotypic subspecies and genetic variation for fallax, morphna, rufina, and maxillaris. These associations included >50% of the fallax population assigning to a single genetic cluster, three of the four morphna populations grouping together, both rufina populations grouping in the same cluster, and maxillaris from two locations clustering together. However, individuals from several of the sampled locations within these subspecies did not show a large proportion of membership in any given cluster, and PCO plots showed that these subspecies and populations within these subspecies are not clearly genetically differentiated from one another.
Populations corresponding to the subspecies sanaka (Alaska Peninsula) and insignis (Kodiak Island) are geographically next to one another and genetically closely related (e.g., clustering together in the STRUCTURE analysis). This same pattern was found for the caurina (Copper River Delta) and merrilli (Hyder) populations. However, the PCO plot of subspecies showed that these subspecies are somewhat genetically differentiated from one another. Thus, there is also some concordance between subspecific designation and genetic identity for these four subspecies. The remaining three subspecies (gouldii, samuelis, and heermanni), found primarily in the San Francisco Bay area, California, showed little to no genetic structure (see also Chan and Arcese 2002, 2003).
The final pattern, genetically unique populations not described as separate subspecies, was found for Attu and Mandarte Islands. Birds from Attu Island were clearly differentiated from birds from all other locations in the PCO plot, being the most divergent of any population examined. Attu Island birds did cluster with those from Adak Island in the STRUCTURE analysis. Pruett and Winker (2005) examined one additional microsatellite locus for these populations and found that each occurred as a unique genetic cluster, showing that the addition of a single locus to a dataset can sometimes tease apart recently diverged populations. The Mandarte Island population showed a genetic signal that is the mirror opposite of Attu Island Song Sparrows: little differentiation in the PCO plot but a high proportion of membership in a single genetic cluster. Both populations have experienced significant demographic shifts in size, likely due to the initial colonization of Attu Island (Pruett and Winker 2005) and to the very small average population size, low rate of immigration, and history of severe population bottlenecks on Mandarte Island (Keller et al. 2001, Smith et al. 2006). It is likely that these reductions in population size (bottlenecks) caused substantial changes in allele frequencies that led to rapid genetic differentiation. Furthermore, higher inbreeding levels on Mandarte would generate linkage disequilibrium, which would contribute to genetic cluster-based delineation of that population (Falush et al. 2003).
Ten of the 13 subspecies that we examined showed some level of concordance between genotype and phenotype. Although not all groups showed strong signals of genetic divergence, we advocate continued recognition of all Song Sparrow subspecies examined here, including those showing little association between genotype and phenotype. Studies using a single-locus (mtDNA) approach to examine geographic variation in Song Sparrows found little concordance between recognized subspecies and haplotype distribution (Hare and Shields 1992, Zink and Dittmann 1993, Fry and Zink 1998). But examining a larger number of rapidly mutating loci (microsatellites) has provided increased power to detect gene flow and, consequently, higher concordance between phenotype and genetic structure (Chan and Arcese 2002, Patten et al. 2004, this study). We found substantially more variance among subspecies, even within the non-Alaska populations, than has been reported using mtDNA sequence data alone (Fry and Zink 1998). Overall, these findings suggest that phenotypic evolution has occurred relatively recently and that many loci are needed to uncover underlying patterns of genetic divergence among closely related groups, especially those with large effective population sizes. These subspecies differ in multiple mensural and plumage characters, and there appears to be a heritable genetic component to this variation (Smith and Dhondt 1980, Schluter and Smith 1986). It is possible, therefore, that as more loci are examined, the genetic determinants of morphological variation will be better understood and taxonomic units better resolved.
We strongly agree with other researchers that the use of a combination of characters (genetic, morphological, and behavioral) will determine whether groups differ enough to be considered unique units in taxonomy or conservation (Crandall et al. 2000, Bulgin et al. 2003, Remsen 2005, Spaulding et al. 2006, Winker et al. 2007). Song Sparrows on Mandarte Island provide a good example of why a single-criterion approach can be misleading and why such data must be considered in the context of biologically meaningful patterns. If only genetic markers were examined and phenotype ignored, then Mandarte Island Song Sparrows could be described as a separate subspecies or conservation unit based solely on what appear to be nonadaptive differences due to drift in a population with very small effective size (Smith et al. 2006). However, the use of genetic markers can aid in identifying populations where further research on phenotype and behavior is needed to assess taxonomic and conservation status, such as among Song Sparrows on Attu Island. We suggest that multiple locations be examined within the ranges of widespread subspecies (as in Song Sparrows) to enable identification of the full scope of geographic variation within species.
ACKNOWLEDGMENTS
This project was supported by the University of Alaska Museum, Universities of Wisconsin and British Columbia, the American Museum of Natural History, the National Geographic Society, the U.S. Department of Agriculture (USDA-ARS), the U.S. National Science Foundation (DEB-9981915, IBN-9458122), Swiss National Science Foundation, Natural Sciences and Engineering Research Council of Canada, Center for Global Change and Arctic System Research, Hildegard and Werner Hesse, and an anonymous donor. We thank Jinliang Wang for suggesting an appropriate way of combining information from autosomal and sex-linked markers, David Donoso for translating the abstract, and two anonymous reviewers for comments on the manuscript.