To investigate geographic population structure and genetic diversity in the soybean pod borer Leguminivora glycinivorella, partial mitochondrial DNA sequences from 337 individuals collected in northeastern China were obtained and analyzed. Sixteen haplotypes were found in CO II and 14 in Cytb; for each gene, one haplotype was shared by all ten populations. L. glycinivorella populations are characterized by medium to low haplotype diversity and nucleotide diversity. Tajima's D and Fu's Fs tests indicated that there might not have been a recent population expansion. All pairwise gene flow (Nm) estimates among the ten populations were greater than one. Analysis of molecular variance (AMOVA) demonstrated that the observed genetic variation occurs primarily within populations rather than among populations, and no large-scale regional differences were detected. Genetic distance is not significantly correlated with geographical distance between populations. Maximum likelihood phylogenetic trees and a haplotype network showed that the haplotypes are distributed in different clades and that no obvious geographical structure has formed. These results suggest that the population structure of L. glycinivorella is not shaped by geographic isolation and that recent dispersal (some gene flow) has prevented significant genetic differentiation among populations.
The soybean pod borer Leguminivora glycinivorella (Mats.) is an important agricultural pest, widely distributed in China, Japan, Korea and the Far East coast. In China, it occurs in the northern and central regions, and the three northeastern provinces in particular suffer serious outbreaks. L. glycinivorella is a relatively monophagous pest whose larvae bore into the grain and cause decay (Wu 2002). Its primary hosts are Glycine max, Glycine ussuriensis and Sophora flavescens. Feeding damage rates in China are commonly 10%–30% and sometimes exceed 50%. Serious impacts on soybean yield and quality have been observed (Wu 2001), and this pest is the primary target of prevention and control in soybean fields.
Genetic markers, in particular mitochondrial DNA (mtDNA) sequences, have proven very informative for assessing genetic diversity and gene flow (Avise et al. 1987, Li et al. 2013, Coates et al. 2004, Meraner et al. 2008). Because of its maternal inheritance, absence of intermolecular genetic recombination, fast evolutionary rate relative to nuclear DNA, availability of efficient PCR primers, and wealth of comparative data (Feng et al. 2012), mtDNA has been used extensively to study population diversity, phylogeography and phylogenetic relationships at various taxonomic levels (Yang et al. 2008). Sequences encoding mitochondrial cytochrome oxidase subunit II (CO II) have been shown to be appropriate for intraspecific analysis because of their high degree of polymorphism (Li 2010, Liang et al. 2011, Li et al. 2013). The Cytb gene has been documented to be sensitive in detecting genetic diversity and population genetic structure and has been used to analyze populations of other lepidopteran species (Gao et al. 2011). This study aimed to determine whether geographic isolation has led to genetic variation in L. glycinivorella, a species known to have weak flight ability. A total of 159 partial mtDNA CO II and 178 partial mtDNA Cytb gene sequences were used to assess geographic population structure, genetic diversity, gene flow, and differentiation within and among ten L. glycinivorella populations in northeastern China. Determining whether genetic distance in this species is correlated with geographical distance among populations requires the type of molecular phylogenetic study developed here. This baseline information is also critical for genetically targeted management of L. glycinivorella in different crop regions.
Materials and Methods
Sample collection. A total of 337 samples of last instar larvae of L. glycinivorella were collected from northeastern China in September 2013. The samples comprised ten populations of L. glycinivorella: Daqing (DQ), Dehui (DH), Gongzhuling (GZL), Harbin (HRB), Heihe (HH), Mudanjiang (MDJ), Jiamusi (JMS), Qiqihar (QQHR), Shenyang (SY), and Suihua (SH). Geographic locations are given in Fig. 1. All individuals were identified based on morphological characteristics, and samples were preserved in anhydrous ethanol and stored at -20 °C until DNA extraction.
DNA extraction, PCR amplification and sequencing. Genomic DNA was extracted according to a standard phenol-chloroform protocol (Sambrook & Russell 2001) and diluted to a final concentration of 100 ng/mL. The CO II gene was PCR amplified using the universal primers CO II F (5′-TAGTGCAATGGATTTAAACC-3′) and CO II R (5′-GTTTAAGAGACCAGTACTTG-3′) (Folmer et al. 1994). The Cytb gene was amplified using the primers CYTB1 (5′-TATGTACTACCATGAGGACAAATATC-3′) and CYTB2 (5′-ATTACACCTCCTAATTTATTAGGAAT-3′) (Folmer et al. 1994). PCR amplification was performed on a Well Thermal Cycler (EDC-810), starting with 4 min of denaturation at 94 °C, followed by 35 cycles of denaturation at 94 °C for 50 s, annealing at 48.3 °C for 50 s, and extension at 72 °C for 50 s, with a final extension at 72 °C for 5 min. The 50 μL reaction mixture contained approximately 3 μL of diluted genomic DNA as template, 5 μL of 10× PCR buffer (100 mmol/L Tris-HCl, pH 8.3, 500 mmol/L KCl, 15 mmol/L MgCl2), 1 μL of each primer, 4 μL of dNTP, and 0.5 μL of DNA polymerase (Trans GenBitech, 5 U/mL), made up to volume with sterilized water. The PCR products were gel-purified using an agarose gel DNA purification kit (Trans GenBitech) following the manufacturer's instructions. The purified fragments were sequenced by the Beijing Huada Gene Research Center. Sequences were determined in both directions (using each PCR primer individually), and the electropherograms were verified by eye.
Data analysis. Nucleotide composition and variable sites were analyzed in MEGA 5.0 (Tamura et al. 2011). Genetic diversity indices of mtDNA, namely nucleotide diversity (π) (Lynch & Crease 1990) and haplotype diversity (Hd) (Nei 1987), were calculated using DnaSP 5.0 (Librado & Rozas 2009). The demographic history of L. glycinivorella was examined with the neutrality statistics Tajima's D and Fu's Fs (Kimura 1983, Tajima 1989), which can indicate whether population expansion has occurred. Geographical distances among collection sites were calculated from their latitudes and longitudes. Genetic relationships among haplotypes were reconstructed in MEGA 5.0 (Tamura et al. 2011) using the maximum likelihood method (Felsenstein 1985) under the Kimura 2-parameter model, and support for the resulting tree was evaluated with a bootstrap analysis of 1000 replicates. To depict phylogenetic and geographical relationships among the haplotype sequences, a haplotype network was created with the median-joining method using Network 4.6 (Bandelt et al. 1999). A hierarchical analysis of molecular variance (AMOVA) was performed to reveal the geographical structure of genetic variation using Arlequin (Excoffier et al. 2005). The significance of the fixation index (Fst) was tested with 1000 permutations of the data set. Values of Nm were derived from the formula Fst = 1/(1 + 2Nm), which is specific to organelle genetic data (Takahata & Palumbi 1985, Goldberg & Ruvolo 1997).
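The text states only that geographical distances were calculated from latitude and longitude and does not name the formula. As an illustration, a great-circle distance can be sketched with the haversine formula; the choice of haversine, the Earth radius constant, and the two coordinate pairs below are assumptions for demonstration and are not the study's sampling coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical placeholder coordinates (NOT the actual collection sites).
site_a = (45.0, 126.0)
site_b = (42.0, 123.0)
distance = haversine_km(*site_a, *site_b)
print(f"distance = {distance:.1f} km, ln(distance) = {math.log(distance):.3f}")
```

The natural logarithm is printed as well because the pairwise distance table in this study reports geographical distance on a log scale.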
Table 1. Genetic diversity, Tajima's D and Fu's Fs test in different geographic populations of Leguminivora glycinivorella based on CO II
Results
Genetic diversity. A 683 bp fragment of the mitochondrial CO II gene and a 415 bp fragment of the mitochondrial Cytb gene were amplified and sequenced from the collected samples. The average base composition of the CO II gene was A = 35.4%, T = 40.5%, C = 13.4%, G = 10.7%, and that of the Cytb gene was A = 32.6%, T = 41.0%, C = 16.0%, G = 10.4%. Both genes are therefore strongly A+T biased (75.9% for CO II and 73.6% for Cytb), which conforms to the composition and structure characteristics of the mtDNA gene sequences of Lepidoptera (Jermiin & Crozier 1994, Frati et al. 1997, Nei & Kumar 2002). Under the Kimura 2-parameter model, the estimated transition/transversion bias (R) for the CO II gene was 2.3, consistent with the principle that transitions outnumber transversions among closely related populations (Simon et al. 1994, Frati et al. 1997). No insertions or deletions were observed in the sequences of the 337 individuals. Sixteen haplotypes were identified among the 159 CO II sequences, and fourteen haplotypes among the 178 Cytb sequences. The sequences were deposited in GenBank (KJ540178–KJ540193 for CO II haplotypes H1–H16, and KM358122–KM358135 for Cytb haplotypes h1–h14). The number of haplotypes ranged from two to five for each sampled population.
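For readers unfamiliar with these summary statistics, the following minimal sketch shows how base composition and haplotype counts can be derived from a set of aligned sequences. It is not the MEGA or DnaSP procedure used in the study; the function names and the short toy sequences are hypothetical.

```python
from collections import Counter


def base_composition(seqs):
    """Percentage of A, T, C and G pooled over all sequences (other symbols ignored)."""
    counts = Counter(base for seq in seqs for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ATCG"}


def collapse_haplotypes(seqs):
    """Group identical sequences; returns a Counter mapping haplotype -> number of individuals."""
    return Counter(seq.upper() for seq in seqs)


# Toy alignment (hypothetical data, far shorter than the 683 bp and 415 bp fragments).
toy = ["AATTATTC", "AATTATTC", "AATTGTTC", "AATTATTC"]
composition = base_composition(toy)
haplotypes = collapse_haplotypes(toy)
print({base: round(pct, 1) for base, pct in composition.items()})
print(f"A+T = {composition['A'] + composition['T']:.1f}%")
print(f"{len(haplotypes)} haplotypes, counts = {sorted(haplotypes.values(), reverse=True)}")
```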
The genetic diversity indices, haplotype diversity (Hd) and nucleotide diversity (π) (Nei & Li 1979, Smith et al. 2006), are presented in Table 1 and Table 2. The mean haplotype diversity and nucleotide diversity of the CO II gene in the ten populations were 0.463 and 0.00170, respectively. For CO II, haplotype diversity ranged from 0.143 (MDJ) to 0.700 (HH), and nucleotide diversity ranged from 0.00042 to 0.00620. For Cytb, haplotype diversity ranged from 0.212 (MDJ) to 0.595 (HH), and nucleotide diversity ranged from 0.00068 to 0.01973.
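The two indices can be illustrated with a short sketch using the standard estimators, Hd = n/(n − 1) × (1 − Σp²) and π as the mean number of pairwise differences per site. This is not the DnaSP implementation. The CO II haplotype counts below are reconstructed (an assumption) from the percentages and counts given in the network analysis of this paper (72.3% and 11.9% of 159 individuals, and so on), and the sequences used for π are toy data.

```python
from itertools import combinations


def haplotype_diversity(counts):
    """Nei's haplotype diversity from a list of haplotype counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two individuals")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site for equal-length aligned sequences."""
    length = len(seqs[0])
    if len(seqs) < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more aligned sequences of equal length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / len(pairs) / length


# Pooled CO II haplotype counts reconstructed from the text (assumed rounding):
# H1 = 115, H7 = 19, H3 = 5, H5 = 4, four haplotypes with 2 each, eight singletons.
co2_counts = [115, 19, 5, 4, 2, 2, 2, 2] + [1] * 8
print(f"n = {sum(co2_counts)}, pooled Hd = {haplotype_diversity(co2_counts):.3f}")

# Toy sequences (hypothetical) for nucleotide diversity.
print(f"pi = {nucleotide_diversity(['AAAA', 'AAAT', 'AATT']):.4f}")
```

Under these reconstructed counts the pooled Hd comes out at about 0.463, in line with the mean CO II haplotype diversity reported above.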
Tests for neutral evolution (Tajima's D and Fu's Fs) were performed to detect signatures of a selective sweep or balancing selection in L. glycinivorella populations (Harpending et al. 1998). Neither Tajima's D nor Fu's Fs was statistically significant in any population (P > 0.05; Table 1 and Table 2), suggesting that L. glycinivorella has not experienced a recent population expansion in the ten populations. In other words, the observed values did not differ significantly from those expected under neutrality.
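As background for the neutrality test, Tajima's D compares the mean number of pairwise differences (k) with the number of segregating sites (S) scaled by a1 = Σ 1/i. The sketch below computes the statistic with the standard Tajima (1989) constants; it does not compute the P value, which requires comparison against a null distribution, and it is not the software routine used in the study. The function name and toy sequences are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for aligned sequences of equal length (n >= 4); NaN if no site varies."""
    n = len(seqs)
    if n < 4:
        raise ValueError("the variance term is zero for n < 4; need at least four sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    # S: number of alignment columns with more than one state.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        return math.nan
    # k: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / math.sqrt(variance)


# Toy data (hypothetical): an excess of singletons gives D < 0,
# two balanced haplotypes give D > 0.
print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
print(tajimas_d(["AAA", "AAA", "TTT", "TTT"]))
```

A strongly negative D is the pattern usually associated with recent expansion or a selective sweep, which is why nonsignificant values are read here as no evidence of recent expansion.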
Phylogenetic and network analyses. Following the principle of selecting closely related species as outgroups, Grapholita molesta (WQ1) (GenBank: KF552028.1) was used as the outgroup for CO II and Zeiraphera diniana (WQ2) (GenBank: FJ647109.1) as the outgroup for Cytb. Maximum likelihood trees were constructed under the Kimura 2-parameter model. All haplotypes were clearly separated from the outgroup, but there was no clear geographical pattern among the haplotypes, and support values for each cluster were low. For CO II, haplotypes from geographically close populations were grouped in the same cluster (such as H5 from QQHR and H3 from DQ), but so were haplotypes from geographically distant populations (such as H3 from GZL and H10 from JMS); similar results were found for Cytb. Overall, the phylogenetic trees did not reflect geographical position, indicating a lack of obvious geographic structure (Fig. 2).
To further depict the phylogenetic and geographical relationships among the identified sequences, haplotype networks were constructed with the median-joining method in Network (Fig. 3). The resulting CO II network exhibited an approximately star-like pattern surrounding haplotypes H1 and H7. Haplotype H1 was the most common (72.3% of individuals) and was shared by all ten populations. H7 was shared by five sampling regions and was present in 11.9% of all individuals. Five individuals carried haplotype H3, four carried H5, and two each carried H2, H4, H8 and H9; each of the remaining haplotypes was found only once and was restricted to a single population, which is indicative of rare haplotypes. Similar results were found for Cytb (not shown).
Population structure analysis. AMOVA based on CO II haplotype frequencies revealed that 96.36% of the genetic variation occurred within populations and 3.64% among populations; a similar result was obtained for Cytb (Table 3). Genetic variation within populations was therefore much greater than that among populations, which suggests that genetic variation in L. glycinivorella in northeastern China resides primarily within populations, whereas differentiation among populations is relatively low.
Gene flow and genetic differentiation analysis. The fixation index (Fst) is a measure of the variance in gene frequencies between populations (Garcia et al. 2003) and bears a direct relationship to Nm, the product of a population's effective size and its female migration rate per generation (Wright 1969, Goldberg & Ruvolo 1997). Nm is thus the average absolute number of females that migrate among populations per generation. The pairwise Fst and Nm values for the ten populations are shown in Table 4 and Table 5. Pairwise Fst for the 45 population pairs ranged from −0.0420 to 0.2225, and none indicated statistically significant genetic differentiation (P > 0.05), suggesting that all of the populations form one genetic group. All pairwise Nm values were greater than one (Table 4 and Table 5). These results suggest that extensive gene flow has occurred among all ten populations of L. glycinivorella.
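The relationship Fst = 1/(1 + 2Nm) given in the Methods can be rearranged to Nm = (1/Fst − 1)/2, so Nm exceeds one whenever Fst is below 1/3. The following minimal sketch applies this rearrangement to the extremes of the reported Fst range. Treating zero or negative Fst estimates as no detectable differentiation (infinite Nm) is an assumption made here, not a procedure stated in the text.

```python
import math


def nm_from_fst(fst):
    """Nm implied by Fst = 1 / (1 + 2 * Nm) for maternally inherited (organelle) markers."""
    if fst > 1:
        raise ValueError("Fst cannot exceed 1")
    if fst <= 0:
        # Assumption: non-positive estimates are read as no differentiation.
        return math.inf
    return (1.0 / fst - 1.0) / 2.0


# Extremes of the pairwise Fst range reported in the text.
for fst in (-0.0420, 0.2225):
    print(f"Fst = {fst:+.4f} -> Nm = {nm_from_fst(fst):.3f}")
```

Because the largest reported pairwise Fst (0.2225) is below 1/3, the corresponding Nm of roughly 1.75 is the smallest possible value in the data set, which agrees with the statement that all pairwise Nm values exceeded one.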
Table 2. Genetic diversity, Tajima's D and Fu's Fs test in different geographic populations of Leguminivora glycinivorella based on Cytb
Table 3. Analysis of molecular variance (AMOVA) for the CO II and Cytb sequences in ten populations of Leguminivora glycinivorella
Table 4. Fst values and gene flow (Nm) among ten populations of Leguminivora glycinivorella based on CO II
Table 5. Fst values and gene flow (Nm) among ten populations of Leguminivora glycinivorella based on Cytb
Table 6. Natural logarithm of geographical distance (km) (above the diagonal) and pairwise genetic distance (below the diagonal) among Leguminivora glycinivorella populations based on CO II and Cytb (in parentheses)
The pairwise CO II and Cytb genetic distances were calculated with the Kimura 2-parameter model (Table 6). Mantel tests showed no significant linear relationship between genetic distance and geographic distance for either the CO II gene (r = 0.0492, P = 0.595) or the Cytb gene (r = −0.2143, P = 0.116) (Fig. 4), suggesting that long-term geographical isolation has not led to genetic variation among the populations of L. glycinivorella.
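A Mantel test of this kind correlates the off-diagonal entries of two distance matrices and assesses significance by permuting the population labels of one matrix. The sketch below is a generic two-sided permutation version written for illustration; the software and number of permutations used for the test in this study are not stated, so the default of 999 permutations, the function names, and the toy matrices are all assumptions.

```python
import math
import random


def upper_triangle(matrix):
    """Entries above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, two-sided permutation P) for two symmetric distance matrices."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo_vec)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of the genetic matrix together.
        shuffled = [[genetic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(upper_triangle(shuffled), geo_vec)) >= abs(r_obs):
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example (hypothetical): five sites on a line, genetic distance proportional
# to geographic distance, so r is close to 1.
positions = [0.0, 1.0, 3.0, 6.0, 10.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.001 * d for d in row] for row in geo]
r, p = mantel(gen, geo)
print(f"r = {r:.4f}, P = {p:.3f}")
```

In the study, the geographic matrix would hold the log-transformed distances and the genetic matrix the Kimura 2-parameter distances from Table 6.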
Discussion
Although external morphological characteristics are highly similar among L. glycinivorella populations, geographic isolation can result in genetic variation. Rapid and accurate assessment of whether L. glycinivorella populations have diverged genetically is therefore important for scientific research and agricultural pest control. DNA barcoding offers a standardized system for determining whether specimens belong to the same or closely related species based on the analysis of small fragments of DNA (Kruse & Sperling 2002, Landry et al. 1999). This study analyzed genetic variation and gene flow (Nm) in ten L. glycinivorella populations based on mtDNA CO II and Cytb, which have a fast evolutionary rate (Meraner et al. 2008, Coates et al. 2004). Genetic variation in populations is created by evolutionary and demographic processes that generate either heterogeneity or homogeneity among populations, and this variation determines the evolutionary potential of a species. Population genetics provides models and tools for interpreting the processes that shape population structure. Sequences encoding mtDNA CO II and Cytb have been shown to be appropriate for intraspecific analysis because of their high degree of polymorphism (Pfunder et al. 2004).
The median-joining network for CO II showed that H1 was the most common haplotype (72.3%) and was shared by all ten populations (with a similar result for Cytb), suggesting that this haplotype may be adaptive or that its predominance reflects neutral processes such as genetic bottleneck effects. However, the rare haplotypes found in each population also indicate that some genetic differentiation exists among populations.
The mean haplotype diversity (Hd) of the L. glycinivorella populations was medium to low. The ability of L. glycinivorella to adapt to external environmental conditions appears relatively weak, which may be related to its biological characteristics. Limited habitat specialization and abundant host plants may contribute to the relatively low genetic diversity within L. glycinivorella populations. In addition, chemical control, which is widely used against L. glycinivorella, may reduce population size such that one common genetic structure persists. Haplotype diversity (Hd) and nucleotide diversity (π) differed among populations, which may be due to the wide distribution of L. glycinivorella in northeastern China. Genetic structure varies among geographic populations that experience long-term natural selection, and the relationship between the genetic structure of populations and the degree of gene flow has proven to be an important evolutionary variable (Zu 1999).
In this study, genetic diversity was highest in the HH population. Studies have shown that L. glycinivorella is a “long-day insect” that enters diapause when the photoperiod is shorter than 15 h per day. The most serious damage by L. glycinivorella occurs from July to August, when day length at high latitudes (HH) is longer than at lower latitudes such as HRB. Outbreaks of L. glycinivorella therefore begin earlier in the HH region than in HRB (Zhang 2013). We thus speculate that L. glycinivorella originated in the north and that founder effects occurred as the species expanded its range southward, which would explain the high genetic diversity in HH. In addition, although the samples were collected randomly, we found no reports on whether different host plant varieties affect the biological characteristics of L. glycinivorella. Studies have shown that thrips of the genus Mesothrips develop different behavioral characteristics and life histories on different Ficus hosts (Tree & Walter 2009).
Genetic variation within and among populations is explained by several processes, such as genetic drift, effective migration, natural selection, range fragmentation, expansion, habitation and low rates of mitochondrial evolution (Slatkin 1985, Avise 2004, Grant et al. 2006). Historical factors play an important role in population phylogeny and evolution. Tajima's D and Fu's Fs were not statistically significant in any population (P > 0.05), indicating that L. glycinivorella has not experienced a recent population expansion in the ten populations and that population size has remained relatively stable.
Lower Fst values indicate a higher level of gene flow (Nm) and lower genetic differentiation among populations (Garcia et al. 2003). Under isolation by distance, Fst values increase with greater geographical separation of populations. In our study, however, genetic differentiation among populations was not strongly associated with their geographic distribution. In addition, the populations from the ten locations showed no apparent divergence and shared a high level of gene flow. Nm was greater than one for all population pairs, suggesting that temporary reproductive and geographic isolation did not act as a barrier to gene flow. This result was inconsistent with our original expectation that genetic distance in this species would be significantly correlated with geographical distance between populations. It also indicates that the level of gene flow among populations is not determined entirely by the migratory flight ability of the insect. Notably, Fu (personal communication) of the Chinese Academy of Agricultural Sciences has caught small numbers of L. glycinivorella over many years in light traps on North Huang City Island, which lies at the junction of the Yellow Sea and the Bohai Sea, dozens of miles from the mainland. We therefore speculate that L. glycinivorella may exchange genes through other routes, such as passive migration on air currents or transport with plant material. Nevertheless, other potential factors should be studied further to understand the population structure and gene flow of L. glycinivorella.
The maximum likelihood clustering results suggested that most of the populations clustered together, with no significant phylogeographic structure. The haplotype network further supported this finding. The mtDNA CO II and Cytb gene sequences presented here reveal low genetic diversity and nonsignificant genetic differentiation in L. glycinivorella in northeastern China. However, we examined only two portions of the genome in this study. The use of multiple genetic marker systems could increase the resolving power of future genetic studies (Gruenthal et al. 2007). Further studies using nuclear markers and more samples are needed to extend and corroborate the present findings. Such studies would improve our understanding of population evolution and gene flow in L. glycinivorella and thus support better prevention and control.
Acknowledgments
We thank Keith S. Summerville and two anonymous reviewers for helpful reviews and recommendations for improvement, and Dr. Wenpeng Sun for acquisition of specimens and helpful insights. We also thank the Heilongjiang Academy of Agricultural Sciences and many others for allowing us to examine specimens during visits. This work was supported by the Fund of Common Wealth Industry (agriculture) Special Research (201103002) and the earmarked fund for Modern Agro-industry Technology Research System (CARS-04).