We analyzed the polymerase chain reaction-restriction fragment length polymorphisms (PCR-RFLPs) of the mitochondrial cytochrome b (cyt b) gene in wild populations of medaka from Korea and China. We surveyed 258 wild specimens from 75 different sites, and identified 17 mitotypes. Sequencing analysis of the complete cyt b gene (1141-bp) was subsequently carried out to infer the phylogenetic relationships among these mitotypes. Phylogenetic trees indicated two major clades, D and E, which were different from the Japanese clades (A, B and C). These two clades were completely identical to two clusters previously identified by RFLP analysis of entire mitochondrial DNAs. The geographic distribution of the mitotypes in clades D and E was consistent with the China-West Korean Population and the East Korean Population as defined by allozymic and karyological analyses. This agreement among different analyses suggests long-term isolation between the two groups. In the region where the distributions of two major clades overlapped, a limited extent of gene flow was observed. These results suggested the existence of some reproductive isolation mechanisms between the two clades, or the introgression between them followed by a random drift in each local population. Furthermore, clade D was subdivided into three subclades (D-I to D-III). The phylogenetic relationship and distribution pattern of subclade D-II suggested a dispersal event of medaka from China to southwest Korea. Our results also showed that the East Korean Population has recently expanded its distribution area because little diversity was observed in clade E.
Medaka, Oryzias latipes, is an egg-laying freshwater fish native to Japan, Korea and China. It inhabits marshes, ponds and irrigation canals amid rice fields in flat alluvial lowlands. Geographic variation in the biochemical characters of this species has been demonstrated by allozymes encoded in the nuclear genome, and it has been shown that the wild populations of medaka consist of four genetically different groups: the East Korean Population from eastern and southern Korea; the China-West Korean Population from China and western Korea; the Northern Population from the Sea of Japan coast of eastern Japan; and the Southern Population from the Pacific coast of eastern Japan and from western Japan (Sakaizumi et al., 1983; Sakaizumi, 1986; Sakaizumi and Jeon, 1987). However, it is evident that male and female progeny from hybrids among the four groups are fully fertile (Sakaizumi et al., 1992).
Karyological studies have indicated that specimens belonging to the China-West Korean Population had 2n=46 chromosomes including a large metacentric pair, and that the karyotype of western Korea is closely related to those from eastern and southwestern China. On the other hand, specimens from the remaining three groups (East Korean Population and Northern and Southern Populations) showed 2n=48 chromosomes without such large chromosomes (Uwa and Ojima, 1981; Uwa, 1986; Uwa and Jeon, 1987; Uwa et al., 1988). The geographic distribution of these two chromosomal forms in Korea and China is similar to that of two groups distinguished by allozymes, namely, the 2n=46 form (O. latipes sinensis) in western Korea and China, and the 2n=48 form in eastern and southern Korea (Kim and Lee, 1992; Kim and Moon, 1987; Uwa and Jeon, 1987; Uwa et al., 1988).
Studies of the restriction fragment length polymorphisms (RFLP) of the entire mitochondrial DNA (mtDNA) of medaka have confirmed the existence of four groups on the basis of the results of allozymic analyses (Matsuda et al., 1997a; Matsuda et al., 1997b). With regard to the wild population in Korea and China, RFLP analysis revealed a total of nine haplotypes that formed two distinct clusters (Matsuda et al., 1997a). The geographic distributions of these haplotypes were consistent with the China-West Korean Population and the East Korean Population. Although this analysis showed that distributions of the two groups overlapped in the western region of Korea, the extent of the gene flow remains to be clarified due to the small sample sizes studied (one individual per site). Furthermore, the phylogenetic relationships within and among the four groups remain to be clarified due to the limited number of polymorphic sites and the lack of informative outgroup data.
The mitochondrial cytochrome b (cyt b) gene has been one of the most frequently utilized segments of mtDNA because it is easy to align and has been characterized in many vertebrates, including several fish species (Brito et al., 1997; Hedges et al., 1993; Kocher and Stepien, 1997; Kocher et al., 1989; Meyer et al., 1990; Ortí et al., 1994; Zadoya and Doadrio, 1999). In this study, we surveyed the detailed genetic population structure in wild populations of medaka in Korea and China by PCR-RFLP analysis of this gene in order to clarify the genetic variability within and among populations, especially in western Korea. Furthermore, we determined the complete nucleotide sequences of the cyt b gene in all the mitotypes discriminated by the PCR-RFLP analysis and inferred phylogenetic relationships among the mitotypes. The origins of the two continental groups of medaka and their evolutionary history were estimated based on the phylogenetic trees.
MATERIALS AND METHODS
Between 1986 and 2002, we collected 258 wild specimens of Oryzias latipes from 75 different sites in South Korea and China (Fig. 1). The collection sites are listed in Table 1. The specimens from Beijing and Shanghai in China were from wild stocks housed at the Faculty of Science, University of Tokyo, and the specimens from the wild population of Kunming in China were from the Faculty of Science, Shinshu University. We also examined an inbred strain, HSOK, derived from the East Korean Population (Hyodo-Taguchi, 1996).
Table 1. Site numbers, collection sites, sample sizes (N) and observed mitotypes of Oryzias latipes. Numbers in parentheses indicate the number of individuals representing each mitotype. Asterisks indicate the samples that were sequenced.
In addition, three cyt b sequences from Japanese specimens were included for intraspecific phylogenetic analysis. The three individuals were two inbred strains, HNI (AB084752) derived from the Northern Population and Hd-rR (AB084753) from the Southern Population (Hyodo-Taguchi and Sakaizumi, 1993), and a wild specimen from Mooka (AB084748) in Tochigi Prefecture. Each individual belonged to one of the three Japanese clades (A, B and C), respectively (Takehana et al., 2003). Three species of the genus Oryzias, O. curvinotus (AB084754), O. luzonensis (AB084755) and O. mekongensis (AB084756), were used as outgroups. These have been classified into the bi-armed chromosome group with O. latipes (see Uwa, 1986).
DNA extraction and amplification of the cytochrome b gene
Total DNA was extracted from the caudal fin by proteinase K digestion, phenol:chloroform extraction and isopropanol precipitation (Shinomiya et al., 1999), and the DNA samples were dissolved in TE buffer (1mM EDTA, 10mM Tris pH 8.0).
The segment including the complete cyt b gene was amplified by a PCR with the primer pair CytbFa (5′-AGG ACC TGT GGC TTG AAA AAC CAC-3′) and CytbRVa (5′-TYC GAC YYC CGR WTT ACA AGA CCG-3′) (Takehana et al., 2003). The reaction mixture for amplification by PCR contained 0.2mM of dNTPs, 0.25 μM of each primer, 1 μl of template DNA (below 100ng), and 0.6 units of Ex Taq polymerase and Ex Taq Buffer (TaKaRa) in a total volume of 25 μl. Amplifications were performed with a GeneAmp PCR System 9700 (Applied Biosystems) under the following conditions: denaturation at 94°C for 2min, 30 cycles of amplification (94°C for 1.5min, 55°C for 2min, 72°C for 2min), and a final extension at 72°C for 1min.
Amplified segments were digested with two restriction endonucleases (Hae III and Rsa I), in accordance with the instructions provided by the suppliers. The fragments were separated by electrophoresis in 6% polyacrylamide gels. The RFLP patterns were visualized and photographed under UV light after ethidium bromide staining. Each distinct restriction fragment pattern produced by either of the two endonucleases was assigned a small letter code in alphabetical order (a, b, etc.). Thus, each individual was finally assigned a two-letter composite mitotype. In this study, we have termed the cyt b type detected by RFLP, the mitotype, as a matter of convenience.
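The coding scheme described above can be illustrated with a minimal in-silico sketch. It assumes the standard recognition sites of the two enzymes (Hae III cuts GG^CC, Rsa I cuts GT^AC, both blunt) and assigns letters in order of first appearance, which is an assumption and not necessarily the order used for Tables 2 and 3. The function names and the toy sequences are hypothetical and are not taken from the data.

```python
# Sketch: in-silico digestion of an amplicon and composite mitotype coding.
# Assumptions: linear amplicon, complete digestion, letters assigned in order
# of first appearance (hypothetical; the published order may differ).

ENZYMES = {
    "HaeIII": ("GGCC", 2),  # recognition site, cut offset within the site
    "RsaI": ("GTAC", 2),
}


def fragment_sizes(seq, site, cut_offset):
    """Return fragment lengths (largest first) after cutting at every site."""
    seq = seq.upper()
    cuts = []
    start = 0
    while True:
        i = seq.find(site, start)
        if i == -1:
            break
        cuts.append(i + cut_offset)
        start = i + 1
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


def composite_mitotypes(amplicons):
    """Map each sample to a two-letter code (HaeIII letter + RsaI letter)."""
    codes = {name: {} for name in ENZYMES}  # fragment pattern -> letter
    result = {}
    for sample, seq in amplicons.items():
        letters = []
        for name, (site, offset) in ENZYMES.items():
            pattern = tuple(fragment_sizes(seq, site, offset))
            table = codes[name]
            if pattern not in table:
                table[pattern] = chr(ord("a") + len(table))
            letters.append(table[pattern])
        result[sample] = "".join(letters)
    return result


# Toy 20-bp sequences (hypothetical) differing by single site losses.
toy = {
    "s1": "AAAAGGCCTTTTGTACAAAA",  # both sites present
    "s2": "AAAAGGCCTTTTGTTCAAAA",  # RsaI site lost
    "s3": "AAAAGGACTTTTGTACAAAA",  # HaeIII site lost
}

if __name__ == "__main__":
    print(composite_mitotypes(toy))
```

In practice the patterns were scored from gel images, so fragments too small to resolve on a 6% polyacrylamide gel would not contribute to the observed pattern; the sketch ignores this.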
Sequencing and phylogenetic analysis
We sequenced 27 individuals, including all the mitotypes identified by the PCR-RFLP analysis. The sequenced samples are shown with an asterisk (*) in Table 1.
Nucleotide sequences were determined directly from the PCR products. DNA sequencing was performed using an ABI PRISM™ 310 system with an ABI BigDye Terminator Cycle Sequencing FS Ready Reaction Kit (Applied Biosystems) in accordance with the manufacturer's instructions. The following primers were used for sequencing: CytbFa, CytbFb (5′-CAA ATA TCA TTT TGA GGG GCC ACT GT-3′), CytbFc (5′-CGA CAA AGT ATC CTT CCA CCC TTA CTT-3′), CytbFd (5′-CCC TAT TCT ACA CAC CTC TAA ACA ACG-3′), CytbFe (5′-CTC GTC AGT TGC ACA CAT CTG CCG-3′), CytbRVa, CytbRVb (5′-ACT GAA AAT CCC CCT CAA ATT CAT TG-3′), CytbRVc (5′-CCT CCA AGT TTG TTT GGA ATT GAT CGT AG-3′), and CytbRVd (5′-GCA TGT ATA TTC CGG ATT AGT CAG CCG TA-3′) (Takehana et al., 2003).
DNA sequences were aligned using the multiple-sequence alignment program CLUSTAL X, version 1.81 (Thompson et al., 1994). Phylogenetic relationships were analyzed by the neighbor-joining (NJ), maximum-parsimony (MP), and maximum-likelihood (ML) methods using PAUP* version 4.0b10 (Swofford, 2001). In the NJ analysis, pairwise sequence divergences were calculated under Kimura's two-parameter model (Kimura, 1980). MP analysis was performed with the heuristic algorithm (10 random replications), using equal weighting for all substitutions. ML analysis was based on the two substitution types model (HKY85; Hasegawa et al., 1985) with observed base frequencies, using the heuristic algorithm (10 random replications). The robustness of the internal branches of the NJ, MP and ML trees was assessed by 1000, 1000 and 100 bootstrap replications (Felsenstein, 1985), respectively. The literature on the interpretation of bootstrap proportions (BPs) has not yet reached a consensus (Felsenstein and Kishino, 1993; Hillis and Bull, 1993). We followed the interpretation by Shaffer et al. (1997), which considers BPs >90% to be highly significant, 70–89% as marginally significant, and <70% as constituting limited evidence of monophyly. The sequences reported in this paper have been deposited in the DDBJ/EMBL/GenBank under accession numbers AB084750, AB084751 and AB100928 to AB100952.
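For readers unfamiliar with Kimura's two-parameter model, the pairwise divergence it yields can be sketched as follows. With P and Q the observed proportions of transitional and transversional differences, the distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. This is an illustrative sketch only: the analyses reported here were run in PAUP*, and the function name and the handling of gaps and ambiguous bases (skipped pairwise) are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Positions where either sequence has a non-ACGT character are skipped
    (an assumption of this sketch). Raises ValueError if the sequences are
    too divergent for the logarithm to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # One transition in ten sites: slightly above the raw 10% difference.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Because the correction grows with the observed difference, K2P distances exceed raw p-distances, and the gap widens for the more divergent comparisons.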
We used two restriction endonucleases to analyze the fragment patterns of the segments (1241-bp) including the entire cyt b gene. The fragment patterns detected are shown in Table 2. Thirteen different patterns were found for Hae III (a-m) and five for Rsa I (a-e), and 17 mitotypes were observed from the composite fragment pattern (Table 3). Eleven mitotypes were found at only one site, one at two sites, and one at three sites. Four mitotypes were observed at more than 10 sites. Most sites had a specific fixed mitotype, and a polymorphic mitotype composition was only observed at 10 sites (Table 1).
Table 2. Fragment patterns and molecular sizes (bp) generated by digestion of the amplified segment (1241-bp) comprising the complete cytochrome b gene by two endonucleases.
Table 3. Mitotypes, fragment patterns (HaeIII/RsaI types), number of sites in which the mitotype was found, and sample sizes (N) of Oryzias latipes.
Sequencing and phylogenetic analysis
The 1141-bp region of the complete cyt b gene was successfully sequenced for all individuals. No insertions or deletions were detected, but there were 204 variable sites, in which 176 transitions and 42 transversions were observed (14 sites showed both transition and transversion), resulting in up to 3.9% amino acid differences. Most of the substitutions (173; 84.8%) were in the third codon position, with only 27 (13.2%) in the first and 4 (2.0%) in the second positions.
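The site counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this paragraph; the variable names are illustrative.

```python
# Consistency check of the reported substitution counts (values from the text).
transitions = 176
transversions = 42
sites_with_both = 14  # sites counted in both categories

# Each site showing both substitution types was counted twice above.
variable_sites = transitions + transversions - sites_with_both

by_codon_position = {"first": 27, "second": 4, "third": 173}
assert sum(by_codon_position.values()) == variable_sites

percentages = {
    position: round(100 * count / variable_sites, 1)
    for position, count in by_codon_position.items()
}

if __name__ == "__main__":
    print(variable_sites)  # 204
    print(percentages)     # {'first': 13.2, 'second': 2.0, 'third': 84.8}
```

The totals agree with the 204 variable sites and the percentages given in the text.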
The NJ, MP and ML analyses resulted in similar tree topologies. Therefore, the NJ tree is shown in Fig. 2, as a representative of the three trees. Based on these analyses, all mitotypes were separated into two major clades (D and E) supported by BPs of the highest significance (100% in all analyses), and these two clades were distinguished from the three Japanese clades (A-C). Clade D could be subdivided into three subclades (D-I to D-III). These three subclades were supported by high BPs in the NJ (83–100%), MP (76–100%) and ML (76–100%) analyses (Fig. 2). The branching patterns among the subclades were completely consistent among the three analyses.
Clade D consisted of nine mitotypes (D1-D9), and clade E comprised eight (E1-E8). Although the same mitotype was usually clustered in one subclade, D6 was exceptionally shared between two subclades, D-II and D-III. The only sample from Kunming (site #75) formed subclade D-III.
The tree topologies strongly supported that the ingroup (O. latipes lineage) was a monophyletic group with high BP values in the NJ (92%), MP (92%), and ML (98%) analyses. Monophyly of the Japanese lineage (clades A-C) was also supported by high BPs (99% in NJ, 94% in MP, and 89% in ML). However, monophyly of the clade D-E was not clear due to the low bootstrap support (64% in NJ, 48% in MP, and 70% in ML). In clade D, subclade D-III split first, and then the remainder was divided into subclades D-I and D-II. The relationships were supported by BPs of 80%, 81% and 76% in the NJ, MP and ML analyses, respectively. The average sequence divergences were 15.1% among clades A-C, D and E, and 22.2% between ingroup and outgroup. The respective sequence divergences within clade D and clade E averaged 2.2% and 0.5%. The average sequence divergences among subclades D-I to D-III were 3.0–3.5%.
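The bootstrap values reported above can be set against the thresholds adopted in the Methods (Shaffer et al., 1997). The sketch below is illustrative: the function and dictionary names are hypothetical, and because the stated criteria (>90% and 70–89%) leave a value of exactly 90% unassigned, treating it as marginal is an assumption of this sketch.

```python
def support_level(bp):
    """Classify a bootstrap proportion (%) using the thresholds in the Methods.

    Assumption: a BP of exactly 90 is treated as 'marginal', since the
    stated criteria (>90 and 70-89) do not cover it.
    """
    if bp > 90:
        return "high"
    if bp >= 70:
        return "marginal"
    return "limited"


# Bootstrap proportions as reported in the text for each grouping.
REPORTED_BPS = {
    "O. latipes ingroup": {"NJ": 92, "MP": 92, "ML": 98},
    "Japanese clades A-C": {"NJ": 99, "MP": 94, "ML": 89},
    "clade D + clade E": {"NJ": 64, "MP": 48, "ML": 70},
    "subclades D-I + D-II": {"NJ": 80, "MP": 81, "ML": 76},
}


def classify(table):
    """Return the support level of every grouping under every method."""
    return {
        node: {method: support_level(bp) for method, bp in bps.items()}
        for node, bps in table.items()
    }


if __name__ == "__main__":
    for node, levels in classify(REPORTED_BPS).items():
        print(node, levels)
```

Under these thresholds the grouping of clades D and E receives at most marginal support (ML only), which matches the statement that its monophyly is not clear.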
Geographic distribution of mitotypes
The geographic distribution of the mitotypes in Korea and China is shown in Fig. 3. The distribution patterns of the mitotypes demonstrated strong geographical associations. The mitotypes of clade D were found in western Korea, China and Taiwan. Mitotypes (D1-D5) belonging to subclade D-I were found in the northwestern region of Korea, in the Han River drainage area. Mitotypes (D6-D9) in subclades D-II and D-III were distributed in southwestern Korea, at three sites in China (sites #72, 73, 75) and at Ilan (site #74) in Taiwan. On the other hand, the mitotypes of clade E were mainly observed in the eastern and southern regions of Korea, while particular mitotypes were found in 10 sites along the western coast. Mitotypes (E1-E3) were distributed in the western region, while E4-E8 were distributed in the eastern and southern regions of Korea. Although the distribution of mitotypes from the two groups (clades D and E) overlapped in the western region of Korea, all populations except for Paltan (site #5) were fixed for mitotypes of either clade.
Concordance between cytochrome b and previous data
Allozymic analyses previously demonstrated two genetically distinct groups in Korea and China: the China-West Korean Population and the East Korean Population (Sakaizumi and Jeon, 1987). Karyological and morphological studies also indicated differences between the two groups (Chen et al., 1989; Kim and Lee, 1992; Kim and Moon, 1987; Uwa and Jeon, 1987; Uwa et al., 1988). Furthermore, RFLP analysis of the entire mtDNA in Korean wild populations confirmed the same two groups (Matsuda et al., 1997a).
We revealed that the mitotypes defined by the cyt b gene variations were divided into two groups (clades D and E) in Korea and China with a large genetic divergence. The distribution ranges of the mitotypes in the two clades were consistent with those of the previously described two groups: clade D corresponded with the China-West Korean Population (2n=46) and clade E with the East Korean Population (2n=48). This agreement among different analyses suggests long-term isolation between the two groups.
The samples from Taiwan have not yet been defined by allozymic and karyological analyses. In this study, we first demonstrated that the samples from Taiwan should be included in the China-West Korean Population, because the phylogenetic trees showed that the cyt b sequence of a sample captured from Ilan (site #74) was included in clade D.
We showed that mitotypes from the two groups (clades D and E) overlapped in western Korea. Mitotypes of clade D were found in the coastal and inland populations, whereas E1-E4 were discontinuously observed only in the coastal populations. These distribution patterns corresponded to those observed by RFLP analysis of the entire mtDNA (Matsuda et al., 1997a). Most populations in western Korea were fixed for a mitotype of either clade D or E, and low genetic variability was observed within populations (Fig. 3). This limited gene flow among local populations, along with the large genetic divergence, suggests some reproductive isolation mechanisms between the two groups, or introgression between them followed by random drift in each local population. Although male and female progeny from hybrids between the two groups were fully fertile (Sakaizumi et al., 1992), it was not clear whether hybrid individuals also existed in wild populations. A detailed allozymic analysis of populations from this region will be necessary to demonstrate which of these hypotheses is correct. This project is now in progress.
Although mitotype E4 was mainly found in the eastern and southeastern regions of Korea, it also occurred exceptionally in two populations from the western region: Paltan (site #5) and Bulgap (site #34). Paltan was the only population that contained mitotypes of both clades D and E; four mitotypes (D1, D3, D4 and E4) were observed there. Two mitotypes (E1 and E4) were observed in Bulgap. This disjunct distribution pattern of E4 may have been caused by human action such as the release of fish.
Evolutionary history of medaka in Korea and China
Phylogenetic analyses of the cyt b gene sequences revealed differentiation of two major clades (D and E) within wild populations of medaka from Korea and China, and the distribution of the mitotypes in clades and subclades showed geographic associations. This can be explained by the fact that the dispersal of freshwater fish is restricted by geographic features such as mountains and sea, so that fish populations are confined to their own watershed, resulting in regional differentiation.
Several authors have proposed different molecular clocks for the cyt b gene: about 0.81% per million years (myr) for elasmobranchs (Cantatore et al., 1994), about 0.92%/myr for Sebastes fishes (Rocha-Olivares et al., 1999), and about 2.8%/myr for sticklebacks (Ortí et al., 1994; Rocha-Olivares et al., 1999). According to the faster divergence rate of 2.8%/myr, the divergence time among clades A-C, D and E is estimated at 5.4 million years ago (mya), and that among subclades D-I to D-III at 1.1-1.3 mya. Allozymic analysis indicated that the Nei's D values ranged from 0.71 to 0.88 between the two groups in Korea and China (Sakaizumi and Jeon, 1987). Using the calibration (1D = 5myr) proposed by Nei (1975), the two Korean groups are considered to have separated about 3.6-4.4mya. The divergence time estimated by the faster cyt b divergence rates roughly corresponded to the time estimated by the Nei's D values. Based on these inferences, we suggest the following hypotheses for the evolutionary history of medaka.
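The divergence times quoted above follow from simple division and multiplication, sketched below. The sketch assumes, as the reported figures imply, that the average pairwise sequence divergence is divided directly by the 2.8%/myr rate, and that Nei's D is multiplied by 5 myr per unit; the function and variable names are illustrative.

```python
CYT_B_RATE = 2.8          # % pairwise sequence divergence per million years
NEI_MYR_PER_UNIT_D = 5.0  # calibration 1D = 5 myr (Nei, 1975)


def clock_time_mya(divergence_percent, rate_percent_per_myr=CYT_B_RATE):
    """Divergence time (mya) from pairwise sequence divergence (%)."""
    return divergence_percent / rate_percent_per_myr


def nei_time_mya(nei_d, myr_per_unit=NEI_MYR_PER_UNIT_D):
    """Divergence time (mya) from Nei's genetic distance D."""
    return nei_d * myr_per_unit


estimates = {
    "clades A-C, D and E (15.1%)": clock_time_mya(15.1),
    "subclades D-I to D-III, lower (3.0%)": clock_time_mya(3.0),
    "subclades D-I to D-III, upper (3.5%)": clock_time_mya(3.5),
    "Korean groups, Nei's D = 0.71": nei_time_mya(0.71),
    "Korean groups, Nei's D = 0.88": nei_time_mya(0.88),
}

if __name__ == "__main__":
    for label, mya in estimates.items():
        print(f"{label}: {mya:.2f} mya")
```

The unrounded values are about 5.39 mya for the major clades, 1.07 to 1.25 mya for the subclades, and 3.55 to 4.40 mya from Nei's D, in line with the rounded figures in the text. With the slower rates of 0.81 or 0.92%/myr, the same divergences would give dates roughly three times older.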
Our analysis suggests that the common ancestor of O. latipes was separated into three groups during the late Miocene or early Pliocene, corresponding to the Japanese groups (clades A-C), clade D and clade E. The boundary separating the China-West Korean Population and the East Korean Population corresponds to the backbone mountains in Korea such as the Taebaek Mountains and the Sobaek Mountains, which may have been a barrier between the two groups.
Subsequently, the China-West Korean Population (clade D) divided into three subgroups (subclades D-I to D-III) during the Quaternary (1.1-1.3 mya, based on the above-mentioned molecular clock). The local populations around Kunming (site #75) were first isolated from the ancestral populations. The other populations diverged into two groups, the populations in northwestern Korea and the populations in eastern China.
The distribution of the mitotypes in subclade D-II is divided by the Yellow Sea between southwestern Korea and China. In contrast to the sequence divergences (1.6–2.8%) among the samples from China and Taiwan (sites #72, 73 and 74) in subclade D-II, the sequence divergences among the specimens from Korea were extremely low (0.1–0.2%). The phylogenetic trees showed that individuals from southwestern Korea are genetically closest to the individuals from Beijing (site #72) in subclade D-II, with sequence divergences between them of 0.5–0.7%. These results suggest that the local populations in eastern China expanded their distribution to southwestern Korea. The Yellow Sea is not more than 100 m in depth for the most part, and the sea level dropped by up to 130 m during the late Pleistocene (0.01-0.6 mya) (Kitamura, 2002). Therefore, the regression of the Yellow Sea during that period may have permitted the migration. It is likely that the isolation between subclades D-I and D-II in Korea has been maintained by the Charyong Mountains.
Although mitotypes of clade E had a large distribution area, this clade is characterized by low diversity as evidenced by the short branch on the NJ tree. This low variation in the East Korean Population was also reported by allozymic analysis (Sakaizumi and Jeon, 1987). It is surmised that this decline in genetic variation can be attributed to a bottleneck effect, and that this clade has recently expanded its distribution area.
As mentioned earlier, mitotypes of both clades D and E were observed in the western region of Korea. In this region, mitotypes of clade D were found in the coastal and inland populations, whereas E1-E3 were discontinuously distributed only in the coastal populations. On the other hand, the populations from eastern and southeastern Korea had E4-E8, and those from the southwestern region had E1-E2. It is likely that the local populations carrying mitotype E1 in the southwestern region of Korea expanded their distribution range northward along the west coast, where it came to overlap with the distribution range of clade D.
We are grateful to Dr. S. Hamaguchi (Niigata University) for his valuable advice. The kind help of Dr. H. J. Tsai (National Taiwan University) in the collection of the materials is greatly appreciated.