Wolbachia, rickettsial bacteria found in the cells of many arthropod species, are cytoplasmically inherited and interfere with the sexuality and reproduction of their hosts. Wolbachia strains have been phylogenetically divided into A and B groups based on the nucleotide sequences of their ftsZ genes. To further define the phylogenetic relationships among these endosymbionts, we cloned and sequenced the entire groE operon of a Wolbachia harbored by a cricket. The operon encoded two heat shock proteins, which are the third and fourth Wolbachia proteins characterized to date. In addition, 800 bp stretches of the groE operons of several other Wolbachia were sequenced, and a phylogenetic tree was constructed from the results. The groE tree defined the relationships among A-group Wolbachia strains that had not been resolved by the ftsZ tree, and suggested unexpected horizontal transmission of these bacteria.
INTRODUCTION
Wolbachia, a rickettsia-like microorganism, is present in the cells of many arthropod species, mainly insects (O'Neill et al., 1992; Juchault et al., 1994; Werren et al., 1995). These endosymbiotic bacteria are transmitted maternally through the host egg and are known to alter host reproduction in various ways, including post-zygotic reproductive incompatibility, or cytoplasmic incompatibility (CI), in a wide range of insects (Barr, 1980; O'Neill et al., 1992; Breeuwer et al., 1992), parthenogenesis in wasps (Stouthamer et al., 1993), and feminization of genetic males in an isopod (Rousset et al., 1992). The molecular mechanisms underlying these phenomena are still unknown.
To date, apart from the rRNA genes, only two protein-coding genes from Wolbachia have been sequenced: ftsZ, which is involved in the regulation of bacterial cell division, and dnaA, which is essential for the initiation of DNA replication (Holden et al., 1993; Bourtzis et al., 1994). Phylogenetic trees of Wolbachia strains from various hosts have been constructed based on the sequences of 16S rRNA and parts of the coding region of ftsZ (O'Neill et al., 1992; Werren et al., 1995; Tsagkarakou et al., 1996). The ftsZ tree had finer resolution than the 16S rRNA tree, and divided Wolbachia into two major groups designated A and B. It was suggested that the two groups diverged from each other 58–67 million years ago (Werren et al., 1995). The tree also implied frequent horizontal transmission of Wolbachia between distantly related insect orders. In A-group Wolbachia, horizontal transmission appears to take place so frequently that a sequence less conserved than ftsZ is increasingly needed to understand the detailed phylogeny and infection pathways of this clade.
The groE operon encodes the highly conserved bacterial heat shock proteins GroES and GroEL, also called HSP10 and HSP60, respectively, and contains a noncoding intergenic region between the two coding regions (Hartl, 1996; Segal and Ron, 1996). The intergenic sequence is almost completely neutral to natural selection and is thus believed to evolve faster than the coding sequences. The organization of groE operons is highly conserved, and there is known to be only one copy of the groE operon per genome in the Rickettsiaceae, to which Wolbachia strains belong (Segal and Ron, 1996; O'Neill et al., 1992; Roux and Raoult, 1995). Thus, the intergenic region of the groE operon is suitable material for constructing a phylogenetic tree of closely related Wolbachia strains.
In this study, we sequenced groE-homologous operons of Wolbachia, including that from an infected cricket, Teleogryllus taiwanemma, and compared the sequences among several strains of A-group Wolbachia whose relationships have been ambiguous. As a result, we succeeded in defining the phylogenetic relationships among these Wolbachia strains with higher resolution than previously reported.
MATERIALS AND METHODS
Insect materials
Taiwan crickets, T. taiwanemma, were reared at 24°C under a 16 hr light and 8 hr dark photoperiod. They were fed an artificial diet, CA-1 (CLEA JAPAN), and tap water. To eliminate Wolbachia, crickets were given 0.25% (w/v) tetracycline hydrochloride for at least 3 generations. Moths of the subfamily Phycitinae, Ephestia kuehniella, Ephestia cautella and Plodia interpunctella, were reared on wheat bran containing 10% (w/w) glycerol at 24°C with a photoperiod of 16 hr. An aposymbiotic strain of E. kuehniella was established by rearing the insects on a diet containing 0.04% (w/w) tetracycline hydrochloride for two generations. Drosophila simulans Hawaii (DSH) and Drosophila simulans Riverside (DSR) were provided by Dr. O'Neill. DNA was extracted from dissected ovaries of T. taiwanemma using DNAzol™ reagent (GIBCO BRL). For the other insects, DNA was extracted from whole bodies.
PCR amplification and sequencing
To amplify part of the groE sequence of Wolbachia, degenerate primers were designed on the basis of an alignment of Ehrlichia chaffeensis (GenBank accession no. L10917), Cowdria ruminantium (U13638) and Orientia (Rickettsia) tsutsugamushi (M31887) groEL gene sequences. The degenerate primers were WgLf 5′-TGANGAAGAAATTGCNCAAGT-3′ (E. chaffeensis groEL positions 417–437) and WgLr 5′-CCTTCTTCAACTGCAGCTCTTGN-3′ (1235–1213).
To clone the flanking regions of this fragment, cassette PCRs (LA PCR in vitro Cloning Kit, TAKARA) were performed according to the manufacturer's recommendations. The bands obtained were cloned into pGEM-T (Promega) and sequenced with T7 and SP6 primers using an automated sequencer (SQ-5500, HITACHI). To eliminate PCR errors, at least 3 clones were sequenced, or PCR products were sequenced directly. The primers designed to amplify 800 bp fragments encompassing the intergenic region of the groE operon were groEfl 5′-TGTATTAGATGATAACGTGC-3′ (Wolbachia groE operon positions 21–40) and groErl 5′-CCATTTGCAGAAATTATTGCA-3′ (844–824). A-group specific primers for this region were also designed: groEAf 5′-TGATCAAGCCTATTAGC-3′ (41–57) and groEAr 5′-GAGATTATTGCAACTTGTGCC-3′ (835–815). With these primers, PCR was run for 30 cycles of 94°C for 1 min, 52°C for 1 min and 72°C for 2 min.
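The primer definitions above can be sanity-checked mechanically. The following is a minimal sketch, not part of the original protocol: it enters the six primers as uninterrupted strings, compares each primer's length with the span implied by its stated coordinates, and counts how many distinct oligonucleotides each degenerate primer represents. The names `PRIMERS`, `span_length`, `pool_size` and `check` are hypothetical helpers introduced only for illustration, and treating N as any of the four bases is the usual IUPAC reading, assumed here.

```python
from math import prod

# Number of bases each symbol can stand for (only the symbols used by these primers).
DEGENERACY = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4}

# name: (sequence 5'->3', first stated coordinate, second stated coordinate)
PRIMERS = {
    "WgLf":   ("TGANGAAGAAATTGCNCAAGT", 417, 437),
    "WgLr":   ("CCTTCTTCAACTGCAGCTCTTGN", 1235, 1213),
    "groEfl": ("TGTATTAGATGATAACGTGC", 21, 40),
    "groErl": ("CCATTTGCAGAAATTATTGCA", 844, 824),
    "groEAf": ("TGATCAAGCCTATTAGC", 41, 57),
    "groEAr": ("GAGATTATTGCAACTTGTGCC", 835, 815),
}


def span_length(first, second):
    """Number of positions covered by an inclusive coordinate pair."""
    return abs(first - second) + 1


def pool_size(seq):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    return prod(DEGENERACY[base] for base in seq)


def check(primers):
    """Summarize length, coordinate span, orientation and degeneracy per primer."""
    report = {}
    for name, (seq, first, second) in primers.items():
        report[name] = {
            "length": len(seq),
            "span": span_length(first, second),
            "reverse": first > second,  # descending coordinates mark a reverse primer
            "pool": pool_size(seq),
        }
    return report


if __name__ == "__main__":
    for name, row in check(PRIMERS).items():
        print(name, row)
```

Under these assumptions every primer length agrees with its stated coordinate span, and the two degenerate primers WgLf and WgLr correspond to pools of 16 and 4 sequences, respectively.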
The sequences analyzed in this study have been deposited in GenBank under accession numbers AB002286 for the full-length Wolbachia groE operon from T. taiwanemma, and AB002287-91 for the partial sequences of groE operons from other Wolbachia strains.
Phylogenetic analysis
CLUSTAL W (Thompson et al., 1994) was used to align the sequences, to construct the NJ tree (including gap positions, with correction for multiple substitutions), and to calculate bootstrap probabilities (1000 resamplings). The program MEGA (ver. 1.02, Kumar et al., 1993) was used to estimate the number of nucleotide substitutions per site (dA), the number of nonsynonymous substitutions per nonsynonymous site (dN) and the number of synonymous substitutions per synonymous site (ds), using the Jukes-Cantor correction and excluding insertions and deletions.
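For readers unfamiliar with the Jukes-Cantor correction, the sketch below shows the calculation on a pair of aligned sequences. It is an illustration under stated assumptions, not the MEGA implementation: columns containing a gap are skipped (mirroring the exclusion of insertions and deletions), the distance is d = -(3/4) ln(1 - 4p/3) where p is the observed proportion of differing sites, and the standard error uses the usual large-sample approximation sqrt(p(1 - p)/n) / (1 - 4p/3). The function names and the short test sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Return (proportion of differing sites, number of compared sites).

    Columns where either sequence has a gap are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs), len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected number of substitutions per site."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


def jukes_cantor_se(p, n_sites):
    """Large-sample standard error of the Jukes-Cantor distance."""
    return math.sqrt(p * (1 - p) / n_sites) / (1 - 4 * p / 3)


if __name__ == "__main__":
    # Made-up toy alignment, not data from this study.
    p, n = p_distance("ACGTACGT-A", "ACGAACGTTA")
    print(n, round(p, 4), round(jukes_cantor(p), 4), round(jukes_cantor_se(p, n), 4))
```

The correction matters little for closely related sequences, where d is close to p, and grows as p approaches the saturation value of 0.75.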
RESULTS
groE operon of Wolbachia
We first amplified and cloned part of the groEL-homologous gene of Wolbachia harbored by a cricket, T. taiwanemma. This cricket is relatively large, and the ovary could be isolated easily without contamination by enterobacteria. PCR using degenerate primers reproducibly amplified a 0.8 kb fragment, which was shown to hybridize with genomic DNA from the ovary by Southern blot analysis (data not shown). Next, the complete nucleotide sequence of the groE operon was determined (Fig. 1) by cassette PCRs based on the sequence of the 0.8 kb fragment. The intergenic region between the groES and groEL genes consisted of 90 bp, a length typical of the groE operons of Rickettsiaceae. The operon contained two ORFs that encoded 96 amino acids (GroES) and 552 amino acids (GroEL), with molecular masses of 10471 and 58965 Da, respectively. The operon contained typical ribosome-binding sites (Stormo, 1986) with a GGAA or GGAG polypurine stretch.
A neighbour-joining analysis of the deduced Wolbachia GroEL amino acid sequence (figure not shown) indicated that Wolbachia clearly formed a clade with E. chaffeensis (86 out of 100 bootstrap replications). Congruence of the tree topology with that of the 16S rRNA tree (O'Neill et al., 1992; Roux and Raoult, 1995) confirmed that this gene is from Wolbachia.
For phylogenetic analysis of Wolbachia strains, we chose an 800 bp region of the groE operon that included the entire intergenic region and its flanking sequences (see Fig. 1), considering that these regions are highly susceptible to base substitutions. To amplify the 800 bp groES-L region, groE general primers were designed (Fig. 2), and the specificity of these primers was tested by PCR assay (Fig. 3). Positive signals were detected from infected insects, but not from a naturally uninfected species or from tetracycline-cured individuals, again confirming that this gene is from Wolbachia. Analysis of the ftsZ gene has shown that T. taiwanemma contains B-group Wolbachia (our unpublished data), while E. cautella harbors both A and B groups (Werren et al., 1995; Furukawa, unpublished data), which were designated E. cautella A and E. cautella B, respectively. DSH, DSR, and E. kuehniella contain A-group Wolbachia (Werren et al., 1995; Furukawa, unpublished data). Using the groE general primers, we amplified and sequenced the groES-L region of Wolbachia from E. kuehniella, which was used to design the A-group specific groE primers, groEAf and groEAr (Fig. 2). PCR assay showed that these primers successfully amplified an 800 bp fragment corresponding to the groES-L region from insects known to have A-group Wolbachia, but not from the one that contained only B-group Wolbachia (see Figs. 2 and 3).
Comparison of the groES-L regions
The number of substitutions per site (dA) was estimated between the groES-L regions of two representatives of A- and B-group Wolbachia, from E. kuehniella and T. taiwanemma, respectively. The dA of the intergenic sequence was 0.16 ± 0.048, which was similar to the average dA of the coding region, 0.16 ± 0.012. However, it should be emphasized that the intergenic region contained many insertions and deletions (13 bp of insertions and 2 bp of deletions in total, taking T. taiwanemma as the reference), which were not considered in estimating the dA.
For the GroES and GroEL coding sequences, dN and ds were calculated separately. The values for dN were very low and similar to that of ftsZ (data not shown). The ds values for groES and groEL were 0.74 ± 0.18 and 0.58 ± 0.10, respectively, which were 1.3–1.6 fold higher than that of the ftsZ gene, 0.47 (Werren et al., 1995). Similar ds values were observed when the corresponding regions were compared between DSH and E. cautella B: 0.71 ± 0.18 for groES and 0.51 ± 0.096 for groEL.
Phylogenetic relationship
Phylogenetic relationships among 6 different Wolbachia strains were investigated by the neighbour-joining algorithm (Saitou and Nei, 1987), using the 800 bp groES-L region including the intergenic sequence of the groE operon (Fig. 4). The average sequence divergence between the A and B groups was 17.2% after Jukes-Cantor correction. Within the A-group, the divergence ranged between 0.27 and 2.3%. Although no insertions or deletions were detected among the 4 A-group strains used in this study, the groE tree clearly defined the phylogenetic relationships among them. There were only 2 bp substitutions between the groE sequences from DSH and E. cautella A, the same number of substitutions as between the ftsZ genes of these two Wolbachia strains. In contrast, the groE sequences from E. kuehniella and E. cautella A, whose ftsZ genes are identical to each other (Furukawa, unpublished data), contained 17 bp substitutions, 5 bp of which were found in the intergenic region.
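As background on the tree-building step, the following is a minimal sketch of the neighbour-joining procedure of Saitou and Nei applied to a distance matrix. It is not the CLUSTAL W implementation used in this study, and the five-taxon matrix is a toy additive example with hypothetical labels a to e, not distances from the groES-L sequences. The function name `neighbor_joining` and the internal node labels `node1`, `node2`, ... are introduced only for illustration.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Build an unrooted NJ tree; return a list of (node, node, branch_length) edges."""
    # Copy the matrix into a dict of dicts so the caller's data is left untouched.
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        totals = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Pick the pair minimizing Q(a, b) = (n - 2) d(a, b) - total(a) - total(b).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - totals[p[0]] - totals[p[1]],
        )
        # Branch lengths from the joined pair to their new internal node.
        length_a = 0.5 * d[a][b] + (totals[a] - totals[b]) / (2 * (n - 2))
        length_b = d[a][b] - length_a
        counter += 1
        u = f"node{counter}"
        edges.append((a, u, length_a))
        edges.append((b, u, length_b))
        # Distances from the new node to every remaining node.
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (a, b):
                d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    # Connect the last two nodes directly.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


if __name__ == "__main__":
    # Toy additive distances, not data from this study.
    taxa = ["a", "b", "c", "d", "e"]
    toy = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    for edge in neighbor_joining(taxa, toy):
        print(edge)
```

Because the toy distances are exactly additive, the path lengths in the resulting tree reproduce the input matrix; with real sequence data the fit is only approximate, which is one reason bootstrap resampling is used to assess support for each clade.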
DISCUSSION
The groEL gene of Wolbachia
The groE operons in eubacteria are highly conserved, and GroEL homologs from numerous eubacterial genera have been identified as major antigens (Dasch et al., 1990). In general, intracellular symbionts and parasites produce their GroEL homologs in large amounts (Choi et al., 1991; Vodkin and Williams, 1988; Stover et al., 1990). In Buchnera, an endosymbiont of the pea aphid Acyrthosiphon pisum, a GroEL homolog called symbionin is selectively expressed (Hara et al., 1990). The histidine residue at position 133 of symbionin is prone to autophosphorylation (Morioka et al., 1993), suggesting that this protein not only functions as a molecular chaperone but also plays a role in signal transduction in this endosymbiotic bacterium (Gross et al., 1989; Morioka et al., 1994). In this study, we cloned and sequenced the groE-homologous operon of the endosymbiont Wolbachia from a cricket. The operon encoded the third and fourth Wolbachia proteins characterized to date. In Wolbachia GroEL, the His-133 of Buchnera GroEL was replaced by a serine. A serine residue at this position is also found in the GroEL homologs of E. chaffeensis and of Arabidopsis mitochondria (Swissprot accession no. P29197), both of which reside in an intracellular environment. It is conceivable that these serine residues also play a role in signal transduction in these bacteria and this organelle through their phosphorylation.
Since the amino acid sequence of GroEL homologs is highly conserved, and their amino acid substitutions are relatively free from the influence of the biased base substitution typically seen in the evolution of 16S rRNA (Hasegawa and Hashimoto, 1993; Jukes and Bhushan, 1986), GroEL is widely used as an evolutionary chronometer (Viale, 1995). To date, the phylogenetic position of Wolbachia among the alpha proteobacteria has been determined only by 16S rRNA (O'Neill et al., 1992; Roux and Raoult, 1995). The neighbour-joining analysis of the deduced amino acid sequence of Wolbachia GroEL supported the 16S rRNA tree, confirming that Wolbachia is positioned in the family Rickettsiaceae.
Wolbachia phylogeny by the groES-L sequences
An apparently recent spread of A-group Wolbachia among natural populations of D. simulans has been reported (Turelli and Hoffmann, 1991), and was considered to be due to human disturbance or transport (Werren et al., 1995). In addition, the ftsZ phylogeny suggested frequent horizontal transmission of A-group Wolbachia among various distantly related hosts, including Drosophila and Ephestia. However, the infection pathway among them is unclear because the ftsZ sequences determined so far are almost identical to each other (Werren et al., 1995).
The groES-L sequences analyzed in this study proved very useful for the phylogeny of A-group Wolbachia, because of their higher ds values and the presence of the intergenic region, where insertions and deletions can provide useful information. The topology of the groE tree and its high bootstrap values clearly elucidated the relationships among the 4 strains of A-group Wolbachia, for which the ftsZ phylogeny had indicated no divergence. In this tree, E. cautella A formed a clade with DSH rather than with E. kuehniella (Fig. 4). Since E. cautella and E. kuehniella are moths that share a similar niche, horizontal transmission of Wolbachia would be expected to occur frequently between the two. However, the tree suggests that the most recent horizontal transmission event occurred not between the two moths but between moths and flies. Although the direction of the transmission is not clear from the number of host species examined here, the groES-L sequence is potentially a powerful tool for resolving the phylogeny and infection pathways of Wolbachia much more precisely than before.
Wolbachia is spreading rapidly around the world, infecting even nematodes (Siloni et al., 1995), and seems to promote host speciation by causing incompatibility between host populations (Breeuwer and Werren, 1990; Coyne, 1992), suggesting that Wolbachia infection can be a driving force of evolution. Further analysis of infection pathways among various hosts will clarify the possible mechanisms of horizontal transmission and their influence on the recent speciation and evolution of host species.
Acknowledgments
We especially thank Dr. T. Tsuruhara (Tokyo Gakugei Univ.) and Dr. S. O'Neill (Yale Univ.) for providing insect materials. We also thank Mr. S. Furukawa (Tokyo Univ.) for providing unpublished data on ftsZ genes.