We isolated two cDNAs, termed D7 and C2 in the present study, from a cDNA library of the 16-cell embryo of the sea urchin Anthocidaris crassispina. The nucleotide sequence was determined completely for D7, and partially for C2. D7 does not have any significant open reading frames. Both D7 and C2 contain a common sequence that is 62% homologous to the sea urchin retroposon family 1 (SURF1). The SURF1 is a short interspersed repetitive element identified from the sea urchin Strongylocentrotus purpuratus, and is reported to be transcribed by RNA polymerase III. The structural feature of D7 and C2, however, suggests that they may be transcribed by RNA polymerase II. RT-PCR analyses revealed that (1) both D7 and C2 transcripts exist as a maternal RNA in the egg, (2) they appear evenly distributed in the 16-cell embryo, and (3) C2 transcripts are present throughout the development up to the gastrula, while D7 transcripts decrease in amount after the early cleavage stage.
Short interspersed repetitive elements (SINEs) are present in most, probably all, eukaryotic genomes. SINEs appear to have been amplified and dispersed in the genome via RNA intermediates, and are referred to as retroposons (Rogers, 1985; Weiner et al., 1986). Among SINEs, the human Alu family and the mouse B1 family are homologous to 7SL RNA of the signal recognition particle, and are therefore considered to be of 7SL RNA origin (Rogers, 1985). However, all other SINEs appear to have been derived from tRNAs (Lawrence et al., 1985; Daniels and Deininger, 1985; Sakamoto and Okada, 1985; Endoh and Okada, 1986; Matsumoto et al., 1986). Although functions of SINEs are still uncertain, it is clear that amplification and dispersion of SINEs have greatly contributed to genomic evolution in many species as a mechanism to maintain the fluidity of genomes (for reviews, see Okada, 1991a, b).
In studies of the sea urchin genome, some short repetitive sequence families have been reported (Posakony et al., 1981; Carpenter et al., 1982; Johnson et al., 1984; Cohen et al., 1985; Nisson et al., 1988; Calzone et al., 1988). The best characterized is the sea urchin retroposon family 1 (SURF1). The SURF1 was first identified as an element located upstream of the muscle actin gene of the sea urchin Strongylocentrotus purpuratus (Hickey et al., 1987). Approximately 800 copies of the SURF1 are estimated to be dispersed in the genome (Nisson et al., 1988). The SURF1-1, a member of the SURF1, has the following features characteristic of SINEs: (1) it consists of three distinct regions, the tRNA-related, the tRNA-unrelated, and the AT-rich regions, (2) it is flanked by direct repeats, and (3) it is transcribed by RNA polymerase III in vitro (Nisson et al., 1988).
In the present study, we isolated two cDNAs that contain the SURF1-like sequence from a cDNA library of the 16-cell embryo of the sea urchin Anthocidaris crassispina. We present the structures and the expression patterns during the development. The structural features suggest that they may be transcribed not by RNA polymerase III but by RNA polymerase II. We also discuss possible functions as well as the evolutionary origin of the SURF1-like sequence.
MATERIALS AND METHODS
Sea urchin embryo
A. crassispina were collected near the Noto Marine Laboratory, Kanazawa University. Gametes were obtained by intracoelomic injection of 0.5 M KCl. Eggs were fertilized in filtered sea water, and embryos were cultured at a concentration of 3 × 10³ embryos/ml in filtered sea water at 20°C with gentle stirring.
Eggs were inseminated in filtered sea water containing 1 mM 3-amino-1,2,4-triazole to prevent the fertilization membrane from hardening (Showmann and Foerder, 1979). The fertilization membrane was removed by the hatching enzyme as described previously (Yamaguchi et al., 1994). Denuded embryos were cultured up to the 16-cell stage in sea water at 20°C at a concentration of 5 × 10⁴ embryos/ml. The 16-cell embryos were washed twice with ice-cold calcium-magnesium-free sea water containing 1 mM ethylene glycol-bis(β-aminoethyl ether) tetraacetic acid (EGTA). Approximately 2 × 10⁶ embryos were applied to the Hitachi SRR6Y elutriation system, and three types of blastomeres, the micromere, the mesomere and the macromere, were elutriated and fractionated according to size by increasing the flow rate (Yamaguchi et al., 1994).
A. crassispina 16-cell embryo cDNA library
Total RNA was isolated from the 16-cell embryo by homogenization in guanidinium thiocyanate followed by LiCl precipitation (Cathala et al., 1983). Polyadenylated RNA was prepared from total RNA by oligo(dT)-cellulose chromatography. Complementary DNA copies of polyadenylated RNAs were synthesized with a cDNA synthesis system (Amersham) using oligo(dT) as a primer. A complementary DNA library was constructed in λgt11 (Amersham) according to the manufacturer's instructions. It included 1.8 × 10⁶ independent clones, with an average insert size of 1.9 kb. The cDNA library was amplified once in our experiment.
Poly(A)+ RNA was isolated from each blastomere of the 16-cell embryo as described above. 32P-labeled cDNA was synthesized from 2 μg poly(A)+ RNA with 100 mCi [α-32P]dCTP (600 Ci/mmol) and 100 units MMLV reverse transcriptase (BRL) using oligo(dT) as a primer. Approximately 2 × 10⁴ recombinants from the 16-cell embryo library were differentially screened using the labeled cDNAs as probes. Filters (Hybond-N, Amersham) were hybridized in 5 × SSC, 5 × Denhardt's solution, 0.3% SDS, 60% formamide containing denatured salmon sperm DNA (100 μg/ml) and labeled cDNA (5 × 10⁶ cpm/ml) at 37°C. The filters were washed at 55°C in 0.1 × SSC, 0.1% SDS, and then exposed to film (Kodak X-omat AR) for 7 days at -70°C using an intensifying screen. Isolated recombinants were subcloned into Bluescript II (Stratagene), and their sequences were determined by the dideoxy-chain termination method (Sanger et al., 1977).
Reverse transcription polymerase chain reaction (RT-PCR)
Total RNA was extracted from A. crassispina embryos by the GTC/LiCl method as described above. Complementary DNA was synthesized from total RNA with Superscript reverse transcriptase (BRL) using oligo(dT) as a primer. RT-PCR was carried out using Taq DNA polymerase (Toyobo) on a thermal cycler, Quick Thermo Personal QTP-1 (Nippon Genetics). Two sets of oligonucleotides were used as primer sets for D7: P1S (5′-TGGATGCATTCCTTTTTGTCTCGT-3′) and P1A (5′-CAGCAGTAAATGGTATTGTGTTCC-3′), corresponding to nucleotides 642-665 and 1243-1266, and P2S (5′-CACTAGCGTCCTTTGGCA-3′) and P2A (5′-GCATCCATACAGCAGCTG-3′), corresponding to nucleotides 2-19 and 261-278, respectively. For C2, P3S (5′-TTACTCACGTTACATTTGGAAAGG-3′) and P3A (5′-GTGTGCATTGTTGTAACTAAAACT-3′) were used as a primer set, directed to nucleotides 14-37 and 389-412, respectively. Primers for the elongation factor were synthesized according to the sequence of the A. crassispina EF-1α (M. Saito and K. Yamasu, personal communication). A reaction mixture containing cDNA synthesized from 5 ng total RNA as a template in a final volume of 25 μl was subjected to PCR for 27-33 cycles of 92°C, 1 min for denaturing, 60°C, 1 min for annealing, and 72°C, 2 min for extension for primer sets P1S/P1A and P3S/P3A, and for 27-33 cycles of 92°C, 1 min, 55°C, 1 min, and 72°C, 1 min for the P2S/P2A set. PCR conditions for EF-1α were 24-30 cycles of 92.5°C, 30 sec, 50°C, 1 min, and 72°C, 1 min. After PCR, 5 μl of the reaction mixture was applied to a 1.5% agarose or a 3% NuSieve 3:1 agarose gel, and electrophoresed in 1 × TAE buffer. Gels were stained with ethidium bromide (0.5 μg/ml) to detect PCR products.
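The primer coordinates given above can be cross-checked with a few lines of Python. The following is only an illustrative sketch, not part of the original protocol: it takes the primer sequences as printed (with line-break hyphens removed), confirms that each primer length agrees with its stated nucleotide range, and computes the inclusive span between the outer coordinates of each pair. The names `PRIMERS`, `PAIRS`, `length_matches` and `amplicon_span` are hypothetical helpers introduced here.

```python
# Primer sequences and 1-based inclusive coordinates as listed in the text.
PRIMERS = {
    "P1S": ("TGGATGCATTCCTTTTTGTCTCGT", 642, 665),
    "P1A": ("CAGCAGTAAATGGTATTGTGTTCC", 1243, 1266),
    "P2S": ("CACTAGCGTCCTTTGGCA", 2, 19),
    "P2A": ("GCATCCATACAGCAGCTG", 261, 278),
    "P3S": ("TTACTCACGTTACATTTGGAAAGG", 14, 37),
    "P3A": ("GTGTGCATTGTTGTAACTAAAACT", 389, 412),
}

# Sense/antisense pairs used for D7 (P1, P2) and C2 (P3).
PAIRS = [("P1S", "P1A"), ("P2S", "P2A"), ("P3S", "P3A")]


def length_matches(name):
    """True if the primer length equals the length of its stated range."""
    seq, start, end = PRIMERS[name]
    return len(seq) == end - start + 1


def amplicon_span(sense, antisense):
    """Inclusive distance from the first base of the sense primer
    to the last base of the antisense primer's target range."""
    return PRIMERS[antisense][2] - PRIMERS[sense][1] + 1


if __name__ == "__main__":
    for name in PRIMERS:
        print(name, "length consistent:", length_matches(name))
    for sense, antisense in PAIRS:
        print(sense, antisense, "span:", amplicon_span(sense, antisense), "bp")
```

With these coordinates the inclusive spans come out as 625, 277 and 399 bp. The first and third are close to, though not identical with, the fragment sizes quoted for D7 and C2 in the text (623 and 398 bp); the sketch does not attempt to explain that small difference.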
As a result of differential screening, we isolated 4 cDNAs from 2 × 10⁴ recombinants, named B5, C2, D7, and K4 in the present study, which appeared to be specific to the micromere of the 16-cell embryo. Restriction analysis revealed that B5 and D7 are identical to C2 and K4, respectively. We determined the complete nucleotide sequence of D7, and the partial sequence of C2.
D7 consists of 1392 bp, but no significant open reading frame was identified in either strand. One end of D7 terminates with oligo(A), and the polyadenylation consensus sequence (AATAAA) is located 7 nucleotides upstream of the oligo(A) (Fig. 1). This structural feature is characteristic of the 3′ end of RNA polymerase II transcripts (Proudfoot and Brownlee, 1976). We, therefore, refer to this end as the putative 3′ end in the present study. Tandemly repeated sequences of 52 nucleotides, which are 88% homologous to each other, exist around nucleotide 800.
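The 3′-end feature described above (an AATAAA signal a short distance upstream of a terminal oligo(A)) can be located programmatically. The sketch below is a minimal illustration under stated assumptions: the function name `polya_signal_offset`, the `min_tail` threshold of five A residues, and the toy sequence are all hypothetical and are not taken from the D7 sequence itself.

```python
import re


def polya_signal_offset(seq, signal="AATAAA", min_tail=5):
    """Return the number of nucleotides between the end of the last
    `signal` and the start of the terminal A-run, or None if either
    the tail or the signal is missing."""
    seq = seq.upper()
    # Terminal run of at least `min_tail` A residues.
    tail = re.search(r"A{%d,}$" % min_tail, seq)
    if tail is None:
        return None
    tail_start = tail.start()
    # Last occurrence of the signal lying entirely upstream of the tail.
    pos = seq.rfind(signal, 0, tail_start)
    if pos == -1:
        return None
    return tail_start - (pos + len(signal))


if __name__ == "__main__":
    # Made-up sequence: signal, a 7-nt spacer, then ten A residues.
    toy = "GCGT" + "AATAAA" + "GTCTGCT" + "A" * 10
    print(polya_signal_offset(toy))
```

Note that a spacer ending in A would be absorbed into the tail by this simple rule, so the reported distance depends on how the oligo(A) boundary is defined.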
A homology search for the D7 sequence showed that the 5′ 210 bp has 62% homology to the sea urchin retroposon family 1 (SURF1), whereas the remaining 1.2 kb does not have any significant similarity to the sequences available through the DDBJ database. Like other SINEs, the SURF1-1 consists of three distinct regions: the tRNA-related region, the tRNA-unrelated region, and the simple AT-rich region (Fig. 2). The 5′ portion of the tRNA-unrelated region is similar to the Spec repeat, and is referred to as the Spec repeat-related region (Nisson et al., 1988). The Spec repeat was first identified as a common sequence between Spec1 and Spec2, the ectoderm-specific mRNAs of S. purpuratus embryos (Carpenter et al., 1982). In the present study, the 3′ portion of the tRNA-unrelated region was found to be 57% homologous to the 3′ non-coding region of metallothionein B mRNA of S. purpuratus (Wilkinson and Nemer, 1987). Thus, we refer to this portion as the MTb-related region. The 5′ end of D7 corresponds to the boundary between the tRNA-related and the Spec repeat-related regions. That is, D7 lacks the tRNA-related region of the SURF1-1.
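Percentages such as the 62% and 57% quoted above are typically derived from a pairwise alignment. The text does not state how gaps were treated, so the following sketch makes an explicit assumption: identity is counted only over columns where neither sequence has a gap. The function name `percent_identity` and the toy alignment are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal
    length, ignoring any column that contains a gap (an assumption;
    other conventions count gaps as mismatches)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Made-up alignment: 9 ungapped columns, 8 of them identical.
    print(round(percent_identity("ACGTACGT-A", "ACGAACGTTA"), 1))
```

Different gap conventions can shift the figure by several points, which is worth keeping in mind when comparing percentages from different studies.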
Transcription of the SURF1-1 by RNA polymerase III starts at the 5′ portion and terminates at the AT-rich region (Nisson et al., 1988). In order to confirm that the SURF1-like sequence constitutes a structural part of D7, we carried out RT-PCR using D7-specific primers, P2S and P2A. P2S and P2A are directed to the sequences in and out of the SURF1-like sequence, respectively (Fig. 2). The PCR amplified a DNA fragment of the expected size with the expected internal EcoRI site (Fig. 3, lanes 5-7). This indicates that the SURF1-like sequence is a part of a longer transcript. The original sequence of D7 may include the tRNA-related region, which serves as a promoter for RNA polymerase III. The SURF1-like sequence in D7, however, is followed by the AT-rich region and also 16 TTTTs and 8 longer oligo(T)s (Fig. 1). These sequences are considered to be a strong termination signal for RNA polymerase III (Endoh et al., 1990). These sequences, together with the presence of the polyadenylation signal near the 3′ end, suggest that D7 may be transcribed by RNA polymerase II.
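Counts such as "16 TTTTs and 8 longer oligo(T)s" can be tallied by scanning for runs of T residues. The sketch below is an illustration only: it assumes the sequence is written as the non-template (RNA-like) strand, treats a run of exactly four T residues separately from longer runs, and uses a made-up toy sequence. The function name `count_t_runs` is hypothetical.

```python
import re


def count_t_runs(seq, min_len=4):
    """Return (runs of exactly `min_len` T residues, longer runs).
    Each maximal run of T is counted once."""
    lengths = [len(m.group())
               for m in re.finditer(r"T{%d,}" % min_len, seq.upper())]
    exact = sum(1 for n in lengths if n == min_len)
    longer = sum(1 for n in lengths if n > min_len)
    return exact, longer


if __name__ == "__main__":
    # Made-up sequence: two TTTT runs, one run of six T, one TTT (ignored).
    toy = "AC" + "TTTT" + "GCA" + "TTTTTT" + "GG" + "TTT" + "AC" + "TTTT" + "A"
    print(count_t_runs(toy))
```

Because the regular expression is greedy, adjacent T residues are always merged into a single maximal run, so a stretch of eight T residues counts as one longer run rather than two TTTT runs.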
The C2 cDNA is about 2.3 kb in size. We determined the sequence of about 0.3 kb from one end and about 0.5 kb from the other (Fig. 2). One end of C2 terminates with the polyadenylation consensus sequence followed by poly(A). We, therefore, refer to this end as the putative 3′ end. The 5′ region of C2 also contains the SURF1-like sequence, which is flanked by short direct repeats. The SURF1-like sequence in C2 is 62% and 89% similar to the SURF1-1 and the SURF1-like sequence in D7, respectively. However, the orientation is inverted. As in the structural analysis of D7, we performed RT-PCR for C2 using C2-specific primers, P3S and P3A. P3S and P3A correspond to the sequences in the 5′ and the 3′ flanking regions of the SURF1-like sequence, respectively (Fig. 2). The PCR amplified a DNA fragment of the expected size with the expected internal EcoRI site (Fig. 3, lanes 8-10). This result, together with the presence of direct repeats flanking the SURF1-like sequence and of the polyadenylation signal, suggests that the SURF1-like sequence in C2 may also be embedded within an RNA polymerase II transcription unit.
Figure 4 shows an alignment between the SURF1-1 and the SURF1-like sequences in D7 and C2. Most SINEs include box A and box B, consensus sequences for the promoter of RNA polymerase III, in the tRNA-related regions (Galli et al., 1981). Both the SURF1-1 and the SURF1-like sequence in C2 contain box B (GGTTCGANNCC), but no apparent box A was detected. It should be noted that the Spec repeat-related and the MTb-related regions (tRNA-unrelated region) are more conserved than the tRNA-related region between the SURF1-1 and the SURF1-like sequences (see Discussion).
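The notion of an element lying in inverted orientation within a host sequence can be made concrete with a reverse-complement check. The sketch below is deliberately simplified: it uses exact substring matching on made-up sequences, whereas sequences that are only 62-89% similar, as here, would require an alignment tool. The names `reverse_complement` and `orientation` are hypothetical.

```python
# Translation table for complementing DNA bases (N is left as N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def orientation(element, host):
    """Return '+' if `element` occurs in `host` as written, '-' if only
    its reverse complement occurs, and None if neither is found."""
    element, host = element.upper(), host.upper()
    if element in host:
        return "+"
    if reverse_complement(element) in host:
        return "-"
    return None


if __name__ == "__main__":
    element = "GGTTCGAATCC"                     # made-up element
    host = "ATGC" + "GGATTCGAACC" + "TT"        # carries its reverse complement
    print(orientation(element, host))
```

In this toy case the element is found only as its reverse complement, which corresponds to the inverted arrangement described for C2.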
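The box B consensus quoted above, GGTTCGANNCC, contains two unspecified positions and can be searched for with a regular expression. The following sketch is illustrative only: the function name `find_box_b` and the toy sequence are hypothetical, and N is assumed to stand for any of A, C, G or T.

```python
import re

# GGTTCGANNCC with N taken to mean any one of the four bases.
BOX_B = re.compile(r"GGTTCGA[ACGT]{2}CC")


def find_box_b(seq):
    """Return 0-based start positions of box B matches on the given strand."""
    return [m.start() for m in BOX_B.finditer(seq.upper())]


if __name__ == "__main__":
    # Made-up sequence with a single box B match starting at index 4.
    toy = "ATGC" + "GGTTCGAATCC" + "TTAG"
    print(find_box_b(toy))
```

Only the strand as written is scanned; an element in inverted orientation would need to be reverse-complemented first, and real box B sequences may deviate from the consensus at one or more positions.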
Distribution of D7 and C2 transcripts in the 16-cell embryo
To examine the spatial expression patterns of D7 and C2 within the A. crassispina embryo, we isolated total RNAs from the micromere, the mesomere, and the macromere of the 16-cell embryo. We then carried out RT-PCR using oligonucleotides, P1S/P1A and P3S/P3A, as primer sets for D7 and C2, respectively. When analyzed by agarose gel electrophoresis, a DNA fragment of the expected size for D7 (623 bp) and C2 (398 bp) was detected. Each fragment was almost equal in amount among the three types of blastomeres (Fig. 5A). RT-PCR using another D7-specific primer set, P2S/P2A, gave the same result (data not shown). Therefore, transcripts of D7 and C2 appear to be evenly distributed in the 16-cell embryo.
Expression of D7 and C2 during development
To examine the temporal expression of D7 and C2 during development, we carried out RT-PCR using P1S/P1A and P3S/P3A as primer sets for D7 and C2, respectively (Fig. 5B). D7-specific primers generated DNA fragment(s) of about 0.7 kb, the expected size, throughout development from the unfertilized egg to the gastrula. Some of the fragments, however, did not contain the expected internal EcoRI site, suggesting that PCR might co-amplify a non-specific DNA fragment of similar size. EcoRI-digested fragments were detected in the unfertilized egg and the early blastula, but decreased in amount after hatching. In contrast, C2-specific primers amplified a DNA fragment of the expected size (Fig. 5B) with the expected internal EcoRI site (data not shown) throughout development. These results show that both D7 and C2 are maternal RNAs, but their temporal expression appears to be regulated separately during development.
We isolated two cDNAs, D7 and C2, which contain the SURF1-like sequence, from a cDNA library of the 16-cell embryo of the sea urchin A. crassispina. RT-PCR analyses using primers directed to sequences in and out of the SURF1-like sequences indicated that the SURF1-like sequence constitutes a structural part of D7 and C2. Nisson et al. (1988) reported that there is little expression of the SURF1 family by RNA polymerase II in early embryos. Structural features of D7 and C2, however, suggest that they may be transcribed by RNA polymerase II, for the following three reasons. First, both D7 and C2 have the polyadenylation signal followed by oligo(A), which is a common feature of the 3′ end of RNA polymerase II transcripts (Proudfoot and Brownlee, 1976). Second, the SURF1-like sequence is followed by many AT-rich sequences, which serve as a strong stop signal for RNA polymerase III (Endoh et al., 1990). Third, the SURF1-like sequence in C2 is embedded in a longer transcript in an inverted orientation. D7 and C2, therefore, are the first examples suggesting that the SURF1 may also be present in RNA polymerase II transcripts. It is probable that the SURF1-like sequences were inserted into some maternal RNA genes by retroposition.
Concerning the copy number of the Spec repeat, two estimates have been reported. Carpenter et al. (1982) demonstrated that 2,000-3,000 copies are present in the genome of S. purpuratus, and that four cloned versions of the Spec repeat do not have any features of retroposons. On the other hand, Nisson et al. (1988) estimated about 800 copies of the SURF1 in the genome. Judging from the difference in copy number between the Spec repeat and the SURF1, Nisson et al. (1988) suggested the following scenario for the formation of the SURF1 in the sea urchin genome. Initially, the Spec sequence became amplified and dispersed throughout the genome by a DNA-mediated transposition, and one copy of this sequence was inserted just downstream of a tRNA gene. Subsequently, a transcript of this tRNA gene, which extended into the Spec sequence, was copied by reverse transcriptase. Multiple copies of these retroposons were then dispersed in the genome. It is important to note that the tRNA-related region is less conserved than the Spec repeat-related region in the SURF1-1, D7, and C2 (Fig. 4). If the rate of sequence divergence is uniform within the SURF1 sequence, the scenario suggested by Nisson et al. (1988) cannot explain this fact. Instead, it seems more likely that the Spec sequence was amplified, and some copies were inserted just downstream of tRNA genes that had already diverged in the genome. Subsequently, transcripts of these tRNA genes, which had acquired the Spec sequence in the 3′ region, were reverse transcribed and dispersed in the genome. Deininger and Daniels have proposed that the tRNA-related regions of SINEs have been generated from tDNAs which have accumulated mutations that did not hinder the intrinsic functions of tRNAs (Deininger and Daniels, 1986).
Alternatively, the Spec repeat may have been conserved because it has a function. Inverted repeat sequences often serve as target sites for transcription factors, which interact facing each other as homotypic or heterotypic pairs (Johnson and Smith, 1992; Ellis et al., 1990; Cao et al., 1991). Recently, Anderson et al. (1994) reported that a conserved inverted repeat sequence serves as a target site for a sea urchin maternal DNA-binding factor. They estimate about 460 copies of the inverted repeat target sites or single sites in the S. purpuratus genome. The Spec repeat may also serve as a target site for a DNA-binding factor. It is also probable that the Spec repeat sequence in transcripts may regulate the stability or translation of other mRNAs containing the Spec repeat sequence. Spec1 and Spec2 mRNAs, for example, include the Spec repeat sequence in the 3′ non-coding region, which is complementary to the Spec repeat-related region in D7 or the SURF1-1. This implies that the D7 or the SURF1-1 transcript could hybridize with Spec1 and Spec2 mRNAs to destabilize them in the cytoplasm. Bruskin et al. (1981) reported that Spec1 and Spec2 mRNAs are rare or undetectable during the cleavage stage, but increase in amount from the blastula stage. In contrast, the D7 transcript is present in the cleavage stage, but then decreases in amount (Fig. 5). Expression of the SURF1-1 is similar to that of D7, that is, highest around the 128-cell stage and dropping as development proceeds (Nisson et al., 1988). These results suggest that D7 and the SURF1 transcripts may work as negative regulators of the Spec repeat-containing mRNAs during the early stage of development. It may also explain the observation that the C2 transcript, unlike the D7 transcript, is present throughout development (Fig. 5).
We wish to thank H. Endoh and M. Saiki for their critical reading and helpful suggestions on the manuscript of this paper. We are grateful to F. Matsuzaki and Y. Nabeshima for giving us an opportunity to use their experimental facilities. We are indebted to M. Matada for collecting and culturing sea urchins. We also thank M. Saito and K. Yamasu for personal communication of the Anthocidaris crassispina EF-1α sequence. This work is supported by Grant-in-Aid to M.Y. from the Ministry of Education, Science, Sports and Culture of Japan (No. 07836006).