The blood clam Barbatia virescens expresses a unique heterodimeric hemoglobin consisting of chains I and II in erythrocytes. This is in sharp contrast to the tetrameric (α2β2) and polymeric two-domain hemoglobins of the congeneric species Barbatia reeveana and Barbatia lima. The 3′ and 5′ parts of the cDNA of B. virescens chain II have been amplified separately by polymerase chain reaction (PCR), and the complete nucleotide sequence of 690 bp was determined. The open reading frame is 477 nucleotides in length and encodes a protein of 158 amino acid residues, of which 120 were identified directly by protein sequencing of the peptides obtained from digestions with trypsin, S. aureus V8 protease and pepsin. The mature protein begins with a blocked Ser, and thus the N-terminal Met is cleaved away. The molecular mass of the protein was calculated to be 17605 Da. The cDNA-derived amino acid sequence of B. virescens heterodimeric chain II shows the highest homology (42%) with that of B. virescens chain I, but lower homology (32–35%) with those of the tetrameric α and β chains of B. lima. This indicates that B. virescens chains I and II do not correspond to B. lima α and β chains; that is, the heterodimeric hemoglobin is a unique gene product expressed only in B. virescens.
The bivalves belonging to the family Arcidae contain abundant hemoglobins in circulating erythrocytes, but their compositions differ remarkably among clams. For example, Anadara trapezia contains homodimeric (γ2) and tetrameric (α2β2) hemoglobins, Barbatia reeveana contains a tetramer (α2β2) and a polymer consisting of an unusual 34 kDa two-domain chain (2D), and the congeneric clam Barbatia virescens has only a heterodimeric hemoglobin (I–II). In addition, we found that Barbatia lima subsp. collected from Amami-Oshima, Japan, expresses three types of hemoglobins: a homodimer (δ2), a tetramer (α2β2) and a polymer of 2D and δ chains (Suzuki et al., in preparation). One of the most remarkable features of Anadara (Scapharca) homodimeric and tetrameric hemoglobins is that the subunit assembly is “back-to-front” relative to vertebrate hemoglobin. This means that the tetrameric assembly has been acquired independently in vertebrates and molluscs.
As a first step toward clarifying the complex molecular evolution of clam hemoglobins, we are analyzing the primary structure of each constituent chain of the hemoglobins of B. virescens and B. lima. Here we report the cDNA-derived amino acid sequence of chain II of B. virescens heterodimeric hemoglobin.
MATERIALS AND METHODS
Chain II of the hemoglobin of Barbatia virescens was isolated as described previously. Chain II was carboxymethylated and digested separately with trypsin, pepsin and S. aureus V8 protease, and the peptides were separated by reverse-phase chromatography. The column (Cosmosil 5C18-300, 2.5 × 150 mm, Nacalai Tesque) was eluted with a linear gradient of acetonitrile in 0.1% trifluoroacetic acid (TFA) at a flow rate of 1 ml/min. Some peptides were purified further by rechromatography. Peptides were sequenced by the manual Edman method. The peptic peptide P-3, with an amino acid composition of Ser1, Glu1, Pro1, Ala3, Ile1 and Lys1, was further digested with acylamino acid-releasing enzyme (Takara) before sequencing.
mRNA was isolated from the erythrocytes of B. virescens with a FastTrack mRNA Isolation Kit (Invitrogen). Single-stranded cDNA was synthesized with avian reverse transcriptase using the oligo-dT adaptor 5′GGATCCGAATTCCCCGGGT17 as a primer. The 3′ half of the cDNA of chain II was first amplified by polymerase chain reaction (PCR) for 30 cycles, each consisting of 0.5 min at 94°C for denaturation, 0.5 min at 45°C for annealing and 1 min at 72°C for primer extension. Taq DNA polymerase (Promega) was used as the enzyme. The primers used were the oligo-dT adaptor described above and the redundant oligomer α (20 mer, a mixture of 32), 5′GC(AC)AT(CA)TA(CT)CT(AC)ATGTA(TC)GC, based on the amino acid sequence Ala-Ile-Tyr-Leu-Met-Tyr-Ala. The codon usage for Ala, Ile and Leu of B. virescens chain I was used to reduce redundancy. The 500 bp products thus amplified were subcloned into the SmaI site of pUC18 and sequenced by the dideoxy chain termination method with Sequenase Ver.2.0 (United States Biochemical), the BcaBEST DNA sequencing Kit (Takara) and the non-RI Uniplex DNA sequencing Kit (Millipore).
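The stated redundancy of the degenerate primer can be checked by enumerating every oligonucleotide it represents. The following is only an illustrative sketch: the function name `expand_degenerate` and the parenthesis-parsing convention are assumptions introduced here, not part of the original method.

```python
from itertools import product

def expand_degenerate(primer: str) -> list[str]:
    """Expand a primer written with parenthesised alternatives, e.g. 'GC(AC)AT'.

    Hypothetical helper: each '(XY)' group stands for one position that may be
    any of the enclosed bases; every other character is a fixed base.
    """
    choices = []
    i = 0
    while i < len(primer):
        if primer[i] == "(":
            close = primer.index(")", i)
            choices.append(primer[i + 1:close])  # alternatives at this position
            i = close + 1
        else:
            choices.append(primer[i])            # fixed base
            i += 1
    return ["".join(combo) for combo in product(*choices)]

# Redundant primer from the text (the stray space in the printed sequence is removed).
REDUNDANT_PRIMER = "GC(AC)AT(CA)TA(CT)CT(AC)ATGTA(TC)GC"

variants = expand_degenerate(REDUNDANT_PRIMER)
print(len(variants))                  # number of distinct oligomers in the mixture
print({len(v) for v in variants})     # length(s) of the oligomers
```

Five two-fold degenerate positions give 2^5 = 32 oligomers of 20 nucleotides each, in agreement with the "20 mer, a mixture of 32" description.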
The 5′ half of the cDNA was amplified as follows. Single-stranded cDNA was newly synthesized with the non-redundant primer b (24 mer), 5′CTCTTTAAATGTCTATATGCAAAG, which is complementary to positions 364–387 of the sequence shown in Figure 1. A poly(A) tail was then added to the 3′ end of the cDNA with terminal deoxynucleotidyl transferase. The 5′ half of the cDNA was amplified for 30 cycles, each consisting of 0.5 min at 94°C for denaturation, 0.5 min at 55°C for annealing and 1 min at 72°C for primer extension, using the oligo-dT adaptor and the non-redundant primer c (24 mer), 5′CAATATCTTCTACGATGCTAGGGT, complementary to positions 333–356 of the sequence shown in Figure 1. The 350 bp products thus amplified were subcloned into the SmaI site of pUC18 and sequenced.
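As a consistency check on the two antisense primers, the sketch below computes their reverse complements (the sense-strand segments they should anneal to) and confirms that each 24 mer matches the length of its stated coordinate range. It assumes that "complementary" means reverse-complementary to the sense strand of Figure 1; the helper name `reverse_complement` and the `PRIMERS` table are illustrative only.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]

# Primer name -> (sequence 5'->3', first and last target positions as given in the text).
PRIMERS = {
    "b": ("CTCTTTAAATGTCTATATGCAAAG", 364, 387),
    "c": ("CAATATCTTCTACGATGCTAGGGT", 333, 356),
}

for name, (seq, start, end) in PRIMERS.items():
    span = end - start + 1               # inclusive coordinate span
    target = reverse_complement(seq)     # expected sense-strand segment (assumption)
    print(name, len(seq), span, target)

# Primer c ends upstream of where primer b begins, so c is nested inside
# the cDNA primed with b.
nested = PRIMERS["c"][2] < PRIMERS["b"][1]
print(nested)
```

Both primers are 24 nucleotides long and both coordinate ranges span 24 positions. Because the target of primer c (333–356) lies entirely upstream of that of primer b (364–387), the amplification step uses a primer internal to the one used for cDNA synthesis.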
RESULTS AND DISCUSSION
We have succeeded in amplifying the cDNA encoding B. virescens chain II by PCR. The complete nucleotide sequence of 690 bp was assembled from two PCR fragments amplified separately (Fig. 1). There was no sequence discrepancy in the overlapping region. The open reading frame is 477 nucleotides in length and encodes a protein of 158 amino acid residues, of which the 120 residues underlined in Figure 1 were identified directly by protein sequencing and/or amino acid analyses of the peptides obtained from digestions with trypsin, S. aureus V8 protease and pepsin.
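The reading-frame figures quoted above are mutually consistent, as the short arithmetic sketch below shows. It assumes that the 477-nucleotide open reading frame includes the termination codon; the variable names are illustrative.

```python
ORF_NT = 477          # open reading frame length (nucleotides), from the text
PROTEIN_AA = 158      # encoded residues, including the initiator Met
SEQUENCED_AA = 120    # residues confirmed by peptide sequencing / analysis

codons, remainder = divmod(ORF_NT, 3)
coding_codons = codons - 1            # assumption: the last codon is the stop codon

# Fraction of the deduced sequence confirmed at the protein level.
coverage_percent = round(100 * SEQUENCED_AA / PROTEIN_AA, 1)

# Residues in the mature chain once the initiator Met is removed.
mature_residues = PROTEIN_AA - 1

print(codons, remainder, coding_codons, coverage_percent, mature_residues)
```

Under that assumption, 477 nucleotides correspond to 159 codons, that is, 158 sense codons plus one stop codon. About 76% of the deduced sequence is supported directly by protein data.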
No N-terminal amino acid residue of the intact chain II was detected by Edman sequencing, suggesting that the N-terminus is blocked, as in the case of the two-domain chain of B. lima. We therefore isolated the N-terminal peptic peptide P-2, digested it with acylamino acid-releasing enzyme and sequenced it. We thus confirmed that the initiator Met is cleaved away and that the mature protein begins with a blocked Ser. The molecular mass of the protein was calculated to be 17605 Da.
The cDNA-derived amino acid sequence of B. virescens heterodimeric chain II was aligned with those of B. virescens chain I and the B. lima subsp. tetrameric α and β chains (Suzuki et al., in preparation) in Figure 2, using the algorithm of Feng and Doolittle. The percent identity between the chains is summarized in Table 1. Chain II shows the highest homology (42%) with chain I, but lower homology (32–35%) with each of the α and β chains of B. lima. These sequence homologies indicate that B. virescens chains I and II do not correspond to B. lima α and β chains. That is, the heterodimeric hemoglobin is a unique gene product expressed only in B. virescens.
Table 1. Matrix for sequence homologies (percent identity) between the polypeptide chains of B. virescens and B. lima hemoglobins
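Percent identity between two aligned chains can be computed along the following lines. This is a minimal sketch under stated assumptions: gaps are written as "-", positions with a gap in either sequence are excluded from the denominator (the convention actually used for the matrix is not stated in the text), and the two short sequences are invented placeholders, not globin data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over positions where neither sequence has a gap.

    Hypothetical helper; the inputs must come from the same pairwise or
    multiple alignment and therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue                  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped aligned positions to compare")
    return 100.0 * identical / compared

# Toy aligned sequences (placeholders for illustration only).
toy_a = "MKV-LSA"
toy_b = "MKIGLTA"
print(round(percent_identity(toy_a, toy_b), 1))
```

In the toy pair, six columns are ungapped and four of them are identical, giving about 66.7% identity. Other denominators (for example, the length of the shorter sequence) would give slightly different values, so the convention should be fixed before values are compared.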
In Anadara (Scapharca) tetrameric and homodimeric hemoglobins, the E and F helices have been shown to participate in important inter-subunit interactions. Consistent with this, the three constituent chains of Anadara hemoglobins are highly homologous, especially in the E and F helices: 24 out of 30 residues are identical, as shown in Figure 2. The overall similarities between Anadara chains are 48–55%. When the amino acid sequences of B. virescens chains I and II are compared with the Anadara consensus sequence, only 13–14 residues (underlined in Fig. 2) are conserved. Thus it is unlikely that B. virescens heterodimeric hemoglobin forms a “back-to-front” subunit assembly as in the case of Anadara hemoglobins.
Why are different hemoglobins expressed in the closely related species B. virescens and B. lima? We suggest that this is because blood clam hemoglobin plays a physiologically less important role than vertebrate hemoglobin. Most bivalves do not express hemoglobins, but they must have retained hemoglobin genes, probably as pseudogenes. Thus, the various types of clam hemoglobins, such as the homodimer, heterodimer, tetramer, two-domain globin and extracellular 18–20-domain globin, would be recognized as relics of molecular evolution.