Invasive insects present an ongoing challenge to the safety of U.S. agriculture. A current threat to the U.S. cotton industry is Oxycarenus hyalinipennis (Costa), commonly known as the cotton seed bug. Populations are found throughout most of the world except for North America, and the southeastern U.S. is believed to provide a favorable environment for its establishment. A major component in efforts to control the spread of invasive pests is the rapid and accurate identification of intercepted specimens. Unfortunately, O. hyalinipennis belongs to an incompletely characterized taxon where the assignment of species identity by simple morphological keys is often problematic. In this study, we assessed the potential of DNA barcoding to facilitate the identification of the cotton seed bug in field-collected specimens.
The genus Oxycarenus (Hemiptera: Oxycarenidae) is large and diverse in the Eastern Hemisphere, with approximately 50 known species, but only Oxycarenus hyalinipennis (Costa), the cotton seed bug, has been reported in the Western Hemisphere (Slater & Baranowski 1994). It is a major pest of plants in the order Malvales, with the most significant economic consequences for cotton (Gossypium spp.; Malvaceae) production. Damage occurs primarily from feeding on mature seeds, which can lead to significant reductions in the yields of cotton seed oil (Sweet 2000). At higher infestation levels, contaminating insect bodies can stain cotton lint, thereby adversely affecting marketability (Smith & Brambila 2008).
Oxycarenus hyalinipennis populations in the Western Hemisphere were first observed in South America (Slater 1964; Slater & Baranowski 1994), and there has been a continuing northward expansion since, with populations found in the Caribbean and the southern tip of Florida at Stock Island and Key West (Slater & Baranowski 1994; Baranowski & Slater 2005; Smith & Brambila 2008; Halbert & Dobbs 2010). It is designated a high-risk pest for the U.S., and favorable conditions for establishment exist in the southeastern states (Holtz 2006; Landry & Michalak 2006). Other crops besides cotton at risk include hibiscus (Hibiscus spp. L.), kenaf (Hibiscus cannabinus L.), and okra (Abelmoschus esculentus (L.) Moench), with the chemical treatments required for control likely to have adverse effects on the environment (Sweet 2000; Smith & Brambila 2008).
Monitoring for this pest in the United States is complicated by relatively incomplete and often ambiguous taxonomic descriptions. Oxycarenus hyalinipennis is a member of the infraorder Pentatomomorpha (superfamily Lygaeoidea; family Oxycarenidae), in which the phylogenetic relations at the level of superfamilies are considered uncertain (Li et al. 2005). The systematics of the genus Oxycarenus has been described as complex (Sweet 2000), with reports of substantial intraspecific polymorphisms and cross-hybridization between species (Bergevin 1932; Samy 1971; Segura et al. 2011). These issues make the accurate identification of O. hyalinipennis using existing morphological keys difficult, requiring substantial taxonomic expertise.
DNA barcoding has the potential to facilitate the identification of problematic taxa of agricultural importance (reviewed in Armstrong & Ball 2005). The method uses DNA sequence comparisons of a designated “barcode” region, typically a portion of the highly conserved mitochondrial cytochrome c oxidase subunit I (COI) gene, with the assumption that barcode sequence variation within a taxonomic group will be significantly less than that observed between groups (Hebert et al. 2003). As a consequence, there should be a specific correspondence between barcode sequences and taxa, an expectation observed in several taxa and successfully applied to identify unknown specimens (reviewed in Floyd et al. 2010).
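The assumption that variation within a group is smaller than variation between groups can be illustrated with a minimal sketch. The sequences below are short invented strings rather than real COI data, and the group labels and function names are hypothetical; the sketch only shows how the largest within-group distance is compared with the smallest between-group distance.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites that differ between two sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(groups: dict[str, list[str]]) -> tuple[float, float]:
    """Return (largest within-group distance, smallest between-group distance)."""
    within = [0.0]
    between = []
    labelled = [(name, seq) for name, seqs in groups.items() for seq in seqs]
    for (n1, s1), (n2, s2) in combinations(labelled, 2):
        d = p_distance(s1, s2)
        if n1 == n2:
            within.append(d)
        else:
            between.append(d)
    return max(within), min(between)


# Invented toy alignment of 20 sites; not real barcode data.
toy = {
    "species_A": ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"],
    "species_B": ["ACGAACGAACGAACGTTCGT", "ACGAACGAACGAACGTTCGA"],
}
max_within, min_between = barcode_gap(toy)
# The barcoding expectation holds for this toy set when the gap is positive.
has_gap = max_within < min_between
```

When two nominal species share an identical barcode, the smallest between-group distance is zero and no such gap exists.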
However, DNA barcoding based on a single locus has potential problems that can lead to ambiguous or inaccurate species identification (Meier et al. 2006; Whitworth et al. 2007; Wiemers & Fiedler 2007). The taxonomic reliability of current public barcode sequence databases is uncertain, with suggestions that as much as 20% of the sequences in GenBank are misidentified (Bridge et al. 2003; Nilsson et al. 2006). Furthermore, barcode data are available for only a small fraction of species, with many of these represented by only a single sequence. Poor taxon coverage brings into question whether current databases adequately represent such diverse groups as arthropods or the level of barcode variation within individual species. This is an issue for the genus Oxycarenus, for which publicly available DNA sequence information at the time of this writing is very limited. These factors could all help explain why, within the Pentatomomorpha, barcode comparisons alone did not resolve phylogenetic relationships in a manner concordant with morphological keys at or above the superfamily level (Li et al. 2005).
While these genetic and taxonomic issues make the use of DNA barcoding for phylogenetic studies in Oxycarenus problematic, current methods of species identification based on morphology are sufficiently difficult and labor intensive that even an imperfect genetic method could be useful. The goal of this study was to empirically assess whether DNA barcoding could facilitate current monitoring efforts for this economically important invasive pest. The range of COI sequence variation within O. hyalinipennis collected from several regions was determined. These sequences were compared with existing DNA databases, and their power to discriminate O. hyalinipennis from a sister species, Oxycarenus laetus Kirby, and from related genera was assessed. The advantages and limitations of DNA barcoding to improve current procedures of O. hyalinipennis identification and monitoring are discussed.
MATERIALS AND METHODS
Specimen Collections and Sites
Specimens examined from Florida (oxy7, oxy10), Puerto Rico (oxy8, OHy15), the Bahamas (oxy11), and Kenya (OHy1–6, OHy9, OHy10, OHy12, OHy14) were hand-collected from field cotton and stored in 90%–95% ethanol, while samples from Israel (oxy1, oxy2) and Brazil (oxy4–6) were from archived pinned collections stored under ambient conditions (Table 1). Identification of field specimens as O. hyalinipennis was based on external morphology and genital structures (Brambila 2010).
DNA Preparation and Amplification of the COI Region
Molecular analysis of the specimens was performed in 2 laboratories. For the “oxy” series (Table 1), a mixture of genomic and mitochondrial DNA was isolated from individual specimens using Zymo-Spin III columns (Zymo Research, Orange, California) as described previously for fall armyworm (Nagoshi et al. 2010). Yields from the pinned specimens were at least 10-fold lower than those obtained for samples stored in ethanol, as measured by UV absorbance (OD260; NanoDrop 2000, Thermo Fisher Scientific Inc., Wilmington, Delaware). The O. hyalinipennis barcode region was first amplified by PCR with the 5′ primer derived from a sequence for the COI region reported for an unidentified Oxycarenus species (AY252929) and designated oxyCOI45F (5′-TCCGGATTGAACTGGGTCAAC-3′). The 3′ primer was obtained from the COI sequence for O. laetus (HQ908084) and designated oxyCOI637R (5′-AGGGTCACCTCCTCCTGTAGGGT-3′). PCR amplification was performed on a subset of the DNA samples using a 30-µL reaction mix containing 3 µL 10X manufacturer's reaction buffer, 1 µL 10 mM dNTP, 0.5 µL 20-µM primer mix, 1 µL DNA template (between 0.005–0.5 µg), and 0.5 unit Taq DNA polymerase (New England Biolabs, Beverly, Massachusetts). The thermocycling program was 94 °C (1 min), followed by 33 cycles of 92 °C (30 s), 54 °C (45 s), 72 °C (45 s), and a final segment of 72 °C for 3 min. Amplification products were analyzed and isolated by agarose gel electrophoresis, where 6 µL of 6X gel loading buffer was added to each amplification reaction and the entire sample run on a 1.5% agarose horizontal gel containing GelRed (Biotium, Hayward, California) in 0.5X Tris-borate buffer (TBE, 45 mM Tris base, 45 mM boric acid, 1 mM EDTA pH 8.0). A major band corresponding to the expected size was obtained for oxy10-1, 10-2, 10-3, and 10-4, and these were isolated and sequenced using the PCR primers. All 4 sequences were found to be identical to that of O. laetus.
A new 5′ primer, oxyH18F (5′-GGTATATGATCCGGTATAGTTGG-3′), was derived from these data and used in combination with oxyH637R for subsequent PCR amplification of the O. hyalinipennis specimens.
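As a quick check on the 30-µL reaction mix and thermocycling program described for the “oxy” series, the arithmetic can be sketched as follows. The function and variable names are illustrative only, the stock concentrations are taken as stated in the text, and the time total counts programmed hold steps only, since ramp times depend on the instrument and are not given.

```python
def final_conc(stock: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration after dilution, in the same unit as the stock."""
    return stock * volume_ul / total_ul


TOTAL_UL = 30.0  # reaction volume stated for the "oxy" series

buffer_x = final_conc(10.0, 3.0, TOTAL_UL)   # 10X buffer, result in X
dntp_mM = final_conc(10.0, 1.0, TOTAL_UL)    # 10 mM dNTP, result in mM
primer_uM = final_conc(20.0, 0.5, TOTAL_UL)  # 20 µM primer mix, result in µM

# Programmed hold times in seconds; ramping between steps is not included.
initial_denaturation = 60
one_cycle = 30 + 45 + 45
final_extension = 3 * 60
total_hold_s = initial_denaturation + 33 * one_cycle + final_extension
total_hold_min = total_hold_s / 60
```

Under these inputs the buffer is at 1X, the dNTP and primer mix are each diluted 1:30 and 1:60 respectively from their stocks, and the hold steps sum to 70 min.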
Table 1. Oxycarenus hyalinipennis specimen collection information.
DNA from the “OHy” series (Table 1) was isolated using the DNeasy® Tissue Kit (QIAGEN Inc., Germantown, Maryland) for nucleic acid extraction as per manufacturer's instructions. Primers CO1490F (5′-GGTCAACAAATCATAAAGATATTGG-3′) and C02191R (5′-CCCGGTAAAATTAAAATATAAACTTC-3′) (Life Technologies™, Carlsbad, California) were used to amplify a 589-bp fragment from the COI gene (Folmer et al. 1994; Simon et al. 1994). Amplification of the COI gene was carried out in a 25-µL reaction mixture containing 2.5 µL 10X manufacturer's reaction buffer, 2 µL 10 mM dNTP, 2 µL 10-µM primer mix, 4 µL DNA template (0.1–0.5 µM), and 0.2 unit Taq DNA polymerase (TaKaRa Taq®, TaKaRa Bio Inc.). Amplifications were performed in a PCR-100 Thermocycler (MJ Research Inc., Watertown, Massachusetts) with the following protocol: 92 °C (2 min), 2 “touchdown” cycles from 53 to 48 °C (10 s at 92 °C, 10 s at 53–48 °C, 1 min at 72 °C) followed by 29 cycles of 92 °C (10 s), 47 °C (10 s), 72 °C (1 min), and a final segment of 72 °C for 5 min (Scheffer & Grissell 2003). Amplification products were analyzed using agarose gel electrophoresis, where 1 µL of 6X gel loading buffer was added to each amplification reaction and the entire sample run on a 1% agarose horizontal gel containing ethidium bromide (10 mg/mL) (Sigma-Aldrich, St. Louis, Missouri) in 1X Tris-acetate-EDTA buffer (TAE, 40 mM Tris-acetate, 45 mM glacial acetic acid, 1 mM EDTA pH 8.0). PCR mixtures were purified using Exonuclease I and Shrimp Alkaline Phosphatase (USB®, Affymetrix Inc., Cleveland, Ohio) according to manufacturer's instructions. The purified PCR products were then prepared for DNA sequencing with the BigDye Sequencing Kit, Terminator 3.1® (Applied Biosystems®, Cleveland, Ohio).
DNA Sequence Analysis
DNA sequencing was performed on the “OHy” (USDA-ARS-Systematic Entomology Laboratory, BARC, Beltsville, Maryland) and “oxy” (University of Florida Interdisciplinary Center for Biotechnology Research) PCR products using the amplification primers. The quality of the sequence data was confirmed by examination of the chromatograms. Haplotypes obtained in this study have been deposited in GenBank (accession nos. JQ342987 and JQ342988). Voucher specimens from Florida, Puerto Rico, and Brazil are at the Florida Department of Agriculture and Consumer Services, Division of Plant Industry, Florida State Collection of Arthropods, Gainesville, Florida.
A BLAST search was performed using the NCBI GenBank nucleotide collection (nr/nt) database with OH1 as the query sequence. Initially, 9 sequences were identified with nucleotide identity greater than 99% (Table 2). Some of these sequences were subsequently removed from GenBank because the entries did not meet the minimum data standard, but they could still be accessed from the Global Mirror System of DNA Barcode database (GMS-DBD, http://bold.ala.org.au/index.php/home/). The original GenBank accession numbers and corresponding GMS-DBD sample numbers are as follows (GenBank:GMS-DBD): GU681972:IMB-00042, GU681989:IMB-00024, GU681986:IMB-00028, GU681970:IMB-00045, GU681974:IMB-00040, GU681993:IMB-00044, GU681967:IMB-00046, GU681965:IMB-00048. The OH haplotypes were defined by comparisons of a 519-bp region from +133 to +651 shared by these sequences and those listed in Table 1.
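The coordinate arithmetic for the comparison window can be checked with a small sketch. It assumes the stated positions are 1-based and inclusive, which is the usual convention but is not spelled out in the text; the placeholder string stands in for an aligned COI sequence and is not real data.

```python
def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based, inclusive coordinates."""
    return end - start + 1


def region(seq: str, start: int, end: int) -> str:
    """Extract positions start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


placeholder = "N" * 700  # stand-in for an aligned COI sequence
haplotype_window = region(placeholder, 133, 651)
window_bp = len(haplotype_window)
```

With inclusive coordinates, +133 to +651 spans 519 positions, matching the region length given for the haplotype comparisons.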
Table 2. Barcode sequence information.
Barcode sequences were also identified with nucleotide identity less than 99%, but greater than 85%. Within this subgroup, representative sequences with highly significant E-values (E < e-156) were selected for further comparisons. The E-value is a parameter calculated by the GenBank BLAST program to assess the significance of each homology match, such that the closer the E-value is to zero, the less likely the match was due to chance (http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html). DNA alignments were performed using Geneious Pro (Drummond et al. 2010) and the CLUSTAL algorithm. A 500-bp segment from +133 to +632 was shared by all sequences in Tables 1–2 and was used for the DNA variation and phylogenetic comparisons. Descriptive DNA sequence statistics and calculations of nucleotide variation based on the Jukes-Cantor (JC) model were performed using DnaSP Ver. 5.1 (Librado & Rozas 2009). Sequence divergences among individuals were calculated using the Kimura 2-parameter distance model (Kimura 1980) and graphically displayed in a neighbor-joining (NJ) tree (Saitou & Nei 1987). Confidence was assessed by bootstrapping at 2000 replications.
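For readers unfamiliar with the two distance corrections named above, the following is a minimal sketch of the standard Jukes-Cantor and Kimura 2-parameter formulas for a pair of aligned sequences. It is not the DnaSP or Geneious implementation; the function names and the toy sequences are invented for illustration, and sites with gaps or ambiguity codes are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_changes(a: str, b: str) -> tuple[int, int, int]:
    """Return (transitions, transversions, compared sites) for aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = n = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return ts, tv, n


def jukes_cantor(a: str, b: str) -> float:
    """JC distance: d = -3/4 * ln(1 - 4p/3), p = proportion of differing sites."""
    ts, tv, n = count_changes(a, b)
    p = (ts + tv) / n
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(a: str, b: str) -> float:
    """K2P distance: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))."""
    ts, tv, n = count_changes(a, b)
    P, Q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Invented 10-site example with one transition (A/G) and one transversion (C/A).
seq1 = "ACGTACGTAC"
seq2 = "GCGTACGTAA"
d_jc = jukes_cantor(seq1, seq2)
d_k2p = kimura_2p(seq1, seq2)
```

Both corrections converge on the uncorrected proportion of differences when sequences are nearly identical, as is the case for the OH haplotypes.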
RESULTS
DNA sequence information from a total of 35 O. hyalinipennis specimens was obtained for a 519-bp portion of the COI gene that is frequently used for barcoding comparisons (Fig. 1). Only 2 polymorphic sites were identified, each associated with base substitutions that together produced 3 haplotypes (OH1-OH3; Table 3). The OH1 haplotype was the most common, with representatives in all locations. OH2 was found in 6 of 7 specimens collected from Puerto Rico, while the OH3 haplotype was found in 1 of 10 specimens collected from Kenya. Polymorphism frequency analysis of the 35 samples found low nucleotide diversity (π = 0.0007) despite the wide geographical and temporal ranges of the collections. As a comparison, a similar analysis of an overlapping barcode segment in Florida populations of the noctuid moth Spodoptera frugiperda (J. E. Smith) gave a value 100-fold higher (π = 0.07; Nagoshi et al. 2011).
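The scale of the reported nucleotide diversity can be illustrated with a rough sketch that computes π as the average number of pairwise differences per site. Several inputs are assumptions rather than reported values: the haplotype counts (28 OH1, 6 OH2, 1 OH3) are inferred from the text on the premise that OH2 and OH3 occur only in the specimens mentioned, each of OH2 and OH3 is assumed to differ from OH1 at a different single site, and the calculation is uncorrected rather than the Jukes-Cantor value produced by DnaSP.

```python
from itertools import combinations


def nucleotide_diversity(counts: dict[str, int],
                         differences: dict[frozenset, int],
                         n_sites: int) -> float:
    """Average pairwise differences per site across all sampled sequences."""
    n = sum(counts.values())
    total_pairs = n * (n - 1) // 2
    diff_sum = 0
    for h1, h2 in combinations(counts, 2):
        # Pairs within the same haplotype contribute zero differences.
        diff_sum += counts[h1] * counts[h2] * differences[frozenset((h1, h2))]
    return diff_sum / (total_pairs * n_sites)


# Assumed haplotype counts for the 35 specimens (inferred, not reported directly).
counts = {"OH1": 28, "OH2": 6, "OH3": 1}
# Assumed number of differing sites between each pair of haplotypes.
differences = {
    frozenset(("OH1", "OH2")): 1,
    frozenset(("OH1", "OH3")): 1,
    frozenset(("OH2", "OH3")): 2,
}
pi = nucleotide_diversity(counts, differences, n_sites=519)
pi_rounded = round(pi, 4)
```

Under these assumptions the result rounds to 0.0007, in line with the reported value; at such low divergence the Jukes-Cantor correction would change the estimate very little.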
A BLAST search of the NCBI GenBank database uncovered 9 sequences with > 99% nucleotide identity with OH1, one identified as O. laetus collected in India and 8 from hemipteran samples of unknown species from Pakistan (Table 2). The O. laetus and 3 of the Pakistan samples were identical to OH1, while the remaining 5 displayed single base changes at 1–2 sites and were designated haplotypes OH4-OH8 (Table 3). All 8 OH haplotypes contained an open reading frame encoding the same conceptual 172-residue amino acid sequence, which shares > 90% identity with other hemipteran cytochrome c oxidase subunit I proteins. This strongly suggests that the OH haplotypes represent alleles of the active mitochondrial gene rather than nuclear pseudogenes.
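One simple screen for nuclear pseudogenes is to confirm that a barcode sequence has a reading frame free of stop codons under the appropriate genetic code. The sketch below is an illustration of that idea, not the procedure used in this study; it uses the stop codons of the invertebrate mitochondrial code (NCBI translation table 5), in which TGA encodes tryptophan rather than a stop, and the test sequence is invented.

```python
# Stop codons in the invertebrate mitochondrial genetic code (NCBI table 5).
# TGA is deliberately absent: it encodes tryptophan in this code.
MITO_STOPS = {"TAA", "TAG"}


def stop_free_frames(seq: str) -> list[int]:
    """Return the forward reading frames (0, 1, 2) that contain no stop codon."""
    seq = seq.upper()
    frames = []
    for frame in range(3):
        # Walk complete codons only; a trailing partial codon is ignored.
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        if not any(codon in MITO_STOPS for codon in codons):
            frames.append(frame)
    return frames


# Invented 21-nt sequence; frame 0 contains TGA, frame 1 contains TAG.
toy = "ATTTGAGCAGGTATAGTAGGA"
open_frames = stop_free_frames(toy)
```

A sequence with no stop-free frame over the barcode region would be a candidate pseudogene; an intact frame is consistent with, but does not prove, a functional mitochondrial copy.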
The remaining DNA sequences found by the BLAST search had less than 90% nucleotide identity to OH1. One was from an unspecified Oxycarenus species (E = 5.64e-180, nucleotide identity = 86%). We chose an additional 10 sequences that were representatives of the 10 genera that displayed the highest sequence similarity to OH1 (all with sequence identity > 85% and E < e-156; Table 2). High bootstrap support (100%) was found for the clustering of the OH haplotypes, and these displayed a closer relationship (84% bootstrap support) to the barcode of the unspecified Oxycarenus species (AY252929) than to that of the other genera (Fig. 2).
Table 3. Sequence polymorphisms in the COI region (+133 to +651) that define the OH haplotypes (*, same as consensus sequence).
DISCUSSION
To estimate the range of barcode variation in O. hyalinipennis, 35 specimens were examined from dispersed sites in both hemispheres with collection dates ranging from 1984 to 2011. Each was classified as O. hyalinipennis using established morphological keys (Brambila 2010; Samy 1969). The barcoding results attested to the consistency of the identifications, as there was little genetic variability between the specimens. When combined with near-identical sequences found in public DNA databases, a total of 8 haplotypes were obtained that appear to be representative of the O. hyalinipennis group (Table 3).
However, an ambiguity arose from the observation that the OH1 barcode sequence is identical to that reported for O. laetus for the 519-bp COI fragment analyzed. This strongly suggests that the COI region is sufficiently similar between O. laetus and O. hyalinipennis that even additional sequencing of the locus is unlikely to be conclusive in distinguishing these 2 species. DNA sequence identity in the barcode region between representatives of 2 species is unexpected, but not unprecedented. A survey of Diptera found that any given barcode sequence had a 6% chance of being associated with more than 1 species and that 21% of species had consensus barcodes that were not specific (Meier et al. 2006). These barcoding anomalies were largely attributed to taxon identification errors within the GenBank database and to the large number of species surveyed that were represented by only a single barcode (Virgilio et al. 2010). The latter do not allow for conspecific verification or indicate the level of sequence variation within the individual species. These reservations are relevant to the O. laetus COI sequence, as it appears to be derived from a single location (Habeeb & Sanjayan 2011).
We also cannot preclude the possibility that O. laetus and O. hyalinipennis may be the same species, particularly given similarities in behavior and host plants (Samy 1969; Sweet 2000). While large differences in color markings can be observed between adults of the 2 groups (J. Brambila, personal communication), there is precedent within the genus for intraspecies variability leading to classification errors. Morphological differences initially led to O. gossipinus Distant, O. annulipes (Germar), and O. fieberi Stål being classified as separate species, but these have since been consolidated into a single species (O. multiformis Samy) after demonstration of substantial cross-hybridization (Samy 1971). In an analogous manner, the same study noted that O. hyalinipennis and 2 other species, O. nigricornis Samy and O. pallidipennis (Dallas), “are separated by some minor differences in the coloration of the antennae and venter in spite of the close identity of their male genitalia and their coexistence in Kenya, South Africa and Uganda” (Samy 1971). Oxycarenus hyalinipennis has also been reported to cross-hybridize with O. lavaterae (F.), though the frequency and biological relevance of such occurrences in the field are not known (Bergevin 1932; Segura et al. 2011). Given these observations, in the absence of similar cross-hybridization studies between O. laetus and O. hyalinipennis, their identification as separate species should probably be considered preliminary.
It is apparent that far more extensive genetic characterizations of O. laetus, O. hyalinipennis, and the rest of the genus are required for accurate species classification using DNA barcoding. This would include additional sequence data from the COI gene, sequence data from other mitochondrial or nuclear genetic markers, and in each case a more representative sampling of field populations of O. laetus and other related species. Obtaining such samples may be difficult when dealing with foreign species, as is typically the case with invasive pests. Furthermore, as species identification methods become more complex, as would be the case if multiple loci have to be sequenced and compared, their application to routine pest monitoring becomes less practical and cost-effective. These reservations generally hold for most invasive pests of concern. While DNA sequence information is expanding rapidly, the barcode regions of the vast majority of arthropod species are uncharacterized, and are likely to remain so for the near future. This means that the problems associated with poor taxon coverage in DNA databases and incomplete taxonomic descriptions will continue to be technical hurdles facing the routine application of DNA barcode-based methods for identifying invasive pests.
Nevertheless, we believe DNA barcode analysis based on a small region of the COI gene can be useful for monitoring invasive Oxycarenus species as a preliminary indicator of a potential intercept. Even with current information, a field-collected specimen with a barcode that falls within the OH clade should be considered a stronger candidate for O. hyalinipennis than one that does not. In addition, the general usefulness of this approach can be improved by shifting the objective away from precise species identification, which requires extensive information on many taxa, to the more limited one of testing for a lack of correspondence to rule out the possibility of specific species of relevance (see Virgilio et al. 2010). For example, species native to Florida that have similar morphology to O. hyalinipennis, have overlapping host ranges, are found in the same traps, or could otherwise be mistaken for the invasive pest can be analyzed to produce a comprehensive domestic barcode database. This would have the advantage of requiring DNA sequence information for a relatively small number of species that should be more accessible than foreign populations. If the barcode of an unknown specimen shows significantly closer similarity to the OH haplotypes than to the domestic barcode database, it would suggest an invasive Oxycarenus intercept and justify further action. We note that the long-term trends of improving efficiency in DNA sequence technology, higher demands for invasive pest monitoring, and rapidly expanding COI sequence databases should make the application of DNA barcoding for routine monitoring increasingly attractive.
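The rule-out strategy described above can be sketched as a simple comparison of nearest distances. Everything in this example is hypothetical: the sequences are invented, the function names are illustrative, and the margin of 0.02 is an arbitrary placeholder rather than a validated threshold for what counts as significantly closer.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites that differ between two sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest(query: str, references: list[str]) -> float:
    """Smallest distance from the query to any reference barcode."""
    return min(p_distance(query, ref) for ref in references)


def flag_possible_intercept(query: str,
                            oh_haplotypes: list[str],
                            domestic: list[str],
                            margin: float = 0.02) -> bool:
    """True when the query is closer to the OH set than to the domestic set
    by at least `margin` (a hypothetical cut-off, not a validated one)."""
    return nearest(query, oh_haplotypes) + margin <= nearest(query, domestic)


# Invented toy barcodes of 20 sites; not real sequence data.
oh_set = ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"]
domestic_set = ["ACGAACGAACGAACGTTCGT"]

suspect = flag_possible_intercept("ACGTACGTACGTACGTACGT", oh_set, domestic_set)
native = flag_possible_intercept("ACGAACGAACGAACGTTCGT", oh_set, domestic_set)
```

A flagged specimen would then go on to morphological examination; the screen narrows the pool of candidates rather than assigning a species.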
In summary, our results demonstrate the limitations and potential usefulness of DNA barcoding to monitor for the entry of invasive Oxycarenus cotton pests into North America. Thus DNA barcodes provide a potentially useful complement to existing morphological criteria. These results justify a more extensive cataloguing of DNA barcodes from different Oxycarenus and related species and ecotypes to enhance the resolution of the barcoding method and to potentially provide new insight into phylogenetic relationships.
ACKNOWLEDGMENTS
We thank Dr. Leroy Whilby (DOACS-DPI-CAPS), Dr. Wayne Dixon (DOACS-DPI-CAPS), and Dr. Susan Halbert (DOACS-DPI-CAPS) for originating and bringing this project to our attention, and also for helpful comments on this manuscript. We also thank Dr. Richard Brown (Mississippi Entomological Museum), Tad Dobbs (USDA), Andrew Derksen (DPI-CAPS), Karolynne Griffiths (USDA-CAPS), and Dr. Amy Roda (USDA-APHIS-CPHST) for providing material for this study. We also thank Matthew L. Lewis (USDA-ARS-BARC) and Dr. Sonja Scheffer (USDA-ARS-BARC) for technical assistance. The use of trade, firm, or corporation names in this publication is for the information and convenience of the reader. Such use does not constitute an official endorsement or approval by the United States Department of Agriculture or the Agricultural Research Service of any product or service to the exclusion of others that may be suitable.