We used mitochondrial (12s, 16s) and nuclear (18s, 28s) ribosomal gene sequences to derive a phylogeny of the Eumastacoidea, with the aims of a) clarifying the position of the Proscopiidae with respect to the Eumastacoidea, b) testing the phylogenetic hypothesis and classification advanced by Descamps 1973b for the Eumastacoidea, and c) deriving a time scale for the phylogeny based on molecular clock calculations. Four different analysis methods were employed: maximum parsimony, neighbor-joining assuming minimum evolution, maximum likelihood and Bayesian analysis. The genes were analysed separately and after concatenation. The sample included 6 of the 7 families and 12 of the 31 subfamilies of the Eumastacoidea, and three proscopiids. We included tetrigoids (as outgroup) and tanaoceroids and trigonopterygoids to provide polarity.
No analysis supported placing the Proscopiidae within any of the existing branches of the Eumastacoidea. Some placed the two taxa as sistergroups within a Eumastacoidea s. lat., and some indicated that they are separate superfamilies. We cannot distinguish between these two possibilities with the present data.
Within the Eumastacoidea s.str. all of the groupings of subfamilies (i.e., families) proposed by Descamps were well-supported. Higher nodes of the phylogeny were in general only weakly supported. Descamps' suprafamilial groupings appeared in some but not all analyses. Of these groupings, the Cryptophalli (=Chorotypidae plus Episactidae) were not well supported, the Stenophalli (Eumastacidae plus Morabidae) were reasonably well supported, while the Disclerophalli (=Thericleidae plus Euschmidtiidae) were strongly supported, and additionally the Gomphomastacinae were associated with it. (The Euphalli, containing only the Indian family Mastacideidae, were not included in the analysis.)
The sequence data did not allow the assumption of a molecular clock, and for this reason the nodes of the phylogeny could not be dated.
Introduction
The Eumastacoidea are a superfamily of the Orthoptera Caelifera, almost worldwide in distribution, but predominantly tropical and entirely absent from Europe, New Zealand and Antarctica. They have long been considered a relatively early branch of the Caelifera, a view confirmed by molecular systematic investigations which place them after the Tridactyloidea and Tetrigoidea but before the remaining superfamilies (Flook & Rowell 1997, Rowell & Flook 1998, Flook et al. 1999). Like the more familiar Acridoidea they are subaerial herbivores of higher plants, but differ from them in numerous aspects of morphology, of which the superficially most obvious are the absence of an abdominal tympanum, small size, wings (when present) widening towards the tip, and, in most taxa, very short antennae and a laterally spread posture of the jumping hind legs when at rest.
Currently the superfamily (excluding the Proscopiidae, see below) contains 295 genera and 1103 species (totals derived from Otte, Eades & Naskrecki 2003). This represents about 10% of the entire Caelifera, making them the largest superfamily after the Acridoidea. Since their original recognition as a systematic group by Stål (1876), the eumastacids have been the subject of several major revisions, which have taken them to family and then to superfamily status and produced an ever-increasing number of subfamilies (Karsch 1889; Brunner 1893, 1898; Burr 1899, 1903; de Saussure 1903; Bolívar 1930, 1932; Rehn and Rehn 1934, 1939, 1942, 1945; Rehn 1948; Dirsh 1961, 1975; Descamps 1973b). The most recent of these (despite the chronologically later appearance of Dirsh's last work) is that of Descamps 1973b, who recognized 4 groupings of families, 7 families and 31 subfamilies (Table 1), principally on the basis of the male genitalia.
Table 1.
Classification of the Eumastacoidea according to Descamps (1973a), and the geographical distribution of the subfamilies.
Descamps' classification, which was explicitly based on a hypothesis of phylogeny (Fig. 1), has been widely adopted, though some secondary authors have either reduced his families to subfamilies and the subfamilies to tribes (e.g., Otte 1994, Otte et al. 2003), or conversely have raised certain of Descamps' subfamilies to family rank (e.g., Kevan 1982). Some of his higher groupings of the taxa have been disputed by Amedegnato (1993). None of these authors, however, has given any reasons for their opinions, so Descamps' scheme can be taken as the current morphologically-based classification.
A much-disputed systematic question concerns the relation of the proscopiid grasshoppers (currently 39 genera and 266 spp., according to Otte, Eades & Naskrecki 2003) to the Eumastacoidea. While it has often been suggested (e.g., by Roberts 1941) and is generally accepted that these two taxa are each other's nearest relatives, their precise relationship is unclear. Descamps (1973a & b) considered them to be separate superfamilies, elevating the former group to the Proscopioidea. Other authors (e.g., Dirsh 1966, 1973; Otte et al. 2003) have placed the proscopiids as a family within a superfamily Eumastacoidea s.l., demoting the Eumastacoidea sensu Descamps 1973b to family rank; in phylogenetic terms this usage implies that the two taxa share a unique common ancestor and are sister-groups within a single clade. A third possibility is that advocated by Amedegnato (1993), who proposed that the Proscopiidae are a branch of the eumastacid Cryptophalli; this would imply that they have a phylogenetic position within the Eumastacoidea sensu Descamps, and that the latter are paraphyletic with respect to the Proscopiidae. The resolution of this question has major implications for the formal classification of the group, as it results in the Eumastacoidea containing seven families, two families or only one.
We have used mitochondrial and nuclear ribosomal gene sequences to derive an independent phylogeny of the Eumastacoidea s.l., with the aims of a) resolving the relationship between the proscopiids and eumastacids, b) testing the phylogenetic hypothesis (and thus the classification) advanced by Descamps 1973b for the Eumastacoidea, and c) deriving a time scale for the phylogeny based on molecular clock calculations. Interest attaches to the last subject because of the biogeography of the modern Eumastacoidea and the fact that the earliest fossil eumastacoid is of Jurassic age (Sharov 1978), implying that existing lineages may have been originally separated by the break-up of Pangaea and subsequently of Gondwanaland in the Mesozoic period. This hypothesis could be tested by a dated phylogeny. The scope of our project was unfortunately limited by the availability of fresh or suitable alcohol-preserved material; we were unable to derive adequate DNA for our purposes from dried museum specimens. As a consequence, our investigation is limited to representatives of 6 of the 7 families and 12 of the 31 subfamilies. This is, however, adequate to examine many features of Descamps' classification and to clarify the position of the Proscopiidae. Most of the results have been previously presented by Matt (1998) in an internal publication.
Materials and methods
The taxa used in this investigation are listed in Table 2. In addition to Eumastacoidea and Proscopioidea, we included representatives of the more basal superfamily Tetrigoidea (for use as the outgroup) and of the more derived superfamilies Tanaoceroidea and Trigonopterygoidea (to allow examination of the relationship of the Eumastacoidea to the Proscopioidea).
Table 2.
Taxa used in this study.
Specimens were collected into several changes of 95% ethyl alcohol and kept at approximately 5°C prior to DNA extraction.
We sequenced parts of the 12s and 16s mitochondrial ribosomal genes, and the complete 18s and 28s nuclear ribosomal genes. 12s and 16s sequences were obtained for all 39 taxa listed in Table 2, and 18s sequences for 23 of these. 28s sequences were obtained for 21 taxa; sequences from Trigonopteryx and Tanaocerus were not obtained for this gene.
The laboratory methods for 12s & 16s sequences have been presented in detail by Flook and Rowell (1997a). In brief, fragments of the mitochondrial 12s and 16s ribosomal RNA genes were amplified by PCR and both strands sequenced. The sequences were aligned and ambiguous portions rejected. For further details and rationale see Flook and Rowell (1997a, 1997b) and Matt (1998).
The alignment can be obtained on request from the authors.
Phylogenetic analysis was carried out using the programmes PAUP* 4.0 (version Beta10 and previous versions) (Swofford 2000) and MrBayes (Huelsenbeck & Ronquist 2002). A ratio of 2:1 was used for weighting transversions against transitions, following our 1997 work. Analyses were made using maximum parsimony, neighbor-joining (minimum evolution), maximum likelihood (ML) and Bayesian probability methods. The sequences of the four genes were analysed separately and also after being concatenated in a “total evidence” approach.
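The 2:1 weighting of transversions against transitions can be illustrated with a simple pairwise count. This is only a sketch of the weighting idea, not the step-matrix machinery of PAUP*; the function name, its parameters and the two short sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def weighted_differences(seq_a, seq_b, tv_weight=2, ts_weight=1):
    """Sum substitution weights over aligned positions of two sequences.

    Transitions (A<->G, C<->T) cost ts_weight, transversions cost tv_weight.
    Positions with gaps or ambiguity codes are skipped.
    """
    total = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        total += ts_weight if same_class else tv_weight
    return total

# Hypothetical aligned fragments: one transition and two transversions.
score = weighted_differences("ACGT", "GCTA")
print(score)
```

With equal weights (tv_weight=1) the same function returns the plain number of differing sites, which makes the effect of the weighting easy to compare.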
As outgroup for the main analysis we used a batrachideid tetrigoid, representing a superfamily which we have previously shown (Flook & Rowell 1997a, Flook et al. 1999) to be the sister group of the Eumastacoidea plus all later Caelifera. A larger sample of 10 tetrigoids was used as the outgroup for a subsidiary analysis of the 12s+16s dataset only.
The admissibility of molecular clock assumptions was tested by comparing the log likelihood scores of ML trees run with and without clock constraints, treating twice the difference between the scores as a value of χ² with N-2 degrees of freedom, where N is the number of taxa in the tree.
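The clock test described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names and the log likelihood values are hypothetical, and the chi-square tail probability is computed with a small standard-library routine that is valid only for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the recurrence Q(a+1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a+1)
    for the regularized upper incomplete gamma function, with y = x / 2.
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))     # Q(1/2, y)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

def clock_lrt(lnl_clock, lnl_free, n_taxa):
    """Likelihood ratio test of a clock-constrained against an unconstrained tree."""
    stat = 2.0 * (lnl_free - lnl_clock)   # twice the log likelihood difference
    df = n_taxa - 2                       # N - 2 degrees of freedom
    return stat, df, chi2_sf(stat, df)

# Hypothetical log likelihoods for a 23-taxon tree (not values from this study).
stat, df, p = clock_lrt(lnl_clock=-10250.0, lnl_free=-10200.0, n_taxa=23)
print(f"2*dlnL = {stat:.1f}, df = {df}, p = {p:.3g}")
```

A small p value means that the clock-constrained tree fits significantly worse, so the clock assumption is rejected.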
Results
1. Sequence statistics
12s+16s:
The initial length of the combined sequences was 957 bp, of which 395 bp were derived from the 12s gene and 562 from the 16s gene. 173 bp were excluded from the analysis as being ambiguously aligned, leaving a final length of 784 bp. Of these, the composition was A=0.33054 C=0.10688 G=0.17602 T=0.38656.
18s:
The initial aligned sequence consisted of 3243 bp. Of these, 1331 positions were excluded, either because there were gaps in all the species we used or because of ambiguous alignment, leaving 1913 bp for analysis. Their composition was A=0.24434 C=0.23283 G=0.27625 T=0.24659.
28s:
The initial aligned sequence consisted of 2363 bp. Of these, 81 positions were excluded because of ambiguous alignment, leaving 2282 bp for analysis. Their composition was A=0.20903 C=0.26934 G=0.32782 T=0.19381.
Combined sequences:
Of the 6564 total characters, 5201 are constant, 540 are variable but parsimony-uninformative, and 824 are parsimony-informative. 1585 characters were excluded, leaving 4979 for analysis. Their composition was A=0.24463 C=0.22613 G=0.28007 T=0.24917.
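The site categories and base compositions reported above can be tallied from an alignment as in the sketch below. It assumes the usual definition of a parsimony-informative site (at least two states, each present in at least two taxa) and treats gaps and ambiguity codes as missing data; the function names and the toy alignment are hypothetical.

```python
from collections import Counter

def classify_sites(alignment):
    """Count constant, parsimony-uninformative and parsimony-informative columns."""
    tally = {"constant": 0, "uninformative": 0, "informative": 0}
    for column in zip(*alignment):
        states = Counter(b.upper() for b in column if b.upper() in "ACGT")
        shared = sum(1 for n in states.values() if n >= 2)
        if len(states) <= 1:
            tally["constant"] += 1
        elif shared >= 2:
            tally["informative"] += 1
        else:
            tally["uninformative"] += 1
    return tally

def base_composition(alignment):
    """Relative frequencies of A, C, G and T over all sequences."""
    counts = Counter(b for seq in alignment for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}

# Hypothetical four-taxon alignment of five sites.
toy = ["ACGTA", "ACGAA", "ATGAC", "ATCTC"]
sites = classify_sites(toy)
freqs = base_composition(toy)
print(sites)
print(freqs)
```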
The 12s+16s sequences contained the most phylogenetic signal, as can be seen from the evaluation of random trees generated from the data (Table 3). The 18s dataset contained comparatively little signal, and the 28s sequences are intermediate.
Table 3.
Evaluation of sets of 10000 random trees generated from the sequences of the three genes.
2. Phylogenetic reconstruction
Corresponding to the above analysis, the 12s+16s sequences produced relatively well-supported trees, whereas the 18s sequences yielded poorly resolved trees with minimal bootstrap or posterior probability support. The 28s sequences were intermediate between these two extremes. However, those groupings which the nuclear sequences did suggest, were often concordant with those derived from the 12s+16s sequences. As might be expected from the above, analyses of the combined sequences yielded results most similar to those obtained from the 12s+16s sequences alone.
Likelihood ratio analysis indicated that the data were best fitted by the general time-reversible model with gamma correction (GTR+G), and this model was used in Maximum Likelihood and Minimum Evolution analyses.
The different methods of analysis yielded similar but not identical results. In general, the highest levels of branching (corresponding to the superfamilies) are moderately well supported and resolved, and the lowest ones (corresponding to families and subfamilies) very well supported, whereas the intermediate level branches, corresponding to the branching order of the different clades of the Eumastacoidea, were not at all well supported. Most of the differences between the various trees are due to instability at this intermediate level.
Figures 2–5 show the trees produced from the various datasets by Maximum Likelihood methods, with the support adduced by the remaining methods indicated at each node (in turn, Bayesian posterior probability, Minimum Evolution nonparametric bootstrap values, and Maximum Parsimony nonparametric bootstrap values). (The probabilities associated with the Bayesian analysis were usually higher than those yielded by other methods, agreeing with the findings of Suzuki et al. (2002).) Important differences in topology between the various methods are also indicated. Branches which are “obviously wrong” (i.e., which conflict with the placings of the relevant taxa according to both the other genes and morphology) are shown with dotted lines.
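Nonparametric bootstrap values of the kind shown at the nodes are obtained by resampling alignment columns with replacement and re-estimating the tree from each pseudoreplicate. Only the resampling step is sketched below; tree estimation is omitted, and the function name, the seed and the toy alignment are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns drawn with replacement, same length."""
    n_sites = len(alignment[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in alignment]

# Hypothetical alignment; a fixed seed makes the resampling reproducible.
toy = ["ACGTA", "ACGAA", "ATGAC", "ATCTC"]
replicate = bootstrap_alignment(toy, random.Random(1))
print(replicate)
```

The support for a node is then the percentage of pseudoreplicate trees in which that node is recovered.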
3. Phylogenetic reconstruction from 12s+16s sequences only for a larger sample of Eumastacoidea
For this analysis we used all eumastacoids for which we had 12s and 16s sequences, including some for which the sequences of the other genes were lacking, and included 10 tetrigids in the outgroup. The result of a Bayesian analysis is shown in Fig. 6. It differs from the 12s+16s analysis of the smaller sample in two important aspects: the proscopiids are placed as a separate superfamily (Proscopioidea), and the Chorotypidae are not placed basally within the Eumastacoidea. In other respects the two analyses are similar.
4. Justification of a molecular clock assumption
The difference between the log likelihoods of clock-constrained and unconstrained ML trees was always very highly significant. This applied to all the gene sequences, either alone or in combination. It was also true after the trigonopterygoid and tanaoceroid taxa were excluded from the sample, leaving only the Eumastacoidea s.l., and after the proscopiids too were excluded, leaving only the Eumastacoidea s. str. The assumption of a molecular clock is clearly unjustified for these data.
Discussion
A. The status of the Proscopiidae
As described in the introduction, there are basically three proposals for the position of the Proscopiidae: 1) as an independent superfamily, the Proscopioidea, as most recently proposed by Descamps 1973a; 2) as the sister group of the Eumastacoidea sensu Descamps 1973b; 3) contained within the Eumastacoidea sensu Descamps 1973b. Our analyses (Fig. 7) most often support the first of these proposals, less frequently the second, but never the third. We therefore reject with some confidence Amedegnato's (1993) suggestion that the Proscopiidae are an early branch of the Cryptophalli, but cannot distinguish between the first two possibilities.
B. Relationships within the Eumastacoidea sensu Descamps.
1. The Cryptophalli.
Descamps' Cryptophalli include two families, the Episactidae and the Chorotypidae. Our analysis recovers both these families, with strong support. However, we have no good molecular evidence that they are closely related, despite their apparently convincing morphological resemblances. Only the analysis of 28s sequences puts the two families as sister-groups, and with merely 43% posterior probability support. The tree yielded by the concatenated genes puts the Chorotypidae as the most basal eumastacid lineage, with the Episactidae the second most basal. The 12s+16s tree also has the Chorotypidae basal, but the Episactidae derived, as sistergroup to the Stenophalli, and the 18s tree does not retrieve the Chorotypidae at all.
Dirsh (1975) and Amedegnato (1993) have suggested that the Central American and Caribbean Episactidae are instead most closely related to the predominantly S. American Eumastacidae. Only the 12s+16s tree, which links the Episactidae to the Stenophalli, lends any support to this proposal, and it does not show a sister-group relation between the Eumastacidae and the Episactidae. Rowell & Perez (2006) recently found no morphological support for linking the Episactidae with the Eumastacidae.
1.1. The Chorotypidae
Our sample includes only two chorotypid genera, Erucius and Erianthus, and these, as expected, group as a clade, confirming the morphological evidence that their two subfamilies (Eruciinae and Erianthinae) are related. It is unfortunate that we were unable to include any species of the West African genus Hemierianthus, which is considered to be the only non-Asian representative of the family.
1.2. The Episactidae
Descamps' Episactidae (Table 1) include four subfamilies, the Episactinae (C. America), Espagnolinae (Hispaniola), Teicophryinae (Mexico) and Miraculinae (Madagascar). At the time of Descamps' writing, the Espagnolinae were based on only a single species, Espagnola darlingtoni Rehn & Rehn. Since then, the description of further Hispaniolan genera and of adults of Antillacris explicatrix Rehn and further species of this allegedly (Rehn 1948) episactine genus (Perez et al. 1997a, b; Perez & Rowell 2006) has made it seem probable that all the Hispaniolan eumastacids form a clade (the Espagnolinae) and are more distantly related to the Central American Episactinae. This hypothesis is supported by cladistic analysis of their morphology (Rowell & Perez 2006), and by our present analysis, which places the Hispaniolan genera Espagnola, Espagnolopsis, Antillacris and Tainacris together in a single clade, with the Central American genus Episactus as its sister group (Fig. 5). We have no molecular evidence bearing on the Teicophryinae or the Miraculinae.
2. The Stenophalli
Descamps' Stenophalli comprise two families, the predominantly South American Eumastacidae and the Australian and Papuan Morabidae (Table 1, Fig. 1).
Our analysis recovers both these families (although it excludes the Gomphomastacinae from the Eumastacidae, see below). The 28s analysis places the two families next to each other in branching order, but positions them as the most basal eumastacids, which is morphologically improbable. The 12s+16s analyses and the concatenated sequences place them as sistergroups within a single clade corresponding to the Stenophalli. This result supports Descamps' hypothesis and conflicts with that of Amedegnato (1993), who disputed a close relationship between these two families and proposed that the concept Stenophalli should be abandoned. The implication of the present-day distribution of the Stenophalli is of course that the lineage was established in West Gondwanaland before the mid-Cretaceous break between South America and Antarctica-Australia.
2.1. The Morabidae
The Biroellinae of New Guinea and Australia were first linked to the endemic Australian Morabinae by Dirsh (1956), who, however, later (1975) changed his mind. Our analysis resolves the two taxa as sistergroups within a unique clade, supporting Dirsh's earlier view and that of Descamps. The present-day distribution of the family makes it probable that it originated in Australia. The current preponderance of biroelline species of relatively primitive morphology in New Guinea is probably due to the reduction of suitable habitat (rainforest) in Australia in geologically later times; the only modern Australian species are found in the wet forests of Queensland. The exclusively Australian Morabinae have adapted to other, drier, habitats.
2.2. The Eumastacidae
The close relationship of the tropical South American subfamilies subsumed by Descamps under the Eumastacidae has never been questioned, and it is supported in our analysis, in which the Eumastacinae and the Temnomastacinae are resolved as sistergroups.
Descamps also included within this family three other subfamilies of more obscure relationships — the Cuban Masynteinae, first linked with the Eumastacinae by C. Bolívar (1930), the Argentinian and N. American Morseinae, and the Central Asian Gomphomastacinae. Of these three, our sample includes only the last, represented by the genus Phytomastax. The Gomphomastacinae have been difficult to place on morphological grounds. Rehn (1948) linked them to the Morseinae, Dirsh (1975) saw similarities with the Morabinae and Eumastacidae (both authors thus foreshadowing Descamps), and Amedegnato (1993) has proposed that they are related to the Mexican Teicophryinae and the South American Proscopiidae (see A. above). Our analysis does not support any of these hypotheses, but instead consistently links the Gomphomastacinae with the African Disclerophalli (see next section).
3. The Disclerophalli
Descamps' Disclerophalli included two families, the Thericleidae and the Euschmidtiidae. The latter family was split from the former by Rehn (1948). These African eumastacids have been thought of as a unitary group by all workers ever since the erection of the group Thericleis by Burr (1899). Our analysis confirms this view: in the 12s+16s analysis and the concatenated sequences, the two families are resolved as sister groups within a clade corresponding to the Disclerophalli. A surprise finding is the appearance of the gomphomastacine genus Phytomastax as a basal member of the Disclerophalli clade, a position not previously suggested by any author nor supported by any known morphological character set, but which occurs in all our analyses.
3.1. The Thericleidae and the Euschmidtiidae
Our sample is too restricted to throw any light on the internal relationships within these families, other than to confirm that the Plagiotriptinae and the Thericleinae both cluster within a clade (Thericleidae) as expected. The Seychellian Euschmidtia cruciformis is confirmed as being a close relative of the mainland East African species of this genus (Fig. 5), emphasizing the surprising dispersal abilities of this flightless group.
3.2. The Gomphomastacinae
On the basis of modern biogeography, which in other taxa of the Eumastacoidea is quite a good indicator of relationships, one would expect the Gomphomastacinae to be most nearly related to the Asian Chorotypidae. In the Upper Jurassic and Lower Cretaceous the Shan-Thai and Indo-China plates were not far removed from the Kazakhstan plate (Van der Voo et al. 1999) and there was contiguous land between them. The later (Upper Cretaceous) arrival of the Indian plate (Rage et al. 1995) and consequent rise of the Himalaya would explain the modern distribution of the taxon over these regions. A relationship with the African families is harder to explain through palaeogeography, but the existence of a modern West African genus of Chorotypidae (see above) indicates that it may not be impossible.
4. Prospects for further analysis
Within the limitations of the systematic sample we used, the number of taxa, the number of separate genes analysed, the total number of base pairs and the number of informative sites all seemed a priori to offer a good chance of producing well-resolved trees. While the analysis has produced good independent support for all the subfamily and family level taxa, we were unable to obtain well-supported topologies that indicate the higher relationships within the Eumastacoidea. This suggests that other genes, possibly a considerable number of them (see Rokas et al. 2003), will be needed to improve on the present analysis.
Acknowledgments
We extend our very sincere thanks to all the collectors listed in Table 2, without whose generosity and time the study would not have been possible. We also thank Urs Stiefel for technical assistance. Supported in part by the Swiss National Science Foundation.