Delimitation of species and identification of specimens to the species level continue to be difficult problems for practising entomologists, particularly those in tropical countries who often have no access to the holotype specimens or original literature of their local fauna. As a first step to the development of accurate Web-based species identification keys for Indian gryllids, we have examined the utility of morphological and song characters in correctly delineating species boundaries among 4 sympatric species of tree crickets of the Genus Oecanthus. Using a numerical taxonomic approach, phenetic clusters and ordinations were constructed on the basis of morphological and song characters. Quantitative and qualitative morphological characters were analysed independently and the results compared. The efficacy of the clustering and ordination techniques in species delimitation was examined by both internal and external allocation of individual specimens. Both the delimitation of species and the allocation of new specimens were 95 to 100% accurate using song or qualitative morphological characters. Quantitative morphological characters could also accurately delimit species, provided a large number of characters were used, irrespective of the specific characters chosen. For quantitative morphological characters, ordination was found to be more accurate than cluster analysis, both for delimiting species and in the allocation of new specimens.
Although the classification and identification of species have been the subject of active inquiry for centuries, accurate taxonomic identification to the species level continues to pose a difficult problem to practising entomologists, particularly those working in the tropics. Crickets (Suborder Ensifera) of the Indian subcontinent provide a good example of a taxonomic group in which identification to the species level is difficult. In the case of the family Tettigoniidae, there are no taxonomic keys or comprehensive monographs on the Indian fauna (Ingrisch & Shishodia 1998). The keys for taxonomic identification of the Indian Gryllidae are provided in Chopard's “Fauna of India and the Adjacent Countries” (1969). Although this is a monumental treatise on the subject, the taxonomic keys in this work are often not sufficient for unambiguous assignment of specimens to the species, and sometimes even the generic, level. The reasons for this are briefly outlined below.
The first reason is imprecise character definition which, together with a paucity of illustrations to effectively convey the exact nature of the defined characters, makes it difficult to follow either keys or descriptions. The species descriptions are brief and rarely exhaustive and suffer from an inconsistent inclusion of characters, even among descriptions of very similar species. The second reason lies in the completely hierarchical, dichotomous structure of the keys which, combined with the ambiguity in the definition of some of the distinguishing characters, makes it difficult to follow the keys correctly.
The third reason is the lack, in some cases, of sufficient sample sizes and of objective criteria for defining an individual specimen as the standard reference holotype for a species: holotypes have sometimes been designated without examination of a sufficient number of specimens to take into account inter-individual variability in different characters. For example, the discrimination between some species of crickets (Chopard 1969) is made on the basis of differences in the number of tibial spines: there were several cases that we examined, however, where the difference in spine number between the left and right tibiae of the same specimen exceeded or equaled the variation in spine numbers between the designated species. In the absence of other distinguishing characters in the keys or descriptions, it was essentially impossible to assign such specimens to a given species.
These problems are further compounded for taxonomists in tropical countries by the inaccessibility of both the taxonomic literature on previous descriptions and revisions, and the reference specimens or holotypes, which are largely available only in museums in Europe and North America, as a result of a history of colonisation. Added to this is the global decline in the number of professional taxonomists, particularly for invertebrate animal groups (Gaston & May 1992).
A possible solution to some of these problems, now beginning to be implemented, is the development of Internet-accessible taxonomic databases (Godfray 2002, Mallet & Wilmott 2003). The Orthoptera Species File Online (Otte & Naskrecki 1997) is a useful first step in this direction. Although it provides valuable information on past literature and, increasingly, pictures of holotype specimens and recordings of songs, it is still not sufficient to make a taxonomic identification. One is made aware of the existence of past literature, but the actual works are largely unavailable. The pictures, although an important asset, do not on their own allow identification, since some of the key characters may not be visible or described in detail. In addition, what are urgently needed are databases that provide good taxonomic keys for unambiguous identification.
Our aim, in the long term, is to develop Web-based taxonomic keys for Indian gryllids, based on an extensive and detailed examination of several characters in a consistent and systematic manner. The keys would be either entirely or partially probabilistic and based on the evaluation of a sufficient number of individuals of each species. These would be verified against the voucher specimens (holotypes) of described species for historical continuity of nomenclature, but we envisage the consequent freedom from conventional holotype referencing for users of such a database. The holotype in the new system is envisaged as a virtual one which essentially represents the mean value of all measured characters (Mayr 1976). Voucher specimens (of both described and undescribed species) would serve as paratypes and the details of their morphology, both quantitative and qualitative, could be made available on the Internet.
As a first step towards this objective, we have adopted a numerical taxonomic approach (Sneath & Sokal 1973) to the classification and identification of 4 species of tree crickets of the Genus Oecanthus (Serville). Whereas the use of numerical taxonomy for the classification of higher level taxa is highly controversial (Ridley 1986), we believe that its methodology may be well suited to the problem of species identification and allocation, particularly in taxonomic groups that tend to be fairly homogeneous in their morphology. In contrast to classical taxonomic methods, numerical taxonomy uses a large number of characters in a consistent and systematic manner (Sneath & Sokal 1973). The techniques used take into account inter-individual variation in different characters in order to define the species clusters, rather than referencing inaccessible type specimens.
Our objectives in the following work were to 1) evaluate the accuracy and utility of a numerical taxonomic approach to species classification and allocation in a gryllid taxon and 2) to compare the effectiveness of song and morphological characters in obtaining correct species classification and allocation.
The tree cricket genus Oecanthus was chosen for this study because of the homogeneity of morphological characters in species of this genus, which was noted by Chopard (1932). It is represented by 4 currently described species (Oecanthus rufescens Serville, Oecanthus henryi Chopard, Oecanthus bilineatus Chopard and Oecanthus indicus Saussure) on the Indian subcontinent, all of which are syntopic in some regions of Southern India, including the region in and around Bangalore. The 4 species are, however, unambiguously defined by classical taxonomy based on diagnostic morphological characters (Chopard 1969, Otte & Alexander 1983) and we also have data from our own (unpub.) behavioral studies on reproductive isolation between them. The genus thus provides a group of related, morphologically similar but unambiguously classifiable species, making it a good system to evaluate the validity of a numerical taxonomic approach to species classification and identification.
Materials and Methods
Song recording and analysis.—The recordings of calling songs of the 4 Oecanthus species used in this study were made in the field using a Sony WM-D6C Professional Walkman and a Sony microphone ECM-MS957 (flat frequency response: 50 Hz to 18000 Hz). The microphone was placed at a distance of about 15 cm from the calling male. The ambient temperature in the vicinity of the calling male was measured after each recording with a thermometer (Kestrel 3000). We recorded the calling songs of 29 individuals of O. rufescens, 29 individuals of O. henryi and 33 individuals of O. indicus at temperatures ranging from 16° C to 28° C. Recordings of the 4th species O. bilineatus (10 individuals) were made at temperatures ranging from 22° C to 24° C, since the adults of this species were found mostly during the monsoon season, when the range of ambient temperature was small. Individuals were captured after recording their songs and their species identity verified on the basis of morphological characters according to Chopard (1969). The tapes of all of the recordings are stored at the Centre for Ecological Sciences (CES), Indian Institute of Science (IISc), Bangalore, India for reference.
The song recordings were digitized using a Creative Sound Blaster A/D Card at a sampling rate of 44 kHz for spectral and temporal analysis. Spectral analysis was performed using the signal processing software Spectra Plus Professional (1994, Version 3.0, Pioneer Hill Software, Poulsbo, WA). Temporal pattern analysis was performed using a custom-built program (Chandra Sekhar, ECE, IISc) in Matlab (1997, Version 188.8.131.521, The Mathworks Inc., Natick, MA) and the following song characters were measured: call duration, call period and syllable period.
Collection of specimens.—Ten males and 2 females of each of the 4 Oecanthus species were collected from the campuses of the Indian Institute of Science (IISc) and University of Agricultural Sciences (UAS), Bangalore, by S. Biswas, B.U. Divya, S. Swamy and R. Balakrishnan. These specimens were preserved in 70% alcohol for morphological studies. The collection is maintained at CES, IISc, Bangalore, India, for reference. In addition, 10 male specimens each of O. henryi and O. indicus, and 9 male specimens of O. rufescens, were collected from the same sites for use in external allocation (see section on statistical analysis below).
Morphological measurements.— A total of 54 morphological characters (Appendix) were used in this study, of which 12 were qualitative and 42 quantitative. Measurements of quantitative morphological characters were made using a binocular stereo zoom microscope (Labomed) with a graduated eyepiece. The resolution of the measurements was 0.1 mm.
Peg number.— The number of pegs (teeth) in the stridulatory file of each male specimen was determined by removing the right tegmen (forewing) and mounting it, ventral surface up, on a microscope slide. The pegs were then counted under an optical microscope (Olympus SZX12).
Wing area measurements.— The right tegmen mounted on the slide for the stridulatory peg count was used for measuring the wing area as well. The mounted wing was photographed using a digital camera (Olympus DP11) attached to the optical microscope. The areas of the mirror, dorsal field and lateral field of each right forewing were measured using the MapInfo Professional software program (Version 5.0, Desktop Mapping Solutions, Inc.).
Metanotal glands.— The metanotal glands of all the male specimens were first carefully observed under a stereo zoom microscope (Labomed) to look for structures that could be used as additional characters for species identification. They were photographed using a digital camera (CoolSNAP-ProfCOLOR) under a stereo microscope (Olympus SZX12).
Scanning electron microscopy.— The stridulatory files were cut from the dried wing, attached to specimen stubs with carbon tape, placed in a high-vacuum evaporator, and coated with a layer of gold. The gold-coated files were then photographed with a scanning electron microscope (JSM-5600 LV, resolution 5 nm).
Oecanthus rufescens.—Lectotype, 1 ♀ (Museum National d'Histoire Naturelle, Paris, France). Male missing (L. Desutter-Grandcolas, pers. comm.); morphological description matches that of the Australian O. rufescens (Otte & Alexander 1983) and so do all song features except the syllable repetition rate, which is higher in our species. The assignment of the O. rufescens in the current study (locality: Bangalore) to the same species as the Australian O. rufescens (Otte & Alexander 1983) must be regarded as tentative (L. Desutter-Grandcolas, pers. comm.).
Oecanthus indicus: The specimens in the current study were compared against paratypes and fit the original species description (L. Desutter-Grandcolas, Museum National d'Histoire Naturelle, Paris, France: pers. comm.).
Oecanthus bilineatus: Verified against paratype specimens (Forest Research Institute, Dehra Dun, India) originally designated by L. Chopard. Courtesy: Sudhir Singh, FRI, Dehra Dun.
Oecanthus henryi: Type specimens could not be traced at the National Museum of Natural History, Colombo, Sri Lanka and are possibly missing. Location of any other designated paratypes is not specified in the original descriptions (Sandrasagara 1954, Chopard 1969). Specimens used in the study fit the species description (Chopard 1969).
Morphological characters.— Multivariate analysis was performed separately for qualitative and quantitative morphological characters. For multivariate analysis using qualitative characters, the states of each character were given integer codes (0, 1, 2, 3....n) depending on the number of character states. These values were not standardized since all characters were coded as integers. A dissimilarity matrix was then calculated from these data for use in further analysis. For quantitative morphological characters, the values of each of the 42 characters were standardized by subtracting the mean value and dividing by the standard deviation for each character (Manly 1986). A Euclidean distance matrix was then calculated from these standardized variables for use in further analysis.
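The standardization and distance calculation for quantitative characters can be sketched as follows. This is a minimal illustration using only the Python standard library, not the Statistica procedure used in the study. The measurements are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not specify it.

```python
import math

def standardize(rows):
    """Column-wise z-scores: subtract the character mean, divide by its SD."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Assumption: sample standard deviation (n - 1 denominator).
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def euclidean_matrix(rows):
    """Full symmetric matrix of Euclidean distances between specimens."""
    return [[math.dist(a, b) for b in rows] for a in rows]

# Hypothetical data: 4 specimens (rows) x 3 quantitative characters (mm).
specimens = [
    [12.1, 3.4, 1.0],
    [12.5, 3.6, 1.1],
    [15.0, 4.8, 1.6],
    [15.4, 5.0, 1.5],
]

z = standardize(specimens)
dist = euclidean_matrix(z)
```

Standardizing first keeps characters measured on large scales (such as body length) from dominating the distance over small ones (such as scutal length).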
Song characters.— The song characters used in the multivariate analyses included the mean fundamental frequency, mean call duration, mean call repetition rate and mean syllable repetition rate (means refer to the mean values for an individual). Each character was subjected to linear regression analysis to check for any significant effect of temperature. This was carried out for 3 of the 4 species (excluding O. bilineatus). The values of all song characters that showed significant temperature effects were regressed to 22° C (the temperature at which most of the song recordings of O. bilineatus were obtained) and then used for the multivariate analysis. Song characters were analysed in the same way as quantitative morphological characters for calculating the Euclidean distance matrix.
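One way to read the temperature correction is sketched below: fit a least-squares line of the song character against temperature, then shift each observation along the fitted slope to the reference temperature of 22° C. The exact procedure is not given in the text, so this formulation is an assumption, and the data are hypothetical. The significance test that decides whether a character is corrected at all is omitted here.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def adjust_to_reference(temps, values, t_ref=22.0):
    """Shift each value along the fitted slope to the reference temperature."""
    slope, _ = fit_line(temps, values)
    return [v - slope * (t - t_ref) for t, v in zip(temps, values)]

# Hypothetical syllable repetition rates (Hz) rising 1.75 Hz per degree C.
temps = [18.0, 20.0, 22.0, 24.0, 26.0]
rates = [40.0, 43.5, 47.0, 50.5, 54.0]

slope, intercept = fit_line(temps, rates)
adjusted = adjust_to_reference(temps, rates)
```

With perfectly linear input, every adjusted value collapses to the fitted value at 22° C. With real recordings, the residual scatter around the line is preserved.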
Each of the 3 sets of characters, namely morphological qualitative, morphological quantitative and song, was subjected to analyses using 2 methods: hierarchical clustering (UPGMA: Sneath & Sokal 1973) and an ordination technique (non-metric multi-dimensional scaling, Manly 1986). Statistical analyses were carried out using Statistica (1999, Statsoft Inc., USA) software.
Analysis was carried out in 4 steps for each data set:
1) Clustering and ordination (2-D MDS) to obtain the graphical representation of the distances between individuals in terms of phenetic similarity.
2) Evaluation of the fidelity of the clustering and ordination algorithms in representing the original distance matrix, using the method of cophenetic correlations (Sneath & Sokal 1973).
3) Evaluation of the efficacy of the clustering and ordination algorithms in delimiting species, using internal allocation.
4) Evaluation of the validity of the clusters and ordinations in terms of identifying new specimens correctly (external allocation).
Cophenetic correlations: The minimum distance between all pairs of individuals in the cluster or ordination was calculated to generate a cophenetic matrix. A Pearson correlation coefficient (rcs) was calculated between the values of the original distance matrix and the corresponding values for the cophenetic matrix, providing a measure of the similarity between the cluster or ordination and the original distance matrix. The similarity between 2 clusters or ordinations (rc1c2) was also evaluated using the Pearson correlation coefficient between the cophenetic matrices derived from the 2 clusters or ordinations (Sneath & Sokal 1973).
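For the clustering case, the calculation can be sketched as below. UPGMA is implemented directly, and the cophenetic distance between 2 individuals is taken as the height at which they first join in the dendrogram (the usual definition in Sneath & Sokal 1973). The distance matrix is hypothetical, and the code is a standard-library illustration rather than the Statistica routine used in the study.

```python
import math

def upgma_cophenetic(d):
    """Cophenetic matrix implied by UPGMA clustering of distance matrix d."""
    n = len(d)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices
    dist = {(i, j): d[i][j] for i in range(n) for j in range(i + 1, n)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(dist.items(), key=lambda kv: kv[1])
        # Members of a and b first join at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        na, nb = len(clusters[a]), len(clusters[b])
        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        # UPGMA update: size-weighted mean of the two old distances.
        new_dist = {}
        for c in clusters:
            dac = dist[(min(a, c), max(a, c))]
            dbc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = merged
        next_id += 1
    return coph

def upper(m):
    """Upper-triangle entries of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical distance matrix for 4 individuals.
d = [
    [0.0, 1.0, 6.0, 7.0],
    [1.0, 0.0, 5.0, 6.0],
    [6.0, 5.0, 0.0, 2.0],
    [7.0, 6.0, 2.0, 0.0],
]

coph = upgma_cophenetic(d)
r_cs = pearson(upper(d), upper(coph))
```

The same `pearson(upper(...), upper(...))` call applied to 2 cophenetic matrices gives rc1c2. For an ordination, the matrix of inter-point distances in the MDS configuration would take the place of `coph`.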
Internal allocation of each of the 40 individuals to one of the 4 species was carried out in the following manner: the centroid, defined as the individual possessing the mean value of all measured characters (or modal value in the case of qualitative morphological characters), was specified for each species in the original data matrix, and included in the clustering or ordination analysis. The centroid of each species was thus assigned a particular point in space in the resultant cluster or ordination. The Euclidean distance of each individual was then calculated to the centroids of each of the 4 species in the cluster or ordination, and the individual was assigned to the species to whose centroid its distance was minimum. Since the species identity of each of the individuals was known beforehand, it was possible to evaluate the accuracy of internal allocation for each of the 4 species (as the number correctly assigned out of 10).
External allocation was carried out using new specimens (or songs) that had not been used to construct the clusters or ordinations. Allocation was carried out on one specimen at a time in the following manner: the specimen to be allocated was included in the distance matrix (but the values of its characters were not used to calculate the centroid of its species) and the cluster or ordination analysis was re-run to include the new specimen. The distance of this specimen from the centroids of each of the 4 species was then calculated as described above, and the specimen allocated to the species to whose centroid its distance was the minimum. Again, since the species identity of the new specimen was known, it was possible to evaluate the accuracy of external allocation for each species. External allocation was carried out for all species except O. bilineatus, for which the sample size was too small. The entire analysis, including internal and external allocation, was carried out exclusively on male specimens.
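The nearest-centroid rule behind both internal and external allocation can be sketched as follows. This is a simplified illustration: the coordinates are hypothetical 2-D positions standing in for an ordination, and the step of re-running the cluster or ordination analysis with the new specimen included is omitted.

```python
import math

def centroid(points):
    """Mean of each coordinate across a species' reference specimens."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def allocate(point, centroids):
    """Return the species whose centroid is nearest (Euclidean) to the point."""
    return min(centroids, key=lambda sp: math.dist(point, centroids[sp]))

# Hypothetical ordination coordinates for reference specimens of 2 species.
reference = {
    "O. henryi": [[0.0, 0.1], [0.2, -0.1], [0.1, 0.0]],
    "O. indicus": [[2.0, 2.1], [2.2, 1.9], [1.8, 2.0]],
}
centroids = {sp: centroid(pts) for sp, pts in reference.items()}

# Internal allocation: reassign each reference specimen and count the hits.
internal_correct = {
    sp: sum(allocate(p, centroids) == sp for p in pts)
    for sp, pts in reference.items()
}

# External allocation: the new specimen does not contribute to any centroid.
new_specimen = [0.3, 0.2]
assigned = allocate(new_specimen, centroids)
```

Because the true species of every specimen is known, accuracy is simply the count of matching assignments per species.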
External morphological features of the 4 Oecanthus species
In this section, we provide a detailed and exhaustive description of the morphology of O. henryi, O. indicus, O. rufescens and O. bilineatus, extending previous observations (Chopard 1969, Otte & Alexander 1983). The major distinguishing features (from previous literature) on the basis of which the 4 species were initially classified are briefly summarised. Oecanthus henryi and Oecanthus bilineatus may be distinguished from the other 2 species in that they possess a black spot on the inner face of both the first and second antennal segments (Chopard 1969; Fig. 1, Table 1). The spots of O. bilineatus were, however, distinct from those of O. henryi in being surrounded by a white rim (Fig. 1). In addition, O. bilineatus may be distinguished from O. henryi because the former possesses a mid-dorsal white stripe (flanked by fine black lines) on the head and pronotum (Chopard 1969). Oecanthus bilineatus is also distinct from all the other species in that males have a pair of closely spaced black spots on the elytra near the anal knot (Chopard 1969). Oecanthus henryi does not possess a mid-dorsal stripe on the head or thorax, but may be distinguished by the presence of a black spot at the base of each of the 6 tibiae (Chopard 1969). Oecanthus rufescens can be distinguished from Oecanthus indicus by the row of mid-dorsal spots along the length of the abdomen (Otte & Alexander 1983; Fig. 1, Table 1).
Body color did not always provide a reliable clue to species identity: specimens of O. henryi were always light green, whereas those of O. rufescens were always brown. Both O. bilineatus and O. indicus did, however, occur in both green and brown forms. The states or values of important distinguishing morphological features of the 4 species are shown in Table 1. O. rufescens and O. indicus were larger in size than O. bilineatus and O. henryi (Table 1). There were no significant differences between the sexes in body and elytral length in any of the species other than O. indicus, in which males were significantly larger than females (Table 1).
As is typical in singing species of gryllids, however, male elytra showed a number of specialisations absent in the female: the occurrence of specialised resonating structures such as the harp and mirror on the dorsal field, and the stridulatory structures, including the plectrum and file (Fig. 2). Except for size differences, the overall structure of the elytra was remarkably similar between males of the 4 species (Fig. 2). O. rufescens had a significantly greater elytral width and mirror area than O. indicus, and O. bilineatus had a significantly greater elytral width and mirror area than O. henryi (Table 1). The length of the stridulatory file was not significantly different between O. henryi and O. bilineatus, whereas that of O. indicus was significantly greater than those of O. bilineatus and O. henryi (Table 1). The mean number of pegs on the stridulatory file was not significantly different between O. henryi and O. bilineatus (Table 1). The mean number of stridulatory pegs in O. indicus was, however, significantly lower than in O. rufescens (Table 1). The morphology of the pegs was examined using scanning electron microscopy (Fig. 3): peg structure was essentially identical in all 4 species (data shown only for O. rufescens), with individual pegs oriented at right angles to the file. The ventral surfaces of the pegs were rippled (Fig. 3c) and the pegs evenly spaced along the length of the file. The ultrastructure of the pegs in Oecanthus is quite distinct from that of field crickets of the sub-family Gryllinae (Walker & Carlysle 1975).
The metanotal or Hancock's gland, found only in male tree crickets, is also an important taxonomic character (Walker & Gurney 1967, Chopard 1969). Examination of the structure of the metanotal glands revealed clear distinguishing features between the 4 species (Fig. 4). The absence of the posterior median lobe and glandular pit of the scutum (terminology from Walker & Gurney 1967, Chopard 1969) distinguished O. bilineatus from the other 3 species. Instead, O. bilineatus possessed 2 sets of prominent setae that projected backwards from the posterior margin of the scutum. In dorsal view, the shape of the tubercle of the posterior median lobe of the scutum was characteristic in each of the other 3 species, being mushroom-like in O. henryi, bud-shaped with a ring of setae in O. rufescens, and flattened and dumb-bell shaped in O. indicus (Fig. 4). With respect to quantitative characters, O. bilineatus had a significantly larger scutum, but shorter scutellum, than O. henryi (Table 1); O. rufescens could be distinguished from O. indicus only on the basis of scutal length (Table 1).
Structure of the calling songs of the 4 Oecanthus species
All 4 Oecanthus species were found to be sympatric in areas of natural vegetation in and around Bangalore. Adults of O. henryi, O. indicus and O. rufescens were most abundant during the period from October to February, whereas O. bilineatus was most abundant between June and September. The preferred microhabitats were also different: O. bilineatus preferred higher calling sites in trees, whereas O. rufescens was usually found in dry, grassy areas. O. henryi and O. indicus were found on bushes, with individual males often singing at the same time on adjacent bushes. The peak period of calling activity for all the species was from 7 PM to 10 PM.
The calling songs of O. henryi were the most regular, with short chirps repeated once or twice per second, each chirp being composed of 11 to 17 syllables (Fig. 5). The songs of O. indicus were highly variable in call length, ranging from short chirps to long, irregular trills (even within individual singing bouts). Oecanthus bilineatus and Oecanthus rufescens both produced regular trills: whereas the former were usually 1 to 2 s in length, those of O. rufescens often continued uninterrupted for 10 to 50 s (Fig. 5).
Since temperature is known to affect several of the features of tree cricket calling songs (Walker 1962a), its effect on 4 song features, namely syllable repetition rate, call (chirp or trill) repetition rate, call duration and carrier frequency was examined using a linear regression analysis for 3 of the 4 species. Oecanthus bilineatus could not be analysed since singing males only occurred during the monsoon months when the variation in ambient temperature was low. Analysis of the song structure of the 3 species over the temperature range 17 to 29° C revealed several interesting features (Fig. 6). O. rufescens had the longest call durations, ranging from 1 to 50 s, whereas O. henryi had the shortest calls (0.18 to 0.34 s). Oecanthus indicus had calls that were, on average, longer (0.33 to 1.34 s) than those of O. henryi. As expected from the long call durations, the call repetition rate of O. rufescens was the lowest (0.02 to 0.55 Hz). Interestingly, both O. henryi and O. indicus had similar call repetition rates (1 to 2 Hz), despite the fact that the calls of O. indicus were on average longer than those of O. henryi. Call duration was not correlated with temperature in O. rufescens and O. indicus (p = 0.37 and 0.22 respectively), whereas call duration in O. henryi showed a significant decrease with increase in temperature (p = 0.002). In O. rufescens, call repetition rate was also not correlated with temperature (p = 0.87), whereas this feature showed a linear increase with temperature at a rate of about 0.1 Hz per degree Celsius in both O. henryi and O. indicus (p < 0.001 in both cases).
O. indicus showed the lowest syllable repetition rate (35 to 54 Hz); that of O. rufescens ranged from 38 to 65 Hz, whereas O. henryi had a slightly higher syllable repetition rate (44 to 68 Hz), overlapping with that of O. rufescens. Over the temperature range 22 to 25°C, O. bilineatus had the highest syllable repetition rate of the 4 species (mean = 67.5 Hz). As expected, syllable repetition rate showed a relatively steep linear increase with temperature (1.5 to 2.0 Hz per °C), in all 3 species examined (p < 0.0001 in all cases).
The carrier frequencies of O. rufescens (2.9 to 4.0 kHz) and O. bilineatus (3.7 kHz at 22°C) were, on average, about 1 kHz higher than those of O. henryi (2.4 to 3.3 kHz) and O. indicus (2.2 to 2.8 kHz). The carrier frequencies of the 3 species examined showed a significant linear increase with temperature (p < 0.0001 in all cases), though the rates of increase were slightly different for the 3 species (Fig. 6).
The calling songs of O. henryi and O. indicus showed overlap in the values of 3 of the 4 characters analysed: similar carrier frequencies and almost identical call repetition rates, and some overlap in call duration. The only feature with a nonoverlapping distribution of values was the syllable repetition rate (Fig. 6).
In order to examine the utility of a numerical taxonomic approach to defining species boundaries and in identification, we subjected both the morphological and song characters of all the species to analysis using 2 multivariate statistical methods: UPGMA clustering and multidimensional scaling (MDS). The qualitative and quantitative morphological characters were analysed separately and compared with each other and with song characters, in order to determine the kinds of characters that would be most effective in delineating the species boundaries.
Types of characters.—In the first round of analysis, we used the maximum number of characters measured for each type: 42 quantitative morphological, 12 qualitative morphological and 4 song characters. The results of the clustering and ordination of these 3 data sets are shown in Fig. 7. All 3 sets of characters resulted in 4 discrete clusters corresponding to the 4 known species when subjected to either UPGMA clustering or multidimensional scaling (Fig. 7). The fidelity of the clustering or ordination algorithm in its representation of the original distance matrix was evaluated using the cophenetic correlation coefficient (rcs). Both clustering and ordination showed very high fidelity of representation (0.98 to 0.99) in the case of qualitative morphological and song characters, but multi-dimensional scaling gave a far superior representation in the case of quantitative morphological characters (Table 2).
In the case of 42 quantitative morphological characters (Fig. 7A,B), internal allocation of each of the 40 individuals (Table 2) resulted in 100% (10 out of 10) correct species allocation of O. bilineatus and O. indicus, and 90% (9 out of 10) correct allocation of O. henryi with both clustering and MDS. O. rufescens was, however, 100% correctly allocated using MDS, whereas there was one misallocation using the clustering algorithm. External allocation of 10 new O. indicus and O. henryi males, and 9 O. rufescens, resulted in 100% correct allocation of O. indicus and O. henryi using MDS (Table 2). External allocation using clustering was less successful, with only 40 to 50% correct allocation of O. indicus and O. rufescens, and 80% correct allocation of O. henryi.
In the case of the 12 qualitative morphological characters (Fig. 7C,D), internal allocation of each of the 40 individuals resulted in 100% correct allocation (10 out of 10) for all 4 species, using both the clustering and the MDS methods (Table 2). In external allocation also, there was 100% correct species allocation of the 3 species examined, using both clustering and ordination.
In the case of the clusters and ordinations obtained using 4 song characters (Fig. 7E,F), there was again 100% correct internal allocation for each of the 4 species with both methods. The songs of 10 males (of each of the 3 species O. indicus, O. henryi and O. rufescens) that had not been used to construct the species clusters were employed, one at a time, for external allocation of species identity. The clustering algorithm gave 100% correct allocation (10 out of 10) in the case of O. henryi and O. indicus, and 90% correct allocation (9 out of 10) in the case of O. rufescens (Table 2). The MDS technique yielded 100% correct allocation (10 out of 10) in the case of O. henryi and O. rufescens, and 90% correct allocation of O. indicus.
In summary, both the clustering and ordination techniques were 90 to 100% successful in achieving correct species allocation (with new specimens) in the case of song and qualitative characters. In the case of quantitative morphological characters, however, the MDS was superior and gave close to 100% correct allocation of new specimens, whereas the clustering algorithm performed poorly for 2 of the 3 species in external allocation (Table 2).
Number of characters.—In the analyses described above, the number of characters in the 3 sets to be compared (quantitative morphological, qualitative morphological and song) were unequal, even by an order of magnitude. In order to examine more closely the effect of the number of characters on the efficacy of species grouping and allocation, in the next set of analyses we varied the number of quantitative and qualitative morphological characters used.
Quantitative characters.— Clustering and MDS analyses were carried out using 26, 12 and 4, randomly picked, quantitative morphological characters, from the total set of 42. This was repeated 10 times for each of the sets. The results (in the form of one exemplar from each set) are illustrated graphically in Fig. 8. A visual inspection suggested that the goodness of the clusters in both algorithms deteriorated with a decrease in the number of characters used.
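The subsampling step described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the function names, the toy data, the standardisation of each character and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
import random

def standardize(data):
    """Scale each character (column) to zero mean and unit variance (assumed step)."""
    n_rows, n_chars = len(data), len(data[0])
    scaled = [[0.0] * n_chars for _ in range(n_rows)]
    for j in range(n_chars):
        col = [row[j] for row in data]
        mean = sum(col) / n_rows
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n_rows - 1))
        for i in range(n_rows):
            scaled[i][j] = (col[i] - mean) / sd if sd > 0 else 0.0
    return scaled

def distance_matrix(data, chars):
    """Euclidean distances between individuals, using only the chosen characters."""
    n = len(data)
    dist = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            d = math.sqrt(sum((data[a][j] - data[b][j]) ** 2 for j in chars))
            dist[a][b] = dist[b][a] = d
    return dist

def random_character_runs(data, n_chars, n_runs=10, seed=0):
    """Draw n_runs random character subsets; return (subset, distance matrix) pairs."""
    rng = random.Random(seed)
    scaled = standardize(data)
    runs = []
    for _ in range(n_runs):
        chars = sorted(rng.sample(range(len(data[0])), n_chars))
        runs.append((chars, distance_matrix(scaled, chars)))
    return runs

# Made-up toy data: 6 individuals (rows) x 5 quantitative characters (columns).
toy = [
    [1.0, 2.0, 3.1, 0.5, 4.0],
    [1.1, 2.1, 3.0, 0.6, 4.2],
    [1.2, 1.9, 3.2, 0.4, 4.1],
    [3.0, 5.0, 1.0, 2.5, 7.0],
    [3.1, 5.2, 1.1, 2.4, 7.1],
    [2.9, 5.1, 0.9, 2.6, 6.9],
]
runs = random_character_runs(toy, n_chars=3, n_runs=10)
```

Each distance matrix in `runs` would then be passed to the clustering and ordination routines; with the real data the call would use 26, 12 or 4 characters drawn from the 42 available.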
To examine this more quantitatively, we calculated 1) the cophenetic correlation coefficient rcs between the original distance matrix and the representation as a result of clustering or ordination and 2) the cophenetic correlation coefficient between matrices representing 2 clusters or ordinations, rc1c2 (Sneath & Sokal 1973), of which one (the reference) was always the cluster or ordination that resulted from the analysis of 42 quantitative morphological characters. The mean value of rcs and rc1c2 (based on 10 runs of clustering and ordination) was calculated in each case of 26, 12 and 4 quantitative morphological characters.
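To make the two coefficients concrete, the sketch below computes them for UPGMA clustering on small, made-up distance matrices. It is an illustration under stated assumptions rather than the procedure used in the study: the function names and toy matrices are hypothetical, and the Pearson correlation over the off-diagonal matrix elements is the usual definition of the cophenetic correlation.

```python
import math

def upgma_cophenetic(dist):
    """Cophenetic distances implied by UPGMA (average-linkage) clustering.

    The cophenetic distance between two individuals is the height at
    which they are first joined in the same cluster.
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member individuals
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        size_a, size_b = len(clusters[a]), len(clusters[b])
        merged = clusters.pop(a) + clusters.pop(b)
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # A size-weighted mean keeps the average over all member pairs.
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = merged
        next_id += 1
    return coph

def upper_triangle(m):
    """Flatten the elements above the diagonal into a list."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def matrix_correlation(m1, m2):
    """Pearson correlation between the off-diagonal elements of two matrices."""
    x, y = upper_triangle(m1), upper_triangle(m2)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up distances among 4 individuals: a "reference" and a "reduced" character set.
dist_ref = [[0, 1, 6, 7], [1, 0, 5, 6], [6, 5, 0, 2], [7, 6, 2, 0]]
dist_sub = [[0, 2, 5, 6], [2, 0, 6, 5], [5, 6, 0, 1], [6, 5, 1, 0]]

coph_ref = upgma_cophenetic(dist_ref)
coph_sub = upgma_cophenetic(dist_sub)

r_cs = matrix_correlation(dist_ref, coph_ref)    # original distances vs. clustering
r_c1c2 = matrix_correlation(coph_ref, coph_sub)  # reference vs. reduced clustering
```

For an ordination, the same `matrix_correlation` function would be applied to the inter-point distances in the ordination space instead of the cophenetic matrix.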
The results, summarised in Table 3, showed that the average value of both rcs and rc1c2 increased with the number of randomly picked characters in both clustering and ordination. In other words, the fidelity with which the clustering or ordination represented the original distance matrix increased with the number of characters. The MDS technique was, however, consistently superior to the cluster analysis in its fidelity to the original distance matrix (Table 3). The clusters or ordinations also became progressively more similar to the pattern produced by 42 characters, with rc1c2 increasing from an average of 0.64 for 4 characters to 0.95 for 26 characters, in the case of clustering, and from 0.75 to 0.97, in the case of MDS (Table 3). In addition, the variation in both rcs and rc1c2 decreased (shown as the decrease in the range of values) with increasing numbers of characters: with 26 quantitative characters, all 10 runs produced rcs values greater than 0.8 [the value proposed by Sneath & Sokal (1973) as the minimum for a good representation] and rc1c2 values between 0.9 and 0.98.
Internal allocation was carried out, as described before, on each of the 40 individuals of the 4 species for each of the sets of 10 runs. The results are summarised in Table 4, showing the mean number of correct assignments (out of 10) for each species, with decreasing numbers of characters: interestingly, 26 characters were as effective as 42, with a very small range of values of the number correctly assigned in the 10 runs of randomly picked variables. Whereas the mean number of correct assignments still remained fairly high (8 to 9.5) even with only 4 characters, the range of values became large (from 2 to 10 for O. henryi or 5 to 10 for O. rufescens). The trends were similar regardless of the method: both clustering and MDS gave similar results in this case.
Thus, the identity of the characters picked probably became more crucial as the number of characters for the analysis was decreased. To test this, we repeated the above analysis with a non-random set of 12 variables that happened to give 100% correct assignments. This set consisted almost exclusively of characters of the metanotal gland, stridulatory structures and tympana. Four variables were randomly picked from the above set 10 times and used for clustering and ordination. The resulting average cophenetic correlation coefficients rcs and rc1c2 were both higher than in the case of 4 variables picked randomly from the entire set of 42 (Table 3). The average number of correct allocations (Table 4) was also somewhat higher than in the case of 4 randomly picked variables, though the differences appeared small.
In the next analysis, we examined whether the effectiveness of using a larger number of characters was explained by the fact that it increased the probability of choosing crucial diagnostic characters. In order to test this, we dropped several of the diagnostic characters of crickets that are typically species-specific, including those of the metanotal gland, wing and stridulatory structures, and measurements of tympana, many of which were used in classical taxonomy (Chopard 1969) and which were part of the above set of 12 characters that could unambiguously distinguish the 4 species. We then had a set of 25 general quantitative characters which we used to perform clustering and ordination. Interestingly, the clusters and ordination produced by this set had high cophenetic correlation coefficients (clustering: rcs and rc1c2 = 0.83 and 0.94 respectively; MDS: rcs and rc1c2 = 0.95 and 0.94) and showed 80 to 100% correct internal allocation (Tables 3, 4).
Qualitative characters.— Four qualitative characters were randomly picked from the set of 12 and the above analysis repeated 5 times. Internal allocation (Table 4) showed that whereas there was 100% correct assignment for O. bilineatus, and 90 to 100% correct assignment of O. henryi using both methods, the assignments of O. indicus and O. rufescens were not as successful, with a high variability in the number correctly assigned, particularly with the clustering technique.
Utility of the numerical taxonomic approach to species classification and identification
Our results clearly demonstrate the practical utility of a numerical taxonomic approach to the classification and identification of species of the tree cricket genus Oecanthus from Southern India. Both UPGMA cluster analysis and multidimensional scaling grouped the 40 individuals examined into 4 discrete clusters that corresponded with the species groups based on classical taxonomy. Cluster analysis and MDS were comparable in their high fidelity of representation of the distance matrix and in their 90 to 100% correct allocation of new specimens using song and qualitative morphological characters. With quantitative morphological characters, however, MDS was clearly the better method, owing both to its higher fidelity of representation of the data and to its much higher percentage of correct species allocation of new specimens, and we advocate its use over clustering methods.
Clustering techniques inherently impose hierarchical structures, whether or not they really exist, and may therefore not provide the best representation of the data (Sneath & Sokal 1973). Further, different clustering algorithms may result in different cluster topologies for the same data set, thus rendering phenetic clustering an unreliable method for the classification of higher level taxa (Ridley 1986). Multidimensional scaling does not impose a hierarchical structure and is thus better suited to the problem of delimiting species and allocating individuals to species based on these delimitations. The current study shows, however, that both phenetic clustering and ordination techniques are powerful in delineating species boundaries that are concordant with those based on morphology using classical taxonomic methods.
Numerical taxonomic methods are currently used largely in microbial classification (Sneath 1995), where phylogenetic trees and species boundaries may be difficult to infer due to the frequent occurrence of reticulate transfer of genetic material. These techniques are also used to delimit species in some plant taxa that are characterised by extensive interspecific hybridization, where a cladistic approach could be problematic (McDade 1992). Numerical taxonomic methods were used by Blackith & Blackith (1968) in an attempt to provide a quantitative framework for the classification and ranking of higher order taxa of orthopteroid insects (including phasmids, dictyopterans and dermapterans). Otte (1994) applied cluster analysis to morphometric data on male genitalic structures to define the species groups of Hawaiian crickets of the genus Laupala (Subfamily: Trigonidiinae). To our knowledge, numerical taxonomic methods have not so far been applied to the problem of species delimitation and identification in gryllid taxa. Although this approach has been successful with 4 species of the genus Oecanthus, we now aim to test it on a much larger taxon: the subfamily Gryllinae, comprising the field crickets, of which there are currently 130 reported species from the Indian subcontinent (Chopard 1969).
Song and morphological characters as tools for the delineation of species boundaries
The traditional classification of most taxa, including gryllids, is based largely on morphological characters, both qualitative and quantitative. Chopard (1968) has provided the most extensive classification of gryllids worldwide, based largely on morphological characters of museum specimens. The fact that the Oecanthus species delimited by us, using more quantitative methods and song characters unavailable to him at the time, are concordant with the species that he defined bears testimony to the rigor and intuition of the classical taxonomist. In this context, we have maintained the nomenclature of Chopard (1969) in our analysis, even though one of the 4 species of Oecanthus described therein, O. bilineatus, bears some (but not all) characters in common with the new genus Viphyus of African tree crickets (Toms & Otte 1988).
The demonstration of the role of calling songs in mediating premating isolation between cricket species (Walker 1957) led to their use as reliable characters for the taxonomic identification and classification of species (Walker 1962b, Otte & Alexander 1983). The question of whether calling song structure accurately reflects distinct breeding populations was explicitly tested by Shaw (1999) for the Hawaiian cricket species of the genus Laupala (Subfamily: Trigonidiinae) by examining the concordance between species boundaries implied by mitochondrial DNA haplotypes and those designated on the basis of calling song structure by Otte (1994). For 3 out of 4 sympatric, congeneric communities examined in her study, there was concordance between the species boundaries delineated by the 2 types of characters.
The classification and identification of tree cricket species as practised by most taxonomists today typically uses a combination of morphological and song characters, together with information on range and distribution (Walker 1962b, 1963; Toms & Otte 1988; Otte 1994). Morphological characters that are peculiar to stridulating species, such as the number of pegs and length of the stridulatory file, have proved to be good characters to delimit species (Walker 1962b). Walker & Gurney (1967) also demonstrated the utility of the metanotal gland in species identification of tree crickets. Genitalia usually provide excellent characters for cricket species identification (Chopard 1969), but this is not the case in tree crickets, where it has been difficult to find characters that distinguish congeneric species (Toms & Otte 1988). In our analysis, we have incorporated all of the above characters (except information on range and habitat) and tested the utility of song and morphological characters separately. Genitalic characters were not used, largely because we intend to develop a classification scheme that may be used by nontaxonomists, and the dissection and analysis of insect genitalia requires special skills and knowledge.
Our study shows that, using numerical taxonomic methods, both song and external morphological characters can provide high levels of accuracy in species classification and identification. Song characters were very powerful in correctly delimiting species: 4 song characters could resolve the 4 species with 100% accuracy. In the allocation of new specimens also, song characters correctly allocated all specimens in 2 of the 3 species examined, and misallocated just 1 specimen in the 3rd species. Qualitative morphological characters were superior to quantitative ones for species delimitation in this group of crickets. Quantitative morphological characters, on their own, also successfully delineated species boundaries, provided a large number of characters were used (> 25 in this case). Interestingly, species boundaries were correctly delineated even when all of the diagnostic characters (including stridulatory structures, tympana and metanotal gland) were excluded from the analysis, provided that a sufficient number of characters were included.
Concordance in species boundaries derived from song and morphological characters
In our analysis, clusters derived from morphological characters have been used to represent the phenetic (morphological) species concept, whereas those derived from song characters have been treated as indicative of species as defined by the biological species concept (reproductive isolation: Mayr 1942). For crickets, the latter assumption is probably justified, since the species-specific calling songs are reliable indicators of species identity (Cade 1985, Alexander et al. 1997) and different song structures may be used as legitimate substitutes for actual tests of reproductive isolation between species, particularly those in sympatry (Shaw 1999). Our results demonstrate the concordance in species boundaries provided by song and morphological characters in the genus Oecanthus, implying that the phenetic clusters based on morphology correctly reflect the species boundaries defined by reproductive isolation in the tree cricket species investigated.
Implications for the development of Web-based species identification keys for gryllids
Our results with tree crickets provide one possible method, multivariate analysis, for accurate species-level identification, using different sets of characters. Parallel identification keys could be developed using qualitative morphological characters, quantitative morphological characters and song. This opens up the possibility of identifying species from song recordings alone, from field examination of qualitative characters or from a thorough examination of quantitative morphological characters. Which option is exercised could depend on whether sampling is required to be noninvasive (often the case in biodiversity surveys in protected areas) and on the time and facilities at hand. In the case of the Oecanthus species studied, song and qualitative morphological characters were slightly superior to quantitative morphological characters in terms of accuracy of allocation, which could be another factor influencing the decision on which characters to use to identify species. We prefer numerical taxonomic methods for species-level identification since they use a large number of characters in a consistent, systematic and quantitative manner and can be easily adapted to the construction of taxonomic keys. These methods need, however, to be tested with more species-rich genera and with different geographic variants of species, in order to confirm their general utility for gryllid species identification.
We believe that quantitative information on songs and morphology, including interindividual variation, if available on the Internet, would be more valuable for species-level identification than the qualitative information provided by images of type specimens and oscillograms (the latter being important supplements). Web-based taxonomic keys would also solve the problem of having to identify species in the absence of access to type specimens, since the onus of type verification (for the sake of nomenclature) would be on those who develop the databases and keys rather than on the users. Web-based taxonomic keys using numerical taxonomic methods could provide an objective and quantitative method of performing, and evaluating the probability of, correct species identification.
We are grateful to a number of people who helped with song recording, specimen collection and song analysis. Divya, B. U. and Savita Swamy obtained most of the recordings and specimens; Natasha Mhatre, Vivek Nityananda and Swati Diwakar contributed a lot to the song analysis; Geeta Nayak helped with data acquisition and organisation. Sayantan Biswas initiated this study, carried out some of the preliminary work and gave detailed comments on the manuscript. Many thanks to Saravanakumar for taking the photographs in Fig. 1 and to Maneesha Inamdar of JNCASR for the use of her optical microscope. We also thank the Solid State Chemistry Unit of the Indian Institute of Science for the use of both their optical microscope and their scanning electron microscope. This project was funded by the Department of Science and Technology, Government of India (Project No. SP/SO/C-50/98).
Table 1. Distinctive morphological characters of the 4 Oecanthus species.
Table 2. Internal and external allocation of individuals after clustering and ordination using different types of characters.
Table 3. Cophenetic correlations in relation to the number of quantitative morphological characters used in the analysis.
Table 4. Internal allocation of individuals after clustering and ordination using different numbers of morphological characters.