Version 2 of the Orthoptera Species File Online is now available on the Internet. The database tables are designed to provide improved reliability of the data contained. Other improvements include availability of more ranks in the hierarchy, more flexibility in the user's selection of information to be displayed, new search options, better conformity with the International Rules of Zoological Nomenclature, and automation of many editing functions. Current data cover Ensifera; data for Caelifera will be added. Orthopterists are invited to participate by editing data for groups in which they are working.
Authors of scientific articles have always worried about missing references that are important to their work. Some areas of science change so rapidly that anything written more than a few years ago is unlikely to have any material impact. Taxonomists are not so fortunate. Articles written in obscure journals one or two centuries ago can affect the nomenclatural decisions they must make. Thus literature searches can be time consuming and frustrating. Version 1 of the Orthoptera Species File Online was designed to provide reference data useful to taxonomists and to all others who must correctly refer to orthopteran taxa. Version 2 attempts to improve the reliability of the data, to add details useful to research taxonomists, and to improve the “user friendliness” for both the casual user and the user who adds new data or modifies the existing data.
Some authors have provided important works to address the problem of finding all references related to a taxonomic study. The three volume set, “A Synonymic Catalogue of Orthoptera” (Kirby 1906–1910), a total of 1,765 pages, was a massive early contribution. Johnston (1956, 1968) provided a major reference for African grasshoppers. Roonwal's (1961) huge bibliography included 7,040 references on Acrididae (using an older, broad definition of the family). Many authors contributed to Orthopterorum Catalogus. Some unpublished works are also worthy of mention. They include a checklist by Hebard with further annotation by Hubbell and a computer file by Carbonell. There are probably many other such works unknown to me. The eight volume series “Orthoptera Species File” by Daniel Otte (1994–2000) provides a massive update and extension. The work was done in a database and electronically converted to a form suitable for publication in book form. Using modifications of the database, Otte and Piotr Naskrecki created Version 1 of the Orthoptera Species File Online containing updated information from the first seven volumes of the book series plus many photographs and sound recordings. It can be accessed over the Internet at http://viceroy.eeb.uconn.edu/Orthoptera. The information on Tettigonioidea is also available on a compact disk. (See Naskrecki & Otte 1999.)
The present paper offers information about Version 2 of the Orthoptera Species File Online, which can be accessed by selecting Tettigonioidea or Gryllacridoidea from the Internet URL mentioned above. In August of 1999 I submitted a proposal to the Orthopterists' Society for the development of a new version with additional features. A major objective was to make it easier for many orthopterists to participate in updating the database over the Internet. It is hoped that the new database will be good enough that others will spend time updating the database instead of maintaining separate versions with similar data but smaller scope to fit their specific interests. The Board of the Orthopterists' Society endorsed the concept and established a committee consisting of Theodore Cohn, Piotr Naskrecki, Daniel Otte, and myself to provide advice regarding the project. By late 2000 the database had progressed enough to demonstrate its potential. The Board of the Orthopterists' Society made the committee a permanent standing committee with the additional responsibility to supervise an endowment with annual grants to aid work related to the database. That endowment now stands at approximately $200,000 with annual grants of $10,000. Information about the database was presented at the meeting of the Orthopterists' Society in Montpellier, France in August 2001. Version 2 of the Orthoptera Species File Online now covers all Ensifera. Orthopterists working on Ensifera are invited to participate in updating the database. (See “Invitation to participate” on page 156.)
Providing Internet access to a database with the desired flexibility and ease of use has required a patchwork of software. The biggest problem was selecting the database software, but minor differences in computer languages and editors also caused considerable frustration. Version 1 of the Orthoptera Species File Online uses FileMaker Pro. I was unable to program the functions I wanted in that software, so I initially turned to Microsoft Access. That also lacked the programming flexibility I wanted. The database now uses Microsoft SQL Server, but the Access interface is still useful for administrative and development functions done on the local area network at my offices. The most elaborate programming is done in stored procedures using the Transact-SQL language provided as part of SQL Server. This makes it possible to program complex functions, but the limitations of the editor and its spurious compile error messages are often quite annoying. The Internet pages were developed using Microsoft Visual InterDev. They are nearly all Active Server Pages (ASP) that use VBScript for the portion executed by the server. The flexible displays are accomplished by using the stored procedures to generate pieces of hypertext markup language (HTML), which are passed to the VBScript code for incorporation with other HTML written in the ASP source file. The source files also include blocks of JavaScript that are passed over the Internet for execution by browsers. The JavaScript animates the flyout menus and performs initial screening of user input.
Design of the database tables is a crucial difference between Version 1 and Version 2 of the Orthoptera Species File Online. Version 1 tables are divided by taxonomic level, with separate tables for families, subfamilies, tribes, genera and species. Subtribe, subgenus and subspecies names are merged into fields designed primarily for other purposes. In Version 2, all taxa, regardless of rank, are in the same table. Fifteen ranks are currently recognized from order down to subspecies, but additional ranks could be added in less than a minute. The row of data for each taxon includes a number to identify the rank and another number to identify the taxon at the next rank above it. In Version 1, citations are placed in a single field. In Version 2, the literature is divided into separate tables for authors, publications, references and citation data. If 100 taxa include citations to the same reference, that requires 100 copies of the reference in Version 1, but only one copy in Version 2. In Version 2, only the specific page citation and note require 100 separate copies. Version 2 includes the 21 tables as depicted in Fig. 1 plus four additional internal tables used for administrative and development purposes. The lines running between tables in the figure show the links between tables. The small key images point to unique identifying numbers. Much more detail about the design may be seen over the Internet by selecting “Database Design” from the pull down menu associated with “Home” at the top of each page.
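The single-table design can be illustrated with a minimal Python sketch. It is not the SQL Server implementation. The rank numbers, the placeholder taxon names, and the `lineage` helper are assumptions made for illustration; only the idea of one row per taxon with a rank number and a link to the taxon above it, and of one stored copy of a shared reference, comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical rank numbers; the real database recognizes fifteen ranks.
RANKS = {10: "family", 20: "subfamily", 30: "tribe", 40: "genus", 50: "species"}

@dataclass
class Taxon:
    id: int
    name: str
    rank_id: int              # number identifying the rank
    above_id: Optional[int]   # ID of the taxon at the next rank above (None at the top)

# Placeholder names, not real taxa. Every rank lives in the same table.
taxa = {
    1: Taxon(1, "Family A", 10, None),
    2: Taxon(2, "Subfamily B", 20, 1),
    3: Taxon(3, "Genus C", 40, 2),      # no tribe: ranks may be skipped
    4: Taxon(4, "species d", 50, 3),
}

def lineage(taxa, taxon_id):
    """Follow above_id links upward; return (rank, name) pairs from the top down."""
    chain = []
    current = taxa.get(taxon_id)
    while current is not None:
        chain.append((RANKS[current.rank_id], current.name))
        current = taxa.get(current.above_id) if current.above_id is not None else None
    return list(reversed(chain))

# Literature is normalized: one reference row, many small citation rows.
references = {1: "Placeholder reference"}
citations = [{"taxon_id": t, "reference_id": 1, "page": "12"} for t in taxa]

print(lineage(taxa, 4))
```

Adding a rank in this sketch means adding one entry to `RANKS`; no new table is needed, which mirrors the reason ranks can be added so quickly in Version 2.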
The value of the complex data structure shown in Fig. 1 can be illustrated by an example. The central table is tblTaxon in the middle of the figure. If we start with a genus in this table, we can find the corresponding row in tblTypeSpecies where GenusID is equal to ID in tblTaxon for the genus. In the same row in tblTypeSpecies, we find the value of TypeSpeciesID. We can find the row in tblTaxon where ID is equal to that value for TypeSpeciesID. The value of AboveID for the species in tblTaxon enables us to find the genus that contains the species. (If the genus contains subgenera or other intermediate ranks, the value of AboveID must be used again to move up until the appropriate rank is reached.) This process, executed automatically by the program, revealed eleven cases in Tettigonioidea where the type species was not contained within the genus. Literature searches were required to reconcile the discrepancies.
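The check just described can be sketched in Python as follows. The table and column names (tblTaxon, tblTypeSpecies, ID, AboveID, GenusID, TypeSpeciesID) follow Fig. 1; the rank numbers, the sample rows, and the function names are hypothetical, and the actual program is written as stored procedures rather than in Python.

```python
GENUS, SUBGENUS, SPECIES = 40, 45, 50  # hypothetical rank numbers

# tblTaxon as ID -> (name, rank, AboveID); placeholder rows only
tbl_taxon = {
    1: ("Genus A", GENUS, None),
    2: ("Genus B", GENUS, None),
    3: ("Subgenus A1", SUBGENUS, 1),
    4: ("species a", SPECIES, 3),   # in Genus A by way of a subgenus
    5: ("species b", SPECIES, 1),   # directly in Genus A
}

tbl_type_species = [
    {"GenusID": 1, "TypeSpeciesID": 4},  # consistent
    {"GenusID": 2, "TypeSpeciesID": 5},  # discrepancy: species b sits in Genus A
]

def containing_genus(taxon_id):
    """Climb AboveID links until a taxon at genus rank is reached."""
    current = tbl_taxon[taxon_id][2]
    while current is not None and tbl_taxon[current][1] != GENUS:
        current = tbl_taxon[current][2]
    return current

def misplaced_type_species():
    """Return the GenusID of every genus whose type species lies outside it."""
    return [row["GenusID"] for row in tbl_type_species
            if containing_genus(row["TypeSpeciesID"]) != row["GenusID"]]

print(misplaced_type_species())
```

Each genus returned by `misplaced_type_species` would then need a literature search of the kind mentioned above to decide which record is wrong.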
Additional ranks in hierarchy.— The prior section mentioned the handling of taxa at all ranks in a single table. This allows presentation and editing of intermediate ranks in the same manner as for the more commonly used ranks. Fig. 2 shows a sample display of the hierarchy for Ceuthophilus (Geotettix) with the little used ranks of species series and species group. The improved table structure also means less complexity for the user to learn and less programming work for developers.
Flexibility in information displayed.— Version 2 allows great flexibility in what information the user chooses to see. My personal preference is to see multiple taxa displayed in outline form on the same page. The hierarchy display shows the three levels above the current taxon and one, two or three levels below the current taxon. Fig. 3 is the same as Fig. 2 except that the user moved the mouse cursor over the option bar at the left in Fig. 2. This caused the menu at the left in Fig. 3 to display the choices. Green dots at the left show selections currently in effect, and red dots show selections that are turned off. The user can specify any combination of displaying or hiding synonyms, citations, images, sound recordings, type genus data, type species data, and type specimen data. Some items have abbreviated and long form versions. For instance, citations usually include abbreviated journal names and no titles for journal articles. The user who has specified long form displays will see the full journal name and the article title (in those cases where these data are in the database).
Fig. 4 shows an example of the display for a specific taxon. Whereas the default hierarchy display does not provide the optional information, the default taxon display shows nearly all information. The menu shown in Fig. 5 provides choices similar to those shown in Fig. 3. If the user selects “Long form display,” the display in Fig. 5 will be enhanced to show the article titles, the full journal names, and “U.S. National Museum of Natural History, Washington DC, USA” instead of only “Washington.”
Search choices.— By clicking on “Search” at the top of the screen, the user may initiate a variety of searches. A menu can be pulled out from the left side of the screen to select the type of search. It is possible to show references by a particular author or in a particular publication. Users may search for a word or phrase contained in type locality data. When searching for a taxon, the search may be restricted to a specific rank or to any rank. The search may be confined within any specific taxon, such as within a particular tribe. Synonyms may be included or excluded from the search. User specifications are remembered for subsequent searches.
Conformity to International Rules.— The database provides improved conformity with the International Rules of Zoological Nomenclature. Version 1 generally shows the author of a family group name as the author who first used the name in its current correct form. However, the International Rules specify that the author and date for priority must be based on the first use of any family group name based on the same genus. Version 2 is designed to accommodate the correct treatment. Another difference relates to the formatting when the actual publication date differs from the stated date. If the stated date on a publication is 1875, but it was not actually issued until 1876, Version 1 displays “1875.” Version 2 displays “1876” following the format recommended by the International Rules.
Version 2 contains more detailed information needed for nomenclatural decisions. For example, a species name may not be specified just as a homonym. It must be either a primary homonym or a secondary homonym. The difference is important because a secondary homonym becomes valid if it is later moved to a different genus, whereas a primary homonym does not. Misspellings and unjustified emendations are distinguished from other types of synonyms. The name of the author of a species described in a misspelled genus is not placed in parentheses. For each species in Version 2, there is a cross reference to the genus in which it was originally described.
Editing procedures.— Considerable time and effort have been spent to facilitate the editing process. Editing choices are shown in Fig. 6. Notice the added menu item “Edit” at the top of the page. Only users who have logged in and who have been granted editing privileges are able to see this display. Editing choices are provided in a manner that fits the taxonomic context, and the list of choices varies according to the rank and identity of the current taxon.
In many cases when a specific change has been entered, a variety of related changes occur without the need for further input by the editor. For example, suppose someone has published a revision that makes Taeniopoda a subgenus of Romalea. In order to change the status of Taeniopoda, the editor (the person, not a program) would select Taeniopoda as the current taxon and then select “Place under a different parent taxon” from the editing choices shown in Fig. 6. This would yield the display shown in Fig. 7. The editor must then select “genus” from the choices for rank, enter “Romalea” in the blank, and click on “Enter.” At this point the program takes over and goes through the following steps:
Verify that Romalea is a valid taxon at the rank of genus.
Determine that Romalea does not have any subgenera.
Create subgenus Romalea.
Transfer the literature citations for genus Romalea to subgenus Romalea.
Transfer the type species information for genus Romalea to subgenus Romalea.
Transfer all species in genus Romalea to subgenus Romalea.
Change genus Taeniopoda to subgenus Taeniopoda.
Place subgenus Taeniopoda under genus Romalea. (All species in Taeniopoda retain their position subordinate to Taeniopoda, but in its new position.)
Determine the original genus in which each species of Taeniopoda was described.
Based on the identity of the original genus, set Parens in tblTaxon for each species, to indicate whether or not parentheses should be used around the author and date of the species.
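The steps listed above can be sketched in Python. This is an illustrative outline under stated assumptions, not the stored procedure itself: the dictionary layout, the rank numbers, the species placeholders, and the names `new_taxon` and `place_as_subgenus` are all hypothetical. The `parens` value stands in for Parens in tblTaxon, and it is set on the assumption that parentheses apply whenever the current genus differs from the original genus.

```python
GENUS, SUBGENUS, SPECIES = 40, 45, 50  # hypothetical rank numbers

def new_taxon(taxa, name, rank, above_id, original_genus_id=None):
    """Add a row to the taxon table and return its new ID."""
    taxon_id = max(taxa, default=0) + 1
    taxa[taxon_id] = {"name": name, "rank": rank, "above_id": above_id,
                      "original_genus_id": original_genus_id, "parens": False}
    return taxon_id

def place_as_subgenus(taxa, citations, type_species, child_id, parent_name):
    # Verify that the parent is a taxon at the rank of genus.
    parent_id = next((i for i, t in taxa.items()
                      if t["name"] == parent_name and t["rank"] == GENUS), None)
    if parent_id is None:
        raise ValueError(f"{parent_name} is not a genus in the table")
    below = [i for i, t in taxa.items() if t["above_id"] == parent_id]
    # If the parent has no subgenera, create the subgenus of the same name
    # and move citations, type species data, and species into it.
    if not any(taxa[i]["rank"] == SUBGENUS for i in below):
        nominate_id = new_taxon(taxa, parent_name, SUBGENUS, parent_id)
        for row in citations:
            if row["taxon_id"] == parent_id:
                row["taxon_id"] = nominate_id
        if parent_id in type_species:
            type_species[nominate_id] = type_species.pop(parent_id)
        for i in below:
            if taxa[i]["rank"] == SPECIES:
                taxa[i]["above_id"] = nominate_id
    # Change the child genus to a subgenus and place it under the parent.
    taxa[child_id]["rank"] = SUBGENUS
    taxa[child_id]["above_id"] = parent_id
    # Set parens for each species of the child from its original genus.
    for t in taxa.values():
        if t["rank"] == SPECIES and t["above_id"] == child_id:
            t["parens"] = t["original_genus_id"] != parent_id

# Usage with placeholder species
taxa = {}
romalea = new_taxon(taxa, "Romalea", GENUS, None)
taeniopoda = new_taxon(taxa, "Taeniopoda", GENUS, None)
r1 = new_taxon(taxa, "species r1", SPECIES, romalea, romalea)
t1 = new_taxon(taxa, "species t1", SPECIES, taeniopoda, taeniopoda)
t2 = new_taxon(taxa, "species t2", SPECIES, taeniopoda, romalea)
citations = [{"taxon_id": romalea, "reference_id": 1, "page": "12"}]
type_species = {romalea: r1, taeniopoda: t1}

place_as_subgenus(taxa, citations, type_species, taeniopoda, "Romalea")
```

The editor supplies only the child taxon and the name of the new parent; every other change follows from the links already stored in the tables.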
If the editor modifies the journal name in a citation for one taxon, the program will ask if this change applies A) to only this one citation, B) to all references to the same paper, or C) to all citations to the old journal name. This avoids the need for the editor to search for all the occurrences of the same error. If an editor reclassifies a genus name from synonym to misspelled, the program will find all species that were initially described under that genus name. If the species is now in the correctly spelled genus name, the program will remove the parentheses from the author and date associated with the species.
A person editing Version 2 is restricted to input that makes sense (except for note fields where any input is accepted). For instance, if the editor is adding a new valid taxon subordinate to a tribe, the available choices for rank of the new taxon are restricted to subtribe and genus. The most common choice (genus) is provided as the default, and the other choice is listed on a pull down menu. When the editor provides an author name, the program will compare the name not only with the list of known authors, but also a considerable list of misspelled author names. For example, the database contains six different versions of Brunner von Wattenwyl. If a name cannot be found, the editor has the choice of adding the name as a new author. When simple typing errors are suppressed in this way, the data become much more reliable. A user who obtains a list of all references by an author need not worry about missing references because an editor entered a different version of the author's name.
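The two kinds of input screening described in this paragraph can be sketched in Python. The misspelled variants below are invented for illustration (the six versions actually stored in the database are not listed here), and the function names and table layout are assumptions rather than the program's own.

```python
# Preferred author names by ID
AUTHORS = {1: "Brunner von Wattenwyl"}

# Hypothetical misspellings, each mapped to the ID of the preferred name
MISSPELLINGS = {
    "brunner v. wattenwyl": 1,
    "brunner von wattenwyll": 1,
}

# Ranks an editor may choose for a new valid taxon under a given parent rank.
# The first entry is the default; only the tribe case is taken from the text.
ALLOWED_CHILD_RANKS = {"tribe": ["genus", "subtribe"]}

def normalize(name):
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())

def resolve_author(name):
    """Return the author ID for a known name or known misspelling, else None."""
    key = normalize(name)
    for author_id, preferred in AUTHORS.items():
        if normalize(preferred) == key:
            return author_id
    return MISSPELLINGS.get(key)

def rank_choices(parent_rank):
    """Return (default rank, other choices) for a new taxon under parent_rank."""
    allowed = ALLOWED_CHILD_RANKS[parent_rank]
    return allowed[0], allowed[1:]

print(resolve_author("Brunner  von Wattenwyl"))
print(rank_choices("tribe"))
```

A result of `None` from `resolve_author` corresponds to the case in which the editor is offered the choice of adding the name as a new author.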
Administrative procedures.— A number of administrative procedures (Fig. 8) are available to those directly involved in developing and maintaining the database. Some of these choices make sense only to a person who understands the internal data structure. They test a number of data relationships to see if they fit the proper taxonomic and nomenclatural restraints. This helps to track down programming errors that permit improper relationships to occur. Also, there have been many cases where authors and publications have been entered multiple times with various spellings. There are procedures to eliminate the duplications. For instance, a publication and list of all its contained references can be placed in one column, and a second publication with its list of contained references can be placed in a second column. If the administrator is satisfied that both are really the same publication, a single click is sufficient to reclassify the offending spelling as a duplicate of the preferred choice. All references and all citations to the offending spelling are automatically corrected.
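The duplicate-elimination step can be sketched in Python as a single repointing operation. The publication names, the `duplicate_of` field, and the function name are hypothetical; the sketch only illustrates why one click can correct every reference attached to the offending spelling.

```python
# Placeholder publications entered twice under different spellings
publications = {
    1: {"name": "Journal of Examples", "duplicate_of": None},
    2: {"name": "Jour. of Exmples", "duplicate_of": None},
}

# Each reference points at the publication that contains it
references = [
    {"id": 10, "publication_id": 1},
    {"id": 11, "publication_id": 2},
    {"id": 12, "publication_id": 2},
]

def merge_publications(publications, references, duplicate_id, preferred_id):
    """Repoint references from the duplicate to the preferred publication.

    Returns the number of references that were corrected.
    """
    if duplicate_id == preferred_id:
        raise ValueError("a publication cannot be a duplicate of itself")
    moved = 0
    for ref in references:
        if ref["publication_id"] == duplicate_id:
            ref["publication_id"] = preferred_id
            moved += 1
    # Keep the offending spelling on record, flagged as a duplicate.
    publications[duplicate_id]["duplicate_of"] = preferred_id
    return moved

moved = merge_publications(publications, references, duplicate_id=2, preferred_id=1)
print(moved)
```

Because citations reach a publication only through its references in this layout, they need no separate correction once the references have been repointed.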
Importing data from an earlier database is tedious and time consuming because of errors in the earlier data and because multiple pieces of information are contained in single fields. Fig. 9 shows a form that has been developed to facilitate the process. “CiteString” near the top shows the data from one such field. The program parses the CiteString into its components. The automatic parsing is correct about ninety percent of the time, in which case the user needs only to click on “Enter, move to next record.” When a quick visual inspection indicates a problem, many buttons are provided to simplify entering the required corrections.
The immediate priority is to import data for the rest of Orthoptera. This is likely to take another year. Once that is completed, the database will be expanded to include other data. In view of the proliferation of other databases that deal with Orthoptera, incorporating links to those databases is the most likely next step. Other possibilities include character data, keys, distribution data, specimen data, ecological data, and common names. Our progress in Orthoptera already provides an excellent example for other taxonomic groups. The Orthopterists' Society can facilitate added development and carry this further.
Invitation to participate.— Although many errors were found and corrected while bringing data into Version 2, many more probably remain. Correcting the remaining errors and keeping the database current as more research is published will be a major undertaking. Please help! I hope others will participate by assuming responsibility for editing groups that are of particular interest to them. Editing can be done on any computer that is connected to the Internet. To obtain access for editing, three steps are necessary.
Enter the database and log in: Place the cursor over “Home” at the top of any page. Click on “Login.” Enter your name and password.
Notify me that you would like access for editing. E-mailing to me at firstname.lastname@example.org is the easiest way to do this.
I will modify a hidden table to indicate that you have permission to edit.
Nearly all of the data have been transferred from files provided by Dan Otte or Piotr Naskrecki. Without their impressive accumulation of data and their full cooperation, I would never have attempted development of the database. Patricia Peek has entered a great deal of data and tracked down the explanations for many discrepancies. Wen Jing Dai provided the features that make the database visually appealing and easier to use. He also assisted in programming other aspects. Jason Weintraub received the first grant from the database endowment fund of The Orthopterists' Society. He performed many of the literature checks needed to reconcile problems in the data received.