John Van Sickle, Charles P. Hawkins, David P. Larsen, Alan T. Herlihy
Journal of the North American Benthological Society 24 (1), 178-191, (1 March 2005) https://doi.org/10.1899/0887-3593(2005)024<0178:ANMFTE>2.0.CO;2
KEYWORDS: predictive model, precision, O/E, RIVPACS, bioassessment
Predictive models such as River InVertebrate Prediction And Classification System (RIVPACS) and AUStralian RIVer Assessment System (AUSRIVAS) model the natural variation across geographic regions in the occurrences of macroinvertebrate taxa in data from streams that are in reference condition, i.e., minimally altered by human-caused stress. The models predict the expected number of these taxa at any stream site, assuming that site also is in reference condition. A ratio of observed (O) to expected (E) numbers of taxa (O/E) that differs significantly from 1.0 indicates that the site is not in reference condition. The standard deviation (SD) of O/E values estimated for a set of reference sites is a measure of predictive-model precision, with a small SD indicating that the model accounts for much of the variability in E that is associated with natural factors such as stream size and elevation. We propose a null model for E that assumes fixed occurrence probabilities for individual taxa across reference sites. The null model explains none of the variability in E caused by natural factors, so the SD of its O/E predictions is the upper limit attainable by any predictive model. We also derive a theoretical lower limit for SD of O/E that is caused only by replicate-sampling variation among predictions from a perfect model. Together, the null-model and replicate-sampling SDs estimate the minimum and maximum precision, respectively, attainable by any predictive model for a given set of reference-site data. A predictive model built from data at 86 reference sites in the Mid-Atlantic Highlands region, USA, had SD = 0.18 for O/E across those sites, while the corresponding null model had SD = 0.20, indicating relatively little gain from the predictive-model effort.
In contrast, a model built from 209 sites in North Carolina, USA, had predictive- and null-model SDs of 0.13 and 0.28, respectively, indicating that the North Carolina predictive model had relatively high gain in precision over the null model. Replicate-sampling SDs of O/E for the Mid-Atlantic and North Carolina data were 0.09 and 0.11, respectively, suggesting that the North Carolina predictive model had little room for further improvement, in contrast to the Mid-Atlantic model. The precisions of null-model estimates were lower than those of predictive models, so null models somewhat underestimated the percentages of 447 and 1773 test assemblages from the Mid-Atlantic region and North Carolina, respectively, that differed significantly from reference conditions. The estimates illustrate how a simple and easily built null model provides a lower bound for the prevalence of impaired streams within a region.
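The null-model calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`fit_null_model`, `oe_scores`), the toy presence/absence matrix, and the `min_prob` cutoff are all assumptions made for illustration. RIVPACS-type models often restrict E to taxa whose occurrence probability exceeds some cutoff, but the abstract does not state one, so the default here keeps every taxon found at any reference site. Each taxon's fixed occurrence probability is taken to be the fraction of reference sites at which it was observed, E is the sum of those probabilities, and O is the number of the retained taxa observed at a site.

```python
import numpy as np


def fit_null_model(ref_presence, min_prob=0.0):
    """Estimate fixed per-taxon occurrence probabilities from reference sites.

    ref_presence: 2-D array (sites x taxa) of 0/1 presence values.
    min_prob: hypothetical cutoff; taxa with probability <= min_prob are dropped.
    Returns (keep, expected): a boolean mask over taxa and the single
    expected richness E shared by every site under the null model.
    """
    ref_presence = np.asarray(ref_presence, dtype=float)
    probs = ref_presence.mean(axis=0)   # fraction of reference sites with each taxon
    keep = probs > min_prob             # taxa that contribute to O and E
    expected = probs[keep].sum()        # E is identical for all sites
    return keep, expected


def oe_scores(presence, keep, expected):
    """Return O/E for each site, counting only the retained taxa in O."""
    presence = np.asarray(presence, dtype=float)
    observed = presence[:, keep].sum(axis=1)
    return observed / expected


# Toy reference data (made up): 4 sites x 4 taxa.
ref = np.array([
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
])

keep, expected = fit_null_model(ref)
ref_oe = oe_scores(ref, keep, expected)
null_sd = ref_oe.std(ddof=1)            # sample SD of O/E across reference sites

print("E =", expected)
print("reference O/E =", np.round(ref_oe, 3))
print("null-model SD =", round(float(null_sd), 3))
```

Because E is the same at every site, the null-model SD reflects all of the among-site variation in observed richness, which is why it serves as the upper limit on SD against which a predictive model can be compared. The same `keep` mask and `expected` value can be passed to `oe_scores` with test-site data to score sites that were not used to fit the model.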