Fish density models are essential tools for fish ecologists and fisheries managers. However, applying these models can be difficult because of high levels of model complexity and the large number of parameters that must be estimated. We designed a simple fish density model and tested whether it could predict fish densities in lotic systems with meaningful levels of accuracy and precision. We built our 6-parameter model on 2 key assumptions: 1) fish population density is a power function of mean body mass (i.e., the self-thinning relationship), and 2) energetic resources are transferred from lower to higher trophic levels at a nearly constant rate (i.e., trophic transfer efficiency). We estimated the self-thinning and trophic transfer efficiency parameters by randomly sampling from values reported in the primary literature. Remaining parameters were net primary production, trophic level, the production∶biomass ratio, and mean body mass. We used empirical parameter estimates and fish density estimates to test the model in 4 warm-water and 4 cold-water systems. Model accuracy was high in 3 test systems (deviations between the model-predicted densities and empirically observed densities <30%), moderate in 3 test systems (deviations 75–111%), and low in 2 systems (deviations >150%). Model precision was low (e.g., the interquartile ranges of model-predicted densities encompassed ~1 order of magnitude), but appropriate for predicting fish densities at coarse spatial and temporal scales. We concluded that the model is a potentially useful and efficient tool, and we provide recommendations for applying the model. In particular, we emphasize that the model is scalable, and therefore, well-suited for estimating fish densities at large spatial scales. We also point out that the model is a carrying capacity model, and therefore, can be used to predict fish densities in undisturbed systems or to approximate reference conditions.
Freshwater fishes are valuable for many reasons. They regulate ecosystem structure and function through the processes of selective predation (Carpenter et al. 1985), nutrient cycling (Schindler 1992), and bioturbation (Gelwick et al. 1997). They are key indicators of ecosystem health and environmental disturbance (Karr 1981). Freshwater fishes are central to the spiritual identity of many native cultures (Swezey and Heizer 1977). They provide a primary or supplementary source of protein for many people, particularly in economically disadvantaged sectors (Macinko and Schumann 2007). Moreover, they are the driving force behind an enormous recreational industry. For example, in 2006, some 25 million US anglers participated in freshwater recreational fisheries, which generated ~31.2 billion USD in retail sales and 11.5 billion USD in tax revenues (SA 2007).
Because freshwater fishes are valuable, fisheries scientists must have reliable tools to estimate or predict their abundances. Field-based surveying techniques, such as mark–recapture or multiple-sample depletion, have long been the cornerstone of efforts to estimate fish abundance or density (reviewed in Schwarz and Seber 1999). When sample sizes are sufficiently large, these observational methods can estimate density with a high degree of accuracy and precision (Peterman 1990, Gibbs et al. 1998). However, field surveys are time and labor intensive (Al-Chokhachy et al. 2009), and cannot always be extrapolated beyond individual stream reaches or sampling events (Dauwalter et al. 2009). Thus, reliance on observational methods is inefficient when density estimates are needed at many sites or at regional scales (Lewis et al. 1996).
Models also can be used to estimate fish densities. Habitat and bioenergetics models are 2 of the most common types of fish density models. Habitat models use statistical correlations between fish abundance and physicochemical variables, which are usually documented through field-based sampling programs, to predict fish densities at independent (i.e., unsampled) locations (e.g., Fausch et al. 1988, Toepfer et al. 2000). Bioenergetics models use mass-balance equations to predict fish growth and survival at the individual level (Brandt and Hartman 1993). These models partition consumed energy into separate production, respiration, and excretion terms (see Gerking 1978). Density can then be estimated and tracked through time by summarizing at the cohort or population level.
Unfortunately, the utility of habitat and bioenergetics models is often constrained by high levels of model complexity (Ney 1993, Pace 2001). These models require large quantities of data for parameter estimation and calibration. For example, Creque et al. (2005) used habitat models to predict sport fish densities throughout the Michigan lower peninsula, but they needed an exceptionally large database of observed fish densities (the Michigan Rivers Inventory; Seelbach and Wiley 1997) to do so. The large numbers of parameters that are included in bioenergetics models also can make them difficult to use. For instance, the well-known Wisconsin model requires users to specify from 15 to >30 input parameters (Hanson et al. 1997). This parameter proliferation makes model calibration difficult, can inflate model uncertainty (Ney 1993), and can undermine user confidence in model predictions (Schnute and Richards 2001).
Given these concerns, some authors have suggested placing a greater emphasis on comparatively simple models (Pace 2001, Adkison 2009). The concept of a simple fish density model is appealing, but what exactly would it look like, and what predictive capabilities could it offer? We draw from our own modeling experiences and research interests to offer some specific recommendations. We suggest that a simple model should include as few parameters as possible to ensure that it is truly simple to use. We also recommend that the model be of a general structural form, so that it can be applied in many different types of systems. To ensure that the model is reliable, we would expect it to predict fish densities with levels of accuracy and precision that are comparable to more established methods. Last, we propose that the model should be scalable. That is, it should be capable of predicting fish densities at moderate to large spatial scales from data that were collected at relatively small scales (Wu and Li 2006).
A class of models that satisfies many or all of these expectations has emerged recently. These models use robust statistical patterns, such as the allometric relationship between body size and metabolism, to scale from individuals to higher levels of organization (e.g., populations or communities; Ernest et al. 2003). Hence, we refer to them as macroecological models (sensu Brown 1995). Macroecological models are used primarily in upscaling contexts (to predict large-scale phenomena from small-scale data; Marquet et al. 2005), so they often lack the resolution to address tactical management questions (e.g., determining whether a riparian restoration project is likely to enhance recruitment or whether the age-class distribution within a fishery is likely to change). These issues are best left to more traditional habitat and bioenergetics models. However, macroecological models are remarkably efficient tools for studying and predicting ecosystem structure and function at relatively coarse scales (Marquet et al. 2005, Brown et al. 2007). For example, Jennings and Blanchard (2004) used the self-thinning relationship (see next paragraph) and an empirical estimate of primary production to predict the total biomass, or carrying capacity, of large fishes within the North Sea. Their basic method was then upscaled by using remotely sensed primary productivity data to predict fish biomass at the global scale (Jennings et al. 2008).
We tested the idea that a simple model could be used to predict fish densities in a variety of streams and rivers. Two macroecological assumptions formed the core of our model. First, we assumed that population density (N) and the mean body mass (M) of individuals within a population are inversely related ($N \propto 1/M$; White et al. 2007). Known as self-thinning, this inverse relationship is expected for any resource-limited population because smaller individuals will, on average, consume fewer per capita resources than larger individuals, and therefore, can attain higher densities (Fréchette and Lefaivre 1995). The self-thinning relationship is used frequently in models and is expressed as a power function of the form $N = aM^{-b}$, where log(a) and −b are, respectively, the intercept and slope of an N vs M regression (when calculated for log-transformed data; Marquet et al. 2005, Brown et al. 2007, White et al. 2007). The exponent b has been shown on both empirical and theoretical grounds to be ~0.75 (Carbone and Gittleman 2002, Savage et al. 2004), although moderate deviations from this value are common (e.g., b ≈ 1.0; Bohlin et al. 1994, Rincón and Lobón-Cerviá 2002).
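As a minimal sketch of the self-thinning relationship described above, the following Python snippet evaluates $N = aM^{-b}$ and recovers b as the negative slope of a log–log regression. The coefficient a = 1000 and the body masses are hypothetical values chosen only for illustration; they are not taken from any of the test systems.

```python
import math

def self_thinning_density(mass_g, a, b):
    """Density predicted by the power function N = a * M**(-b)."""
    return a * mass_g ** (-b)

def fit_self_thinning(masses, densities):
    """Least-squares fit of log10(N) on log10(M); returns (a, b)."""
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(n) for n in densities]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # The regression slope is -b and the intercept is log10(a).
    return 10 ** intercept, -slope

# Hypothetical illustration values (not data from the test systems).
masses = [10.0, 50.0, 100.0, 500.0]
densities = [self_thinning_density(m, a=1000.0, b=0.75) for m in masses]
a_hat, b_hat = fit_self_thinning(masses, densities)
print(round(a_hat, 3), round(b_hat, 3))
```

Because the synthetic densities lie exactly on the power function, the fit returns the a and b that generated them; with field data the same regression would give an empirical estimate of b.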
Second, we assumed that fishes at a given trophic level will assimilate only a fraction of the energetic resources available to them (i.e., 1 trophic level below them), and that this fraction is nearly constant. Commonly referred to as the trophic transfer efficiency (ε), this fraction is often used in foodweb studies to predict or back-calculate production at different trophic levels (Schulz et al. 2004). ε is defined as the fraction of production at a given trophic level that is converted to production at the next highest trophic level (Lindeman 1942, Jennings and Blanchard 2004). The ε phenomenon can be explained with the following chain of logic: 1) predators are larger than their prey, 2) so predators have higher metabolic (respiratory) demands, and 3) less energy will be available for predator production (Lindeman 1942, Brown et al. 2007). Considerable debate addresses the question of whether ε demonstrates a consistent central tendency (Strayer 1991, Barnes et al. 2010). Nevertheless, ε ≈ 0.1 has been a robust approximation in many aquatic systems (Slobodkin 1960, Pauly and Christensen 1995).
We combined these 2 assumptions with literature data on primary productivity, fish assemblage structure, and fish feeding behavior to model fish densities in 4 warm-water and 4 cold-water systems. We predicted the density of the top predator species in each test system and assessed the accuracy and precision of the model by comparing the predicted densities with observed fish densities. We conclude by discussing some strengths and limitations of the model and by showing that the model can be applied in a number of novel contexts, such as regional-scale estimation of fish densities or predicting reference conditions.
Methods
Basic model structure and assumptions
We began with McGill's (2008) model of North American bird abundances, which used the self-thinning relationship to predict abundance, given remotely-sensed primary production data and literature descriptions of species' body sizes and feeding behaviors. McGill (2008) modeled the abundance of individual species ($N_i$) as
where ε is trophic transfer efficiency, $T_i$ is the trophic level of the ith species, $C_i$ is the fraction of available resources consumed by species i, NPP is net primary production, $M_i$ is the mean body mass of species i, b is the self-thinning exponent (assumed by McGill 2008 to be 0.75), and z is an index of energetic equivalence (z = 1 when all species consume energetic resources at an equal rate). This simple model was a good starting point for our fish model because it assumed that resource availability (represented by the numerator in eq. 1) and density-dependent regulation (represented by the self-thinning term in the denominator of eq. 1) are primary determinants of population abundance. However, we made several modifications to McGill's (2008) procedure. First, we replaced T with T − 1. This correction more accurately reflects the energetic resources remaining at a given trophic level, when primary production occurs at T = 1 (Lindeman 1942). For instance, if NPP = 1000 g C ha⁻¹ y⁻¹ and ε = 0.1, then 100 g C ha⁻¹ y⁻¹ will remain at T = 2 (i.e., $1000 \times 0.1^{2-1}$), 10 g C ha⁻¹ y⁻¹ will remain at T = 3 (i.e., $1000 \times 0.1^{3-1}$), and so on.
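The T − 1 correction can be illustrated with a short Python sketch that reproduces the worked example (NPP = 1000 g C ha⁻¹ y⁻¹, ε = 0.1). The function name is hypothetical and is used here only for illustration.

```python
def production_at_trophic_level(npp, epsilon, trophic_level):
    """Production remaining at trophic level T when primary production is at T = 1."""
    if trophic_level < 1:
        raise ValueError("trophic_level must be >= 1")
    return npp * epsilon ** (trophic_level - 1)

# Worked example from the text: NPP = 1000 g C/ha/y and epsilon = 0.1.
for t in (1, 2, 3, 4):
    print(t, production_at_trophic_level(1000.0, 0.1, t))
```

With the uncorrected exponent T, only 10 g C ha⁻¹ y⁻¹ would be assigned to T = 2, which is one transfer step too many.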
Second, we focused solely on the densities of top predator species and did not attempt to partition resources among species within a shared trophic level. By doing so, we eliminated the need to specify C and z values because both were assumed to be 1. This assumption will not significantly influence the model predictions when the highest trophic level is dominated by a single apex predator. However, it is likely to bias the model if multiple predator species are present in a given trophic level, and they are not of similar size or do not partition resources at a nearly equal rate. Therefore, we suggest that our model is appropriate for 1 of 2 purposes: 1) predicting population densities in systems where a top predator has been clearly identified, or 2) predicting composite densities of ≥2 co-occurring predators with similar body masses and feeding behaviors when both are abundant but neither is consistently dominant.
We also noted that McGill's (2008) model converted NPP to N without first accounting for units of time (the N values were standing stocks, but the NPP estimates were annual fluxes). Thus, McGill (2008) assumed the ratio between production and standing stock biomass (P/B) was effectively 1. In some instances, P/B for freshwater fishes is ~1 (Chapman 1978, Huryn 1996). However, Randall et al. (1995) reviewed fish production data from 51 lotic systems and found the average P/B = 1.63. Therefore, we included this value of P/B as an explicit term in the model. In this way, $\varepsilon^{T-1}$ was used to estimate fish production as a fraction of NPP, P/B was used to estimate fish biomass from fish production, and $M^{-b}$ was used to predict fish density from standing stock biomass and mean body mass.
We modeled the densities of top predator species as
where NPPww is net primary production in grams wet mass (see below), PB is a constant P/B (assumed to be 1.63; see Randall et al. 1995), and all remaining notation is as shown in eq. 1. The subscript i was removed (from eq. 1) because we focused solely on top predators and did not attempt to predict the densities of multiple species in a given system. We estimated NPP as the sum of autochthonous (in-stream production) and allochthonous (terrestrial leaf litter) resources with data from the primary literature (references in Table 1). To account for the differing nutritional quality of these resources, we used the mean assimilation efficiencies of Pandian and Marian (1986). We multiplied autochthonous production by 0.47 and allochthonous production by 0.15 and summed the results to estimate total NPP. All NPP data were in g C ha⁻¹ y⁻¹ and were standardized to 1 ha of stream-channel surface area. When necessary, we used an atomic mass conversion to obtain g C from g O2 (C = O2 × 0.375; see Lamberti and Steinman 1997). We converted NPP estimates from g C to g wet mass (NPPww) with a conversion factor of 10 (1 g C = 10 g wet mass of consumer tissue; Waters 1977).
Table 1
Fish and net primary production (NPP) data used to parameterize the fish density model. Mean body masses (M) are in g, and all NPP data (originally in g C ha⁻¹ y⁻¹) have been converted to g wet mass ha⁻¹ y⁻¹ (NPPww). When necessary, mean daily production estimates were converted to annual values assuming a 7-mo growth period or 210 d of primary growth. T = trophic level.
Whenever possible, we took M values directly from the primary literature (references in Table 1). When average length data were provided in lieu of direct mass measurements, we used published length–mass regressions (references in Table 1) to predict M from mean length. T values were inferred from species feeding descriptions in regional atlases (Behnke 1992, Jenkins and Burkhead 1994). We assumed T = 3 for primarily insectivorous fishes (trout) and T = 4 for piscivores (bass and pikeminnow).
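The parameterization steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes, from the verbal description of the three model terms, that eq. 2 takes the form $N = \varepsilon^{T-1} \, \mathrm{NPP_{ww}} / (PB \cdot M^{b})$. The function names and the example inputs are hypothetical; only the conversion constants (0.47, 0.15, 0.375, 10, 210 d, and P/B = 1.63) come from the text.

```python
O2_TO_C = 0.375        # g C per g O2 (atomic mass conversion)
C_TO_WET_MASS = 10.0   # g wet mass per g C (Waters 1977)
AUTO_ASSIM = 0.47      # assimilation efficiency, autochthonous production
ALLO_ASSIM = 0.15      # assimilation efficiency, allochthonous production
GROWTH_DAYS = 210      # days of primary growth used to annualize daily rates
PB_RATIO = 1.63        # production:biomass ratio (Randall et al. 1995)

def daily_o2_to_annual_c(g_o2_per_ha_per_day):
    """Convert a mean daily rate in g O2/ha/d to an annual rate in g C/ha/y."""
    return g_o2_per_ha_per_day * O2_TO_C * GROWTH_DAYS

def npp_wet_mass(autochthonous_gc, allochthonous_gc):
    """Assimilation-weighted NPP (g C/ha/y inputs), returned in g wet mass/ha/y."""
    npp_gc = AUTO_ASSIM * autochthonous_gc + ALLO_ASSIM * allochthonous_gc
    return npp_gc * C_TO_WET_MASS

def predict_density(npp_ww, epsilon, trophic_level, mass_g, b, pb=PB_RATIO):
    """Assumed form of eq. 2: N = eps**(T - 1) * NPPww / (PB * M**b)."""
    production = npp_ww * epsilon ** (trophic_level - 1)  # fish production
    biomass = production / pb                             # standing stock biomass
    return biomass * mass_g ** (-b)                       # density (fish/ha)

# Hypothetical inputs, for illustration only.
npp_ww = npp_wet_mass(autochthonous_gc=1.0e6, allochthonous_gc=2.0e6)
print(predict_density(npp_ww, epsilon=0.10, trophic_level=4, mass_g=85.0, b=0.86))
```

Keeping the three terms as separate steps (production, biomass, density) mirrors the order in which the text introduces them and makes it easy to substitute a different P/B value.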
Estimating b and ε
Direct estimates of NPP and M were available in each of the test systems, but b and ε estimates were not. Therefore, a Monte Carlo (MC) procedure was used to sample from a range of probable b and ε values. To do so, we first compiled empirical, baseline b and ε distributions from the primary literature (available from DJM upon request). We then drew 1000 random samples (with replacement) from each of the baseline b and ε distributions and used these samples to run 1000 MC simulations in each test system. In each simulation, we assumed that b and ε were independent.
We obtained b estimates from 59 freshwater systems (Egglishaw and Shackley 1977, Elliott 1993, Grant 1993, Bohlin et al. 1994, Cyr et al. 1997, Dunham and Vinyard 1997, Grant et al. 1998, Steingrímsson and Grant 1999, deBruyn et al. 2002, Knouft 2002, Rincón and Lobón-Cerviá 2002, Cohen et al. 2003, Keeley 2003). These data included b values for individual species and multispecies assemblages and represented both warm-water (e.g., bass in the Saint Lawrence River, Québec) and cold-water (e.g., trout in western US streams) systems. The median value and coefficient of variation (CV) for the resulting b distribution were 0.86 and 0.25, respectively (Fig. 1A).
To estimate ε, we used the distribution shown in fig. 2 of Pauly and Christensen (1995). This distribution was based on estimated production rates in 48 aquatic food webs, where individual ε values were calculated for each transition between adjacent trophic levels (T = 1 vs T = 2, T = 2 vs T = 3, etc.; total n = 140). The median ε and CV were 0.10 and 0.53, respectively (Fig. 1B). This distribution encompassed most of the ε values that have been reported by other authors (e.g., Lindeman 1942, Strayer 1991, Barnes et al. 2010).
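The Monte Carlo procedure can be sketched as follows. The baseline lists below are hypothetical placeholders, not the compiled literature distributions, and the density function assumes the form $N = \varepsilon^{T-1} \, \mathrm{NPP_{ww}} / (PB \cdot M^{b})$ inferred from the verbal model description. All names and input values are illustrative.

```python
import random
import statistics

# Hypothetical placeholder distributions (NOT the compiled literature values).
BASELINE_B = [0.55, 0.70, 0.75, 0.86, 0.90, 1.00, 1.15]
BASELINE_EPS = [0.03, 0.06, 0.08, 0.10, 0.12, 0.15, 0.20]

def predict_density(npp_ww, epsilon, trophic_level, mass_g, b, pb=1.63):
    """Assumed model form: N = eps**(T - 1) * NPPww / (PB * M**b)."""
    return npp_ww * epsilon ** (trophic_level - 1) / (pb * mass_g ** b)

def run_monte_carlo(npp_ww, trophic_level, mass_g, b_values, eps_values,
                    n_sims=1000, seed=0):
    """Draw b and epsilon independently, with replacement, for each simulation."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        b = rng.choice(b_values)
        eps = rng.choice(eps_values)
        density = predict_density(npp_ww, eps, trophic_level, mass_g, b)
        sims.append((b, eps, density))
    return sims

# Hypothetical system: NPPww = 5e6 g wet mass/ha/y, T = 4, M = 85 g.
sims = run_monte_carlo(5.0e6, 4, 85.0, BASELINE_B, BASELINE_EPS)
densities = [d for _, _, d in sims]
q1, median, q3 = statistics.quantiles(densities, n=4)
print(f"median = {median:.1f} fish/ha, IQR = {q1:.1f} to {q3:.1f}")
```

Each tuple keeps the sampled b and ε alongside the predicted density, so the parameter values behind any given prediction can be examined afterward.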
Testing the model
To test the fish abundance model, we searched the literature for observed (i.e., field-measured) fish and NPP data that satisfied 5 criteria. First, we focused on systems in which observed estimates of fish density and NPP were available from the same stream or river or from similar habitats within a relatively homogeneous environment (e.g., forested headwater streams in the Cascade Mountains of western Oregon). Second, we limited our selection to studies that used true population estimation techniques (e.g., mark–recapture or multiple-sample depletion methods; Schwarz and Seber 1999). Third, we selected studies that identified a single top predator or a composite of functionally similar predators with comparable body masses. Fourth, we used only population estimates that included age-structured data because juvenile fish densities often exceed adult densities by a large margin (Neves 1981) and many species exhibit ontogenetic shifts in feeding behavior (i.e., juveniles occupy different trophic levels than adults; Matthews 1998, Mittelbach and Persson 1998). Therefore, we treated juvenile (age-0) and adult (age-1+) fishes as distinct species in our model. Fifth, we did not use data from highly exploited or degraded systems because they would tend to have artificially reduced population densities, nor did we use data from systems that had been stocked recently.
Eight systems satisfied each of the criteria while providing a mixed sample of warm-water and cold-water fishes (Table 1). Warm-water systems included 3 smallmouth bass (Micropterus dolomieu) populations (Speed River, Ontario; eastern Oklahoma streams; Rappahannock River, Virginia), and the Colorado pikeminnow (Ptychocheilus lucius) population of the lower Green River, Utah. Cold-water systems included trout populations in western Oregon streams (rainbow trout, Oncorhynchus mykiss, and cutthroat trout, Oncorhynchus clarkii), eastern Idaho streams (Yellowstone cutthroat trout, O. clarkii bouvieri), Spring Creek, Pennsylvania (brown trout, Salmo trutta), and the White Mountains of northern New Hampshire (brook trout, Salvelinus fontinalis).
Predicting the densities of relatively large fishes
The test data sets did discriminate between juvenile and adult fishes, but the observed mean adult body masses (M) were conspicuously low in most of the test systems. The observed M values for smallmouth bass were all <90 g, and the observed M values for trout were <60 g in 3 of the 4 cold-water systems (Table 1). These low M values reflected the fact that fish abundance and body mass are often linked by an exponential decay function (the Allen curve; Chapman 1978). Mean adult body mass is low in most fish populations, even when juveniles are excluded, because smaller, younger adults (mostly age 1) tend to be much more abundant than larger, older (age 2+) fishes (Mathews 1971, Elliott 1994). Using the observed, albeit low M values was prudent for model testing because the original authors reported a single mean body mass for all adult fishes (i.e., we had no way to calculate the mean body mass of a subpopulation of larger fishes, given the reported data). Substituting larger M values into eq. 2 would have biased the model toward lower density predictions and probably would have impaired fits between predicted and observed densities.
However, the ability to predict the densities of particular subpopulations, such as large, harvestable fishes, could be of value to fisheries managers. Therefore, we used the following procedure to estimate M for relatively large fishes or subpopulations within each test system. First, we used age-specific growth rates from Vanicek and Kramer (1969), Carlander (1969, 1977), and Osmundson et al. (1997) to calculate average annual cohort body masses ($M_j$) for each species of interest. We then calculated the average body mass of each species as
where $a_{max}$ is a species' maximum age (y), $N_j$ and $M_j$ are the density and mean body mass of the jth cohort, respectively, and . Age at maturity and maximum age were determined for each species with data from Vanicek and Kramer (1969), Carlander (1969, 1977) and Osmundson et al. (1997), whereas b was sampled randomly from the distribution shown in Fig. 1A. M values predicted by eq. 3 were then entered into eq. 2 to predict the densities of relatively large fishes. We did this procedure 1000 times in each test system. Each of the 1000 simulations was done independently (i.e., b and ε were randomly selected in each simulation); but in each simulation, the same b value was used in both equations 2 and 3. Importantly, we included only age-2+ cohorts in our simulations. However, eq. 3 can be used to estimate M for any discrete subpopulation of interest (e.g., an annual cohort or range of cohorts).
Results
Basic model predictions
In general, model accuracy and precision were highly variable and tended to differ between warm-water and cold-water systems. Model accuracy was very high in the Speed River, where the median predicted density of smallmouth bass was within 10% of the observed density (Fig. 2A). However, model precision was low in the Speed River. For example, the interquartile range (IQR) of model-predicted densities encompassed 1.6 orders of magnitude. Model accuracy was lower in eastern Oklahoma streams and the Rappahannock River, where the median predicted smallmouth bass densities exceeded the observed densities by 104% and 111%, respectively. Model precision also was low in these 2 systems, where the IQRs of the predicted densities encompassed 1.0 to 1.2 orders of magnitude. However, low precision also was characteristic of the observed densities in these systems, where the reported 95% confidence intervals (CI) encompassed 0.7 to 1.0 orders of magnitude and largely overlapped with the model-predicted IQRs.
Model predictions in the Green River were less accurate than in the other warm-water systems. The median predicted density of Colorado pikeminnow exceeded the observed density by 194% (Fig. 2A). Model precision also was low, but comparable to the other warm-water systems. For instance, the IQR of model-predicted densities encompassed 1.4 orders of magnitude. However, model precision seemed particularly low when compared with the narrow 95% CI of the observed density estimate.
Model accuracy generally was higher in cold-water systems. In eastern Idaho streams and Spring Creek, the median predicted trout densities were within 27% and 19% of the observed densities, respectively (Fig. 2B), and the 95% CIs of the observed densities overlapped with the median predicted densities. Model accuracy was lower in White Mountains streams, where the median predicted trout density exceeded the observed density by 75%. However, the reported 95% CI for the observed density did overlap with the median predicted density in White Mountains streams. Model accuracy was lowest in western Oregon streams, where the median predicted trout density exceeded the observed density by 248%. Western Oregon streams were the only system in which the observed fish density did not occur within the IQR of the model-predicted densities. (A 95% CI was not provided for the western Oregon trout data.) Model precision also was low in each of the cold-water systems, where the IQRs of model-predicted densities encompassed 0.6 to 1.0 orders of magnitude. However, precision was higher in cold-water than in warm-water systems (Fig. 2A, B).
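The accuracy and precision summaries reported above can be expressed with two small helper functions. This sketch assumes that accuracy is the percentage deviation of the median predicted density from the observed density, and that precision in "orders of magnitude" is the base-10 log ratio of the upper to the lower quartile; both definitions and both function names are assumptions made for illustration.

```python
import math

def percent_deviation(predicted, observed):
    """Signed deviation of the predicted from the observed density, in percent."""
    return 100.0 * (predicted - observed) / observed

def iqr_orders_of_magnitude(q1, q3):
    """Width of the interquartile range on a log10 scale."""
    return math.log10(q3 / q1)

# Hypothetical densities (fish/ha), for illustration only.
print(percent_deviation(predicted=348.0, observed=100.0))
print(iqr_orders_of_magnitude(q1=20.0, q3=200.0))
```

Under these definitions, a prediction that exceeds the observed density by 248% is 3.48 times the observed value, and an IQR spanning 1.0 order of magnitude has an upper quartile 10 times the lower quartile.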
Optimal b and ε values
In all systems except the Speed River, the model tended to overestimate the observed fish densities. This bias generally was more pronounced in warm-water than in cold-water systems (Fig. 2A, B). Nevertheless, the consistency of this bias caused us to wonder if optimal parameter values could be identified. Therefore, we used a retrospective procedure to determine which b and ε values led to the most accurate model predictions. We selected all model-predicted densities that were within ±50% of their respective observed densities and examined the b and ε values that were used in each of the corresponding MC simulations. We also used 2-sample Mann–Whitney tests (2-tailed) to determine whether the medians of optimal b and ε values differed significantly from the medians of the baseline distributions shown in Fig. 1A, B. We used a nonparametric test because the numbers of observations within the baseline b and ε distributions and the optimal parameter data sets (see n in Table 2) differed and the optimal parameter data were not normally distributed.
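The retrospective selection step can be sketched as a simple filter over the Monte Carlo output. The simulation tuples below are hypothetical (b, ε, predicted density) values invented for illustration, and the function name is an assumption.

```python
import statistics

def select_accurate(sims, observed, tolerance=0.5):
    """Keep simulations whose predicted density is within +/- tolerance of observed."""
    lower = observed * (1.0 - tolerance)
    upper = observed * (1.0 + tolerance)
    return [s for s in sims if lower <= s[2] <= upper]

# Hypothetical (b, epsilon, predicted density) tuples, for illustration only.
sims = [
    (0.80, 0.10, 100.0),
    (0.90, 0.05, 40.0),
    (1.00, 0.08, 160.0),
    (0.70, 0.20, 149.0),
]
kept = select_accurate(sims, observed=100.0)
optimal_b = [b for b, _, _ in kept]
optimal_eps = [e for _, e, _ in kept]
print(len(kept), statistics.median(optimal_b), statistics.median(optimal_eps))
```

The retained b and ε values could then be compared with the baseline distributions using a 2-tailed Mann–Whitney test, for example with `scipy.stats.mannwhitneyu(optimal_b, baseline_b, alternative="two-sided")` if SciPy is available.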
Table 2
Ranges of the self-thinning exponent (b) and trophic transfer efficiency (ε) values that were used when the model-predicted fish densities were within ±50% of the observed densities. Median values and coefficients of variation (CV) are shown for each test system. p-values are shown for Mann–Whitney (2-tailed) test comparisons with the baseline b (median = 0.86, CV = 0.25) and ε (median = 0.10, CV = 0.53) distributions shown in Fig. 1A, B. n = number of Monte Carlo simulations (of 1000) in which the predicted density was within ±50% of the observed density.
This process revealed 2 notable patterns (Table 2). First, the optimal b values were significantly larger than the baseline b values in 5 of 8 systems, but none of the optimal b values were significantly smaller. Second, the optimal ε values were significantly smaller than the baseline ε values in 5 of 8 systems, but none were significantly larger than the baseline values. Furthermore, these differences were not unique to warm-water or cold-water systems. These results suggest that b in freshwater fish populations might be greater than the often cited value of 0.75 (e.g., Carbone and Gittleman 2002, Savage et al. 2004). Our results also suggest that ε might be less than the often cited value of 0.10 (e.g., Slobodkin 1960, Pauly and Christensen 1995) for predatory freshwater fishes.
Large fish densities
Mean body mass (M) increased considerably in 5 of the 8 test systems when eq. 3 was used to estimate M within subpopulations of large fishes (Table 3). For instance, M increased from 195.0 g to 567.2 g for smallmouth bass in the Rappahannock River and from 48.6 g to 93.6 g for Yellowstone cutthroat trout in eastern Idaho streams (see original M values in Table 1). However, average body mass did not increase in the Green River, Spring Creek, or the White Mountains, where the originally reported M values (Table 1) were typical of large fishes. For example, Carline et al. (1991) reported a mean adult M of 177.0 g for Spring Creek brown trout, whereas eq. 3 predicted M = 168.5 g for brown trout. Therefore, we excluded eq. 3 results for the Green River, Spring Creek, and the White Mountains from the remainder of our analyses.
Table 3
Mean body masses (M) in subpopulations of large fishes and their predicted densities (N). Equation 3 was used to estimate M for large (age 2+) fishes. M results were then used in eq. 2 to predict N for subpopulations of large fishes. Each reported M value is the mean of 1000 Monte Carlo (MC) simulations, with standard deviations shown in parentheses. All M values are in g wet mass. All N are in number/ha, rounded to the nearest integer. M results were approximately normally distributed and are shown as means, but N results (based on 1000 MC simulations) were right-skewed and are shown as percentiles. Nsimilar values are densities of large fishes that were observed in similar, but independent systems (see Results and footnotes below).
As expected, when the M estimates for large fishes were used in eq. 2, the model-predicted densities (N) decreased substantially (Table 3). On average, the median predicted N decreased by 80% in warm-water systems and by 58% in cold-water systems. Because we did not have observed N data that were specific to subpopulations of large fishes within these systems (i.e., the observed N values shown in Fig. 2A, B corresponded to the smaller M values shown in Table 1), we could not directly assess the accuracy of these lower N estimates. However, anecdotal evidence did suggest that the large fish estimates were reasonable in both warm-water and cold-water systems. For instance, based on a length–mass regression from Carlander (1977) for smallmouth bass in Ozark streams, we estimated that a 569.2-g (the mean M predicted for eastern Oklahoma streams; Table 3) smallmouth bass would have a total length (TotL) of ~345 mm. We then used length–frequency data from Reed and Rabeni (1989; their fig. 1) to estimate the percentage of all smallmouth bass that were ≥350 mm TotL in Ozark streams. When we multiplied this fraction by the total reported density (138 fish/ha; table 3 in Reed and Rabeni 1989), we estimated that ~16 large (≥350 mm TotL) smallmouth bass/ha were present in Ozark streams. This estimate was bracketed by the 25th and 50th percentiles of our model-predicted densities for large fishes in eastern Oklahoma (Ozark) streams (Table 3). We obtained similar results for smallmouth bass in the New River (Virginia and West Virginia), cutthroat trout in a Molalla River tributary (northwestern Oregon), and Yellowstone cutthroat trout in Marquette Creek (northwestern Wyoming). In each of these systems, the observed density of large fishes was bracketed by the 25th and 50th or the 50th and 75th percentiles of our model-predicted densities (Table 3). However, we did not find a comparable estimate of smallmouth bass densities within the Speed River (Ontario) region.
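The large-fish calculation can be sketched in Python under two explicit assumptions: that eq. 3 is a density-weighted mean of cohort body masses, and that relative cohort densities follow the self-thinning relationship ($N_j \propto M_j^{-b}$), which would explain why the same b is shared by eqs 2 and 3. The cohort masses below are hypothetical and the function name is illustrative.

```python
def large_fish_mean_mass(cohort_masses, b):
    """Density-weighted mean body mass across cohorts.

    Assumes cohort weights N_j proportional to M_j**(-b); any constant of
    proportionality cancels in the ratio.
    """
    weights = [m ** (-b) for m in cohort_masses]
    weighted_sum = sum(w * m for w, m in zip(weights, cohort_masses))
    return weighted_sum / sum(weights)

# Hypothetical mean cohort masses (g) for ages 2 through a maximum age of 6.
cohort_masses = [120.0, 260.0, 450.0, 680.0, 900.0]
mean_mass = large_fish_mean_mass(cohort_masses, b=0.86)
print(round(mean_mass, 1))
```

Because smaller cohorts receive larger weights, the weighted mean falls below the simple arithmetic mean of the cohort masses, and the gap widens as b increases.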
Discussion
Our objective was to determine whether a simple model could predict lotic fish densities in an efficient, yet reliable manner. In general, we found that it could. The model predictions were highly accurate (see below) in half of the test systems, and the IQRs of the model predictions bracketed the observed fish densities in all but 1 system (Fig. 2A, B). However, model accuracy was highly variable among systems, and model precision was consistently low. Therefore, we proceed with a more critical assessment of model accuracy and precision and provide specific recommendations for applying the model.
Model performance—is the model reliable?
Model accuracy
Whether the model is good enough to be applied is a subjective decision. Overall, we think that model accuracy was good in 3 of the 4 trout systems (eastern Idaho streams, Spring Creek, and White Mountains streams) and in the Speed River (Fig. 2A, B). Model accuracy was moderate in eastern Oklahoma streams and the Rappahannock River and poor in western Oregon streams and the Green River. We also acknowledge that model precision was generally low. However, no universal rules exist for accepting or rejecting the model (Pace 2001, Schnute and Richards 2001). Therefore, we suggest that comparisons with other field-measured data sets are a good starting point. These comparisons should provide objective benchmarks for evaluating the model. In this way, we can at least determine whether the model performs as well as standard field methods.
The studies of Mahon (1980) and Rodgers et al. (1992) were 2 particularly helpful benchmarks. Mahon (1980) estimated fish densities in 10 warm-water streams with a conventional field method (multiple-sample depletion; Schwarz and Seber 1999). Then he used rotenone to do a complete census in each stream and, therefore, was able to gauge the accuracy of his field-based estimates with a very high level of certainty. Mahon (1980) found that the field estimates tended to deviate from the true population densities by an average of 27%, but errors >60% were common. Rodgers et al. (1992) tested the accuracy of 2 field sampling methods (mark–recapture and multiple-sample depletion) by doing multiple surveys in a controlled stream (the stream was first depleted of all fish, then stocked with a known number of trout). They too found that field estimates tended to deviate from the true densities by an average of ~30% and that errors >60% were common.
These studies are instructive because the reported estimation errors are generally similar to our modeling errors, suggesting that our model accuracy is comparable to the accuracy of conventional, field-based methods. For example, the absolute differences between our median predicted densities and the observed densities ranged from ±10% to 75% in 4 of the test systems (Fig. 2A, B). Model prediction errors were larger (i.e., model accuracy was lower) in eastern Oklahoma streams and the Rappahannock River, but the upper bounds of the observed 95% CIs were within 42% and 23%, respectively, of the median predicted densities. Only in the Green River and western Oregon streams did we find the model-predicted densities to be grossly in error. These errors might reflect a fundamental need for better parameter estimates, but we propose a simpler explanation. The Colorado pikeminnow is historically the largest and most dominant predator in the Green River. However, it now competes with several nonnative piscivores, including smallmouth bass and northern pike (Esox lucius), which probably consume large proportions of the available food base (Johnson et al. 2008). Therefore, the discrepancy between our model predictions and the observed Colorado pikeminnow densities (Fig. 2A) might have arisen because we did not account for nonnative competitors. Partitioning the available resources among multiple species would inevitably reduce the predicted density of Colorado pikeminnow (see Expanding the model below). Similarly, cutthroat and rainbow trout are often the largest predatory fishes in western Oregon streams, but they share habitats and energetic resources with several species of predatory salamanders. In fact, salamander biomass often equals or exceeds trout biomass in these systems (Hawkins et al. 1983). Thus, the poor fit between our model-predicted densities and the observed density of trout in western Oregon streams might have arisen because we did not include amphibian competitors in our calculations.
Model precision
Our model predictions were highly variable in each test system (Fig. 2A, B). Focusing on the IQRs of the model predictions reduced this variability considerably (~1.1 orders of magnitude variability for the IQRs vs ~2.8 orders of magnitude variability for the 95% prediction intervals), but the IQRs were still much wider than the 95% CIs for empirical field estimates in most systems. One way to increase model precision and accuracy would be to implement the optimal b and ε values shown in Table 2. To do so, independent data should first be collected to validate or refute these optimal parameter values. Once they have been validated, they could be substituted for or used to modify the baseline b and ε distributions shown in Fig. 1A, B. Assuming that the optimal b and ε estimates are more narrowly distributed than their respective baseline distributions, we would expect to see large gains in model precision.
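The expected gain in precision can be illustrated with a minimal Monte Carlo sketch. It assumes that eq. 2 has the form N = NPPww × ε^(T − 1) × M^(−b) / PB, as implied by the multispecies example under Expanding the model. Every numerical value below is a hypothetical placeholder, including the uniform ranges used for b and ε. None is taken from Fig. 1 or Table 2, and the function names are illustrative only.

```python
import math
import random
import statistics

def predict_density(npp_ww, eps, t, pb, m, b):
    """Assumed form of eq. 2: N = NPPww * eps**(T - 1) * M**(-b) / PB."""
    return npp_ww * eps ** (t - 1) * m ** (-b) / pb

def iqr_width_log10(b_range, eps_range, n_draws=5000, seed=1):
    """Width of the IQR of predicted N, in orders of magnitude."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Hypothetical uniform sampling; the source samples literature values.
        b = rng.uniform(*b_range)
        eps = rng.uniform(*eps_range)
        # Remaining parameters are fixed, arbitrary placeholders.
        draws.append(predict_density(npp_ww=1000.0, eps=eps, t=3.5,
                                     pb=1.0, m=100.0, b=b))
    q1, _, q3 = statistics.quantiles(draws, n=4)
    return math.log10(q3 / q1)

# Broad "baseline" ranges versus narrower "optimal" ranges (both hypothetical).
wide = iqr_width_log10(b_range=(0.5, 1.3), eps_range=(0.05, 0.20))
narrow = iqr_width_log10(b_range=(0.8, 1.0), eps_range=(0.08, 0.12))
print(f"baseline IQR width: {wide:.2f} orders of magnitude")
print(f"narrowed IQR width: {narrow:.2f} orders of magnitude")
```

Under these assumptions, the narrower ranges give a smaller IQR width. The size of the gain depends entirely on the distributions that are substituted.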
However, for now, low precision appears to be a reality with which we must work. What then would be an appropriate use for our model? McElhany et al. (2010) suggested that models with wide prediction intervals (i.e., low precision) are best suited for use as relative indicators of fish abundance. For instance, a low-precision model might not be a reliable tool for estimating fish densities at specific points in space and time, but it could be suitable for predicting whether fish densities are likely to increase or decrease through time (McElhany et al. 2010). We agree that our low-precision model is well suited to make such relative predictions, but we also submit that the model is capable of predicting absolute fish densities at coarse spatial or temporal scales.
For example, we calculated CVs for the IQRs (hereafter CVIQR) of the predicted densities by removing the 1st and 4th quartiles from each set of model predictions (i.e., we calculated conventional CVs based only on the data included in the IQR). This truncation was arbitrary, but it allowed us to focus on the most probable model results, and it can be applied easily in a consistent manner. The CVIQRs, which ranged from 0.49 to 0.60 in cold-water systems and from 0.68 to 0.90 in warm-water systems (Fig. 2A, B), were very similar to CVs that have been reported for replicated field samples. For instance, Gibbs et al. (1998) compiled interannual (min 5 y data sets) population estimates for 42 salmonid (cold-water) populations and 30 nonsalmonid (warm-water) populations. The resulting mean CVs were 0.47 and 0.71, respectively. A similar interannual CV (mean = 0.49) was reported for 43 trout populations by Dauwalter et al. (2009). Also, Petty et al. (2005) showed that these CVs are typical of site-to-site variation within a given stream. They estimated trout densities at 11 sites within an Appalachian stream and found that the average among-site CV was 0.48.
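As an illustration only, the CVIQR calculation described above could be sketched as follows. The quartile interpolation method and the use of the sample standard deviation are assumptions, because the text does not specify either. The function name `cv_iqr` and the example densities are hypothetical.

```python
import statistics

def cv_iqr(predictions):
    """CV of the predictions lying between the 25th and 75th percentiles."""
    # Drop the 1st and 4th quartiles, keeping only values inside the IQR.
    q1, _, q3 = statistics.quantiles(predictions, n=4, method="inclusive")
    middle = [x for x in predictions if q1 <= x <= q3]
    # Conventional CV (SD / mean) computed on the retained values only.
    return statistics.stdev(middle) / statistics.mean(middle)

# Hypothetical model output (fish/ha); not values from Table 3 or Fig. 2.
example = [2.0, 4.0, 7.0, 11.0, 16.0, 24.0, 40.0, 75.0, 160.0]
full_cv = statistics.stdev(example) / statistics.mean(example)
print(f"CV of all predictions: {full_cv:.2f}")
print(f"CVIQR:                 {cv_iqr(example):.2f}")
```

Because the truncation removes both tails of a right-skewed set of predictions, the CVIQR is smaller than the CV of the full set.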
Because the CVIQRs of our model-predicted densities are so similar to observed interannual and among-site CVs, we submit that the CVIQR might be a useful index of the site-to-site or year-to-year variability one would expect in natural systems. Thus, the median predicted density could be used to estimate average fish densities at regional scales (e.g., watersheds), or over multiyear (e.g., 5–10 y) intervals, whereas the CVIQR could be used to characterize natural fluctuations around that average. In this way, the model would have little relevance for fine-scale decision making, but we think it could be extremely helpful in regional or long-term planning.
Applying the model
Regional-scale prediction
Fisheries scientists recognize that large-scale phenomena cannot always be anticipated on the basis of small-scale observations and have begun to place greater emphasis on regional studies and predictive capabilities (Lewis et al. 1996). We propose that our fish density model is uniquely capable of meeting this challenge for 2 reasons. First, the model includes only 6 parameters (NPPww, ε, T, PB, M, and b) and 3 assumed constants (autochthonous and allochthonous assimilation efficiencies, and the C to wet mass conversion factor). This simplicity should significantly reduce the parameter estimation burden and make the model easier to apply. Second, the model is scalable. The self-thinning equation ($M^{-b}$) is scale-invariant (Wu and Li 2006), whereas ε and T are unitless values. Therefore, the model can be applied at any scale of interest, as long as NPP can be estimated at a comparable scale and basic information on species' distributions, body masses, and feeding behaviors can be obtained.
Much of this information already is available for a wide range of North American streams and rivers. For instance, basic information on fish species' distributions and life histories is readily obtainable through regional fish atlases and online databases, such as FishBase (www.fishbase.org), FishTraits (www.cnr.vt.edu/fisheries/fishtraits), and NatureServe Explorer (www.natureserve.org/getData/dataSets/watershedHucs/index.jsp). Specifying regional NPP estimates might prove more difficult. Because empirical NPP measurements require labor-intensive methods (Bott et al. 1978), the existing database is much smaller. Nevertheless, quantitative NPP estimates have been compiled for many systems that were not considered here (e.g., Bott et al. 1985, Webster and Meyer 1997). Furthermore, stream metabolism research is rapidly maturing from a primarily descriptive science to a synthetic, predictive one (e.g., Lamberti and Steinman 1997, Mulholland et al. 2001, Tank et al. 2010). Moreover, remote sensing networks, such as those scheduled for use in the Stream Observation Network Experiment (Schimel et al. 2009), have the potential to generate large quantities of NPP data. Thus, we are optimistic that the necessary NPP estimates will, in time, be available to run our model at regional or even continental scales.
Predicting reference conditions
Our model assumes that 100% of the available energetic resources at a given trophic level will be consumed and converted to fish tissue. Thus, in effect, it is a carrying capacity model (Christensen and Pauly 1998, Marquet et al. 2005). However, the model also assumes that other constraints on fish population density (e.g., habitat availability, interspecific competition, anthropogenic disturbance) are negligible. For this reason, it also could be used as a reference condition model (Hawkins et al. 2010), which is why we screened highly exploited or degraded systems from the test data sets. Such influences probably would have constrained the empirical fish densities to artificially low levels and biased the modeling results.
Reference condition models serve an important role in applied research. Models are often the best option for characterizing a natural or predisturbance state when environmental degradation is extensive or severe (Hawkins et al. 2010). This strategy has been used repeatedly in freshwater biological assessments. However, these assessments usually have sought to predict community-level metrics, such as species composition and relative abundance. To our knowledge, a reference condition model of absolute abundance (i.e., population density) has not yet been applied in a freshwater biological assessment (see Jennings and Blanchard 2004 for a marine example). Therefore, our model presents an opportunity to implement this capability. Used in this way, the model could predict expected reference densities of selected fish species throughout a drainage or region. Empirically measured deviations from these reference values could then be used to assess the severity of a disturbance.
The ability to predict carrying capacity or reference conditions also could be useful in the context of fisheries management. Anecdotal evidence suggested that the model is an effective tool for predicting the densities of relatively large fishes (see Large fish densities in Results). The mean body masses (M) that were predicted by eq. 3 are close to the sizes that managers typically regulate. For example, we used eq. 3 to estimate that a typical adult smallmouth bass weighs ~570 g (Table 3). Based on a length–mass regression in Carlander (1977), this mass equates to ~345 mm TotL. In many areas of North America, 305 mm TotL is the minimum size at which smallmouth bass can be harvested legally (Austen and Orth 1988). Equation 3 also predicted that a typical adult cutthroat trout in western Oregon would weigh ~93 g (Table 3), which equates to ~208 mm TotL (Carlander 1969). The minimum harvestable size for rainbow trout in the state of Oregon is 8 inches, or ~203 mm TotL (www.dfw.state.or.us/resources/fishing/).
Because these regulation thresholds are so close to M predicted by eq. 3 (when length–mass regressions are used to convert the regulated lengths to body masses), values of N shown in Table 3 are essentially per-unit-area estimates of the numbers of harvestable fishes in each system. Therefore, if the total stream-channel surface area in a system were known, one could use the N values in Table 3 to predict the total standing stock or carrying capacity of the system. In particular, we suggest that the median predicted N might provide useful estimates of typical standing stocks, whereas the CVIQR might provide robust approximations of the spatial and temporal variation that managers should expect under natural or reference conditions. Applied in this manner, the model would not be particularly useful for tactical decisions, such as deciding whether to open or close a fishery within a particular stream or river, but it might be helpful in more broad-scale efforts, such as setting annual creel limits within large river basins or cataloging total fish abundances at regional scales.
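The scaling step described above amounts to simple arithmetic, sketched here under stated assumptions. Treating the CVIQR as a symmetric band of ± CVIQR × total around the median-based total is only one possible reading of "natural fluctuations". The function name and all input values are hypothetical.

```python
def standing_stock(median_density_per_ha, channel_area_ha, cv_iqr):
    """Scale a per-hectare density to a system total with a CV-based band."""
    total = median_density_per_ha * channel_area_ha
    # Assumption: fluctuations are summarized as +/- CVIQR x total.
    spread = cv_iqr * total
    # A standing stock cannot be negative, so the lower bound is floored at 0.
    return total, max(total - spread, 0.0), total + spread

# Hypothetical inputs: 20 harvestable fish/ha, 150 ha of channel, CVIQR = 0.7.
total, low, high = standing_stock(20.0, 150.0, 0.7)
print(f"typical standing stock: {total:.0f} fish "
      f"(expected range {low:.0f} to {high:.0f})")
```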
Expanding the model
The model cannot currently predict disturbance effects, but we do see potential to integrate them. For instance, the self-thinning relationship is sensitive to temperature (Brown et al. 2007). Therefore, it should be possible to add temperature to the model to predict whether climate change is likely to increase or decrease the densities of selected fish species (Harris et al. 2006). Similarly, it might be possible to add to the model disturbance coefficients that could be calibrated to reflect anthropogenic disturbances, such as pollution or physical habitat degradation. In this effort, comparisons with more sophisticated bioenergetics models would be particularly helpful. For example, the Bioaccumulation and Aquatic System Simulator (BASS) is a combined individual growth and diffusion kinetics model that tracks the ingestion, excretion, assimilation, and physiological effects of a variety of metals and organic pollutants (Barber 2008). One could use BASS to estimate fish densities in systems that have and have not been exposed to a pollutant of concern, then use these differences as correction factors in our model (e.g., Hg concentrations above x reduce fish density by a factor of y).
The model also could be modified to predict the densities of multiple predator species with different body masses or of species in lower trophic levels. One would first need a list of all species that occur at a given trophic level in a particular system. An algorithm would then be needed to partition the available energetic resources (which are estimated by $\mathrm{NPP}_{ww}\,\varepsilon^{T-1}$ in eq. 2) among co-occurring species. The simplest method would be to assume energetic equivalence (Bohlin et al. 1994, Brown et al. 2007) and divide resources equally among species. Alternatively, one could use direct measurements of species interactions or consumption rates to inform the resource-partitioning process. For instance, Roell and Orth (1993) quantified invertebrate consumption by 3 co-occurring predators within the New River. They found that smallmouth bass, rock bass (Ambloplites rupestris), and flathead catfish (Pylodictis olivaris) consumed 35, 31, and 10%, respectively, of the primary food resource (crayfish). These consumption estimates could be entered directly into our model to predict fish densities in multispecies systems (e.g., smallmouth bass $N = 0.35\,[\mathrm{NPP}_{ww}\,\varepsilon^{T-1} M^{-b}]/\mathrm{PB}$).
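A minimal sketch of this partitioning step is given below. It follows the smallmouth bass example by multiplying the resources available at trophic level T by each species' share before applying the self-thinning term. The consumption shares are those reported by Roell and Orth (1993). The body masses, NPPww, ε, T, PB, and b are hypothetical placeholders, and the function names are illustrative only.

```python
def equal_shares(masses):
    """Energetic equivalence: divide resources equally among species."""
    share = 1.0 / len(masses)
    return {name: (share, mass) for name, mass in masses.items()}

def partitioned_densities(npp_ww, eps, t, pb, b, species):
    """species maps name -> (resource share, mean body mass M)."""
    # Resources reaching trophic level T (the NPPww * eps**(T - 1) term).
    available = npp_ww * eps ** (t - 1)
    return {
        name: share * available * mass ** (-b) / pb
        for name, (share, mass) in species.items()
    }

# Shares from Roell and Orth (1993); body masses (g) are hypothetical.
new_river = {
    "smallmouth bass": (0.35, 500.0),
    "rock bass": (0.31, 100.0),
    "flathead catfish": (0.10, 1000.0),
}
params = dict(npp_ww=1000.0, eps=0.1, t=3.5, pb=1.0, b=0.9)  # hypothetical

measured = partitioned_densities(species=new_river, **params)
equal = partitioned_densities(
    species=equal_shares({k: m for k, (_, m) in new_river.items()}), **params
)
for name in new_river:
    print(f"{name}: measured shares {measured[name]:.4f}, "
          f"equal shares {equal[name]:.4f}")
```

The sketch applies one T and one PB to all species, which is a simplification. Species-specific values could be passed in the same way as the body masses.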
These types of modifications would increase model complexity, but they also would broaden the model's relevance and utility. Therefore, we suggest that such modifications might strike a balance between model simplicity and model applicability.
Acknowledgments
We thank John Van Sickle, Brenda Rashleigh, Tom Purucker, Seth Wenger, and 2 anonymous referees for thoughtful comments and feedback. This manuscript is a contribution to the US Environmental Protection Agency, Office of Research and Development's Ecosystem Services Research Program. It has been reviewed in accordance with the Agency's peer and administrative review policies and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.