The goal of this paper is to help managers better understand implications of using aggregate community metrics, such as taxon richness or Indices of Biotic Integrity (IBI), for detecting threshold responses to anthropogenic environmental gradients. To illustrate, we offer an alternative analytical approach, Threshold Indicator Taxa ANalysis (TITAN), geared toward identifying synchronous changes in the distribution of multiple taxa as evidence of an ecological community threshold. Our approach underscores the fundamental reality that which taxa are affected by stressors is important, both from a conservation standpoint and because taxon-specific life-history traits help us understand relevant mechanisms. First, we examine macroinvertebrate community response to an impervious cover gradient using a well-studied biomonitoring data set to show that representative community metrics are relatively insensitive to synchronous threshold declines of numerous individual taxa. We then reproduce these response relationships using a simulated community data set with similar properties to demonstrate that linear or wedge-shaped responses of community metrics to anthropogenic gradients can occur as an artifact of aggregating multiple taxa into a single value per sampling unit, despite strong nonlinearity in community response. Our findings do not repudiate the use of community metrics or multimetric indices, but they challenge assumptions that such metrics are capable of accurately reflecting community thresholds across a broad range of anthropogenic gradients. We recommend an alternative analysis framework that begins with characterization of the responses of individual taxa and uses aggregation only after distinguishing the magnitude, direction, and uncertainty in the responses of individual members of the community.
Recent publications demonstrate a growing interest in the application of ecological thresholds for natural-resource management (e.g., Huggett 2005, Groffman et al. 2006, Suding and Hobbs 2009). Threshold detection has broad relevance to management of aquatic ecosystems in areas such as conservation (DeLuca et al. 2008, Hilderbrand et al. 2010), biological invasions (King et al. 2007), ecosystem restoration (Walsh et al. 2005a, Martin and Kirkman 2009, Clements et al. 2010), development of numerical water-quality criteria (King and Richardson 2003, Soranno et al. 2008), ecosystem management (Richardson et al. 2007), and forecasting effects of climate change (Smol et al. 2005).
Despite the recent interest in ecological thresholds, application of the threshold concept to aquatic-resource management remains tentative, if not contentious (e.g., Gaiser et al. 2008, Richardson et al. 2008). Threshold estimation depends on the selection of response variable, assumed shape of the response, and appropriateness of the corresponding statistical model, any of which can contribute to different interpretations regarding the location of a threshold or whether a threshold exists (e.g., Walsh et al. 2005b, Moore and Palmer 2005, Dodds et al. 2010). Compounding this problem is the wide array of statistical approaches available for identifying ecological thresholds (e.g., Toms and Lesperance 2003, Qian et al. 2003, 2004, Paul and McDonald 2005, Brenden et al. 2008, Andersen et al. 2009, Sonderegger et al. 2009, Baker and King 2010). Selection of an appropriate technique for a specific application from this increasingly lengthy list might not be obvious even to experienced analysts. Thus, underlying assumptions that might influence interpretation of the results should be assessed before determining the best approach for identifying patterns of response in community data sets.
The goal of our paper is to help scientists and managers better understand the implications of using aggregate community metrics, such as taxon richness or Indices of Biotic Integrity (IBI), for detecting thresholds in responses to anthropogenic environmental gradients. Such metrics are often assumed to reflect the direction, magnitude, and shape of community response to anthropogenic stressors. To illustrate these implications, we offer an alternative analytical framework geared toward identifying synchronous changes in the distribution of multiple taxa as evidence of an ecological community threshold. Our motivation is to address a growing tendency to apply analytical methods involving community metrics that aggregate multiple taxa into a single value per sample unit without acknowledging that these aggregate values might not accurately represent responses of many contributing taxa.
Aggregating taxa can enhance signal of response to an anthropogenic gradient in some instances, but we have found that they frequently obscure nonlinear responses of multiple taxa in a community. Therefore, we suggest a community analysis framework that begins by characterizing the responses of individual taxa to decompose community behavior into its basic components before developing expectations about community-level responses from a priori aggregations of taxa (Baker and King 2010). This approach underscores the fundamental reality that we need to know which taxa are affected by stressors, both from a conservation standpoint and because taxon-specific life-history traits can help us understand relevant mechanisms.
To illustrate the potential utility of our approach for management applications, we first examined macroinvertebrate community response to an impervious-cover gradient using a well-studied biomonitoring data set (King et al. 2005, Baker and King 2010). We chose impervious cover because it is a documented cause of ecological community degradation in streams (Walsh et al. 2005b) and because conflicting biological responses have been reported using similar data (e.g., Booth 2005, King et al. 2005, Moore and Palmer 2005). We used these data to show that representative community metrics commonly used in biomonitoring are insensitive to threshold declines of numerous individual taxa revealed by our alternative approach. We then reproduced these response relationships with a simulated community data set with similar properties to illustrate underlying reasons for the pattern and their potential to confound threshold detection in other data sets. We contrasted aggregate-metric results with those from a new analytical technique, Threshold Indicator Taxa ANalysis (TITAN; Baker and King 2010). We discuss how TITAN or similar methods could be used to detect and interpret biodiversity and community thresholds more effectively.
Methods
We used macroinvertebrate community data collected from wadeable streams of Maryland's Coastal Plain (295 sites, 177 taxon abundances from 100-individual fixed counts, mostly genus-level identification). Data were collected by the staff of the Maryland Biological Stream Survey (MBSS; Klauda et al. 1998) as part of a state-wide monitoring program to assess the biological integrity of Maryland's streams. These data were used in a previous study on analytical considerations for linking watershed land cover to ecological communities in streams (King et al. 2005) and as a case study for validating TITAN (Baker and King 2010).
We used the number of Ephemeroptera, Plecoptera, and Trichoptera (EPT) taxa and Maryland's Benthic Index of Biotic Integrity (B-IBI) as community metrics in response to % watershed impervious cover (RESAC 2003). We used EPT because it and other taxonomic richness metrics are among the most widely used metrics of stream biological condition and have been used in recent threshold analyses (e.g., Evans-White et al. 2009). We used B-IBI because it combines several univariate community metrics into an aggregate index and is directly applicable to management because it is used in Maryland to classify streams according to biological condition (Klauda et al. 1998).
We used locally weighted quantile regression (loess-QR) to characterize the response of EPT and B-IBI to % impervious cover at the middle (50% quantile) and the ceiling (90% quantile) of the response distribution (Koenker 2005). We used loess-QR because it imposes minimal assumptions on response shape while contrasting different parts of the response distribution. In loess-QR, a sharp change in the response with increasing values of the predictor results in a corresponding nonlinear change in the slope of the regression line. We did not attempt to estimate statistically changepoints in EPT and B-IBI responses because scatterplots and the loess-QR evaluation provided a concordant and sufficiently clear depiction of the response to support the point of our illustration. Also, a comprehensive comparison of univariate threshold methods was beyond the scope of our paper. Loess-QR was done with the quantreg package in R.2.9.2 (R Development Core Team, Vienna, Austria).
We used TITAN to detect community thresholds, if present, with log10(x)-transformed abundances of each of 177 taxa in response to the impervious-cover gradient. Details of the TITAN method can be found in Baker and King (2010). Briefly, TITAN splits sample units into 2 groups at the value of a predictor variable that maximizes association of each taxon with 1 side of the partition. Association is measured by taxon abundances weighted by their occurrence in each partition (Dufrêne and Legendre 1997) and standardized as z-scores to facilitate cross-taxon comparison via permutation of samples along the predictor. TITAN distinguishes declining and increasing taxa and tracks the cumulative responses of increasing and decreasing taxa in the community. Bootstrapping is used to identify reliable threshold indicator taxa and the uncertainty around the location of taxon and community change points. Evidence for a community threshold is obtained through synchronous changes in the abundance of many taxa within a narrow range of predictor values. TITAN was run with the TITAN package in R.2.9.2.
Based on the results of the impervious-cover analysis, we simulated the response of 36 taxa across 200 sampling units to a uniformly distributed environmental (ENV) gradient (runif in R.2.9.2, ENV range = 0–100) to test whether, given similar taxon responses, we could reproduce the community metric response while identifying simulated, known taxon thresholds with TITAN. This simulation was similar to those in Baker and King (2010) but involved more taxa and sampling units. Briefly, taxon abundances were generated using a negative binomial distribution (rnbinom in R.2.9.2) to set different probabilities of abundance above and below a threshold value of ENV to simulate noisy, heterogeneous, and sparse site × taxon matrices typical of community data (McCune and Grace 2002). Eighteen taxa were simulated to decline (weak to strong responses), and of these, 10 taxa synchronously declined at ENV = 30, 4 at ENV = 60, and 1 each at ENV = 10, 20, 40, and 50 (Appendix 1; available online from: http://dx.doi.org/10.1899/09-144.1.s1 (10.1899_09-144.1.s1.doc)). Of the remaining 18 taxa, 9 gradually increased in frequency and abundance along ENV, whereas 9 others did not change in response to ENV. The simulation was run 6 times to assess consistency in results (additional iterations were unnecessary; see Results). In each case, we used number of taxa (NTAXA) per sampling unit as a community response variable. NTAXA was analogous to EPT used in the impervious-cover analysis and, like any variable used to detect community thresholds, should respond when ∼30% of the simulated taxa show synchronous declines at ENV = 30. We computed NTAXA for each of the sample units, assessed its distribution across ENV with loess-QR and compared it with TITAN across the 6 simulated data sets.
Results
MBSS data analysis
EPT and B-IBI showed a wedge-shaped decline in response to % impervious cover that was consistent with a factor-ceiling, linear-response model. EPT declined more sharply than B-IBI but showed little evidence in support of a threshold (Fig. 1A, C). The decline of the 50% quantile of B-IBI steepened between 18% and 28% impervious cover, suggesting a relatively rapid change in condition. However, slopes of quantiles for both community metrics were 0 or close to 0 from 0 to 10% imperviousness (Fig. 1B, D).
Figure 1
Loess quantile regressions for response of the Benthic Index of Biotic Integrity (B-IBI) (A, B) and Ephemeroptera, Plecoptera, Trichoptera (EPT) (C, D) metrics to % impervious cover (n = 295) across the full gradient (A, C) and between 0 and 10% impervious cover (B, D).

TITAN revealed numerous declining taxa (z−) between 0.5 and 2% impervious cover, whereas change points for increasing taxa (z+) were distributed from 1 to 33% (Fig. 2A, B, Appendix 2; available online from: http://dx.doi.org/10.1899/09-144.1.s2 (10.1899_09-144.1.s2.doc)). Asynchrony among increasing taxon change points and broad confidence limits produced a relatively weak aggregate signal of community change. However, synchrony among many declining taxa coupled with narrow confidence limits resulted in a sharp peak in sum(z−), a result providing strong evidence for a community threshold at ∼1% impervious cover (Fig. 2A, B).
Figure 2
Threshold Indicator Taxa ANalysis (TITAN; Baker and King 2010) of macroinvertebrate community response to % impervious cover (n = 295) showing sum(z) across the full gradient (A) and between 0 and 10% impervious cover (B), and significant indicator taxa across the full gradient (C) and between 0 and 10% impervious cover (D). TITAN sum(z−) and sum(z+) values correspond to candidate change points (xi) along the gradient. Peaks in sum(z−) correspond to locations along the gradient where synchronous declines of taxa occur, with the most substantial peak occurring at 0.93% (0.35–1.90%, 90% CI). Solid and dashed lines represent the cumulative frequency distribution of change points (xcp [thresholds]) among 100 bootstrap replicates for sum(z−) and sum(z+), respectively. In panels C and D, significant (purity ≥ 0.95, reliability ≥ 0.90, p ≤ 0.05) indicator taxa are plotted in increasing order with respect to their observed change point. Solid symbols correspond to negative (z−) indicator taxa, whereas open symbols correspond to positive (z+) indicator taxa. Symbols are sized in proportion to magnitude of the response (z scores). Horizontal lines overlapping each symbol represent 5th and 95th percentiles among 100 bootstrap replicates. Arrows identify Ephemeroptera, Plecoptera, Trichoptera taxa. (See Appendix 2 for taxon codes.)

Examination of the abundance distributions of individual taxa showed many decliners that had a sharp change in their abundance and occurrence patterns across the gradient (Fig. 3), so change points were relatively easy to determine. In contrast, increasing taxa often had a wedge-shaped proliferation pattern that led to lower confidence in the location of the change point (Fig. 3). Many of the negative threshold taxa were EPT taxa (indicated by arrows in Fig. 2D) despite the very weak decline in the aggregate measure, number of EPT taxa, between 0 and 10% cover (Fig. 1D). In fact, of the 48 EPT taxa that occurred ≥5 times in the data set, >½ had no predictable response, 18 showed threshold declines, and 2 increased in response to % impervious cover.
Simulated data analysis
The simulation was designed to mimic the patterns of increase, decline, or no change in individual taxa observed across the impervious-surface gradient (Fig. S1; available online from: http://dx.doi.org/10.1899/09-144.1.s3 (10.1899_09-144.1.s3.jpg)). Nevertheless, consistent with the EPT and B-IBI response to % impervious cover, NTAXA exhibited a wedge- or weak linear-to-threshold-shaped decline (Fig. 4). Loess-QR fits revealed relatively linear decline of both the 50th and 90th quantiles across all 6 simulations. Nonlinearities, if they occurred, appeared to be variable along ENV depending on the simulation, with the strongest evidence for a nonlinear change at ENV ≈ 60 (where 4 taxa declined) and not at ENV = 30 (where 10 taxa declined). In contrast, all strong threshold decliners (z−) were detected accurately by TITAN among simulations, and each weak threshold decliner was detected in at least 1 simulation (Fig. 5, Appendix 1). The synchronous decline of taxa was evident at ENV = 30 (Fig. 5), resulting in a clear and consistent peak in community-level change (sum[z−]) across all simulations (Fig. 6). As in the Maryland data, some of the positive-responding taxa (z+) were identified as significant increasers, but were not synchronous and exhibited wide confidence intervals consistent with a gradual increase in frequency and abundance.
Figure 4
Loess quantile regressions for responses of number of taxa (NTAXA) to the simulated environmental gradient (n = 200) for each of 6 simulated data sets, shown in rank order from 1 (upper left corner) to 6 (bottom right corner). The gray vertical line indicates the location of the simulated community threshold (ENV = 30).

Figure 5
Threshold Indicator Taxa ANalysis (TITAN) of 36 taxon abundances to a simulated environmental gradient (n = 200) for each of 6 simulated data sets, shown in rank order from 1 (upper left corner) to 6 (bottom right corner). Significant (purity ≥ 0.95, reliability ≥ 0.90, and p ≤ 0.05) indicator taxa are plotted in increasing order with respect to their observed change point. See Fig. 2 for an explanation of the figure. The location of the simulated community threshold (ENV = 30) is shown by the vertical line.

Figure 6
Threshold Indicator Taxa ANalysis (TITAN) of 36 taxon abundances to a simulated environmental gradient (n = 200) for each of 6 simulated data sets, shown in rank order from 1 (upper left corner) to 6 (bottom right corner). See Fig. 2 for an explanation of the figure. The location of the simulated community threshold (ENV = 30) is shown by the gray vertical line.

Discussion
Our results demonstrate that community metrics used in biomonitoring can be relatively insensitive measures of nonlinear community change in response to environmental gradients. Regardless of the sophistication represented by most quantitative threshold analysis techniques, nearly all suffer from a fundamental requirement of univariate community response variables, and thus, are subject to potential insensitivity when used for threshold detection. We used TITAN to quantify taxon-specific results to show that substantial changes in community structure are obscured by these summary metrics. Furthermore, through simulation and empirical analysis we demonstrated that linear (Moore and Palmer 2005) or factor-ceiling (Booth 2005, Paul et al. 2009) responses to impervious cover and other anthropogenic gradients can occur as an artifact of the response variable despite substantial nonlinearity in community structure.
What community metrics or multimetric condition indices necessarily hold in common is the a priori aggregation of responses among multiple taxa, regardless of location, direction, or magnitude of taxon response. Aggregate metrics obscured community thresholds in our analyses, and we suggest that aggregation probably will obscure thresholds in other studies because species sensitive to anthropogenic disturbance are often relatively uncommon. Thus, synchrony in their response will necessarily be diluted by ubiquitous or tolerant taxa (Fig. 7). Species occupy distinct niches by definition, but aggregation by coarse taxonomic (e.g., EPT), feeding (e.g., % filterers), habit (e.g., % clingers), or other classifications implies they are equivalent, undervalues taxonomic uniqueness (Lenat and Resh 2001), and necessarily underestimates the diversity of potential responses to different gradients. A priori aggregation might be appropriate when selected responses to a specific gradient are well understood, but could be misleading when our understanding is limited. Regardless of the inherent logic behind a particular type of metric, any aggregation of multiple taxa into a single measure per sample unit probably will suffer from similar limitations (Walsh 2006).
Figure 7
Conceptual diagram illustrating theoretical responses of different taxa to a novel environmental gradient. The value of x where it intercepts y corresponds to the minimum possible level of the anthropogenic stressor or a level that falls within the normal range of conditions experienced during evolutionary time. The response of Taxon A represents no change in its distribution along the gradient until a critical change point or zone is reached (shaded region), which leads to a nonlinear decline and eventual extirpation at a level beyond the initial change point. Taxon B represents a native taxon that is tolerant of the novel gradient and either directly (resource subsidy) or indirectly (e.g., realized niche expansion, reduced competition or predation) benefits, resulting in an indeterminate increase in its frequency and abundance among sites with increasing values of the gradient. Taxon C represents an invasive taxon that was not historically present but is able to cross ecosystem boundaries because of a variety of factors related to the novel environment. The combination of these taxon responses and individual taxa showing no response (Taxon D) into aggregate community variables can cause a misleading representation of community response to a novel gradient (e.g., Figs 1A–D, 4).

Our results suggest that aggregate metrics might not be broadly useful for detecting community thresholds, but the limitations described above do not invalidate univariate methods or their interpretation. Many analytical approaches often used to assess univariate threshold responses are unlikely to detect community thresholds as described above, but are otherwise quite appropriate for their intended applications. For example, piecewise or quantile regression might well be the analysis of choice for quantifying wedge-shaped responses in our data sets. The danger comes from interpreting their results in the context of a community threshold and especially when used as evidence for the absence of a threshold response.
Our results emphasize the need for a more explicit definition and conceptual model during analysis of thresholds in community data sets and highlight challenges limiting existing analytical methods. A clear conceptual model draws attention to discord between what we hope to identify and what we actually quantify using any analytical method. Here, we define ecological community thresholds narratively as synchronous changes in the abundance or occurrence of multiple taxa within a larger assemblage (among a population of sites) occurring along an environmental gradient. Numerically, synchronous change points of multiple taxa are estimated with a high degree of statistical certainty by sum(z) in TITAN. However, community-level thresholds revealed by TITAN should not be construed to mean that no change occurs elsewhere along the gradient. Individual taxon thresholds might deviate from a community threshold, and such taxon responses could be early-warning signals if they decline below the community threshold, or indicative of additional degradation if they occur above the threshold. For example, in Fig. 2C, D, several taxa declined between 2 and 10% impervious cover despite the large synchronous decline near 1%. Thus, a relatively large change occurred at low levels of urbanization, but these results also revealed a continuum of declining taxa with increasing impervious cover.
We caution against the assumption that thresholds will exist under all circumstances. Instead, we advocate for quantifying taxon-specific responses and aggregating signals rather than seeking signal in aggregates. However, we do not favor aggregation of threshold indicator taxa to create new, stressor-specific univariate metrics, such as % sensitive taxa, unless the objective in doing so is clear. We suggest that far more information is available for interpreting community-level change by examining the taxon-specific output and associated life-history traits and evolutionary relationships of these individual taxa. Furthermore, we contend that methods for identifying ecological community thresholds must quantify and communicate inherent uncertainty (confidence or credibility intervals) associated with the existence and location of any threshold estimate.
We developed TITAN specifically to address the problems described in our paper. The method has performed well in simulations and retrospective analysis of real data sets. It can distinguish the response direction, magnitude, location of change in individual taxa, and provide an assessment of uncertainty about the location and synchrony of taxon change points as evidence for community thresholds. The program is available for unrestricted use by state or federal researchers and managers using the free statistical package R and includes documentation and examples to help users apply the technique (Baker and King 2010). TITAN is not the only method available, but we hope the philosophy behind our approach will encourage further development, exploration, and discovery. We think threshold analyses offer unrealized potential for linking evolutionary context and life-history characteristics to ecology and environmental management because they emphasize more explicit conceptual models and concordant analytical methods.
Acknowledgments
Both MEB and RSK contributed equally to the contents of this manuscript. We thank Allison Roy and 2 anonymous referees for helpful comments on an earlier version our paper. Some of the ideas expressed in this paper were developed through subcontracts to RSK and MEB through a grant from US EPA Science to Achieve Results program, agreement #R-82868401 to S. Prince and D. Weller and a grant from US EPA Region 6 to RSK, agreement CP-966137-01.