Modeling the spatial effects of disturbance: a constructive critique to provide evidence of ecological thresholds

Biologists and conservation planners are frequently asked to evaluate the spatial effects of anthropogenic disturbance on species of conservation concern. The linear response of a demographic parameter, such as survival or abundance, to the distance-from-disturbance is often used to inform spatial restrictions on development. The linear response, we argue, does not model the most common biological mechanisms that cause changes to demographic parameters, nor does it provide an estimate of a threshold that planners could use to protect species of concern. In the Great Plains of North America, biologists are increasingly concerned about the impact of energy development on populations of four species of grouse. To address this gap in our ability to properly assess distance thresholds, we developed a framework of four response patterns (null, linear, stair step, ramped) to describe the potential effects of a disturbance on biological processes relevant to nesting grouse located along a gradient from the disturbance. We simulated position and survival of grouse nests along a 25-km disturbance gradient to mimic the response to disturbances. We evaluated the relative support for a set of linear and nonlinear models in a known fate analysis of nest survival. Each of the underlying response patterns was detected with an appropriate model in a model selection framework (ωAIC = 0.61–0.75) when the sample size of nests was high (n = 500), and thresholds were identified when present. In a low sample size scenario (n = 50 nests) that may be typical of short-term empirical sampling schemes, the stair step threshold was detected, but the more complex, ramped threshold was not detected. We provide recommendations regarding study design and inference for ecological and policy thresholds, and we encourage researchers to be cautious about the manner in which threshold responses are assessed and described.

The ecological literature (May 1973, Francesco Ficetola and Denoël 2009) has long been intrigued with the concept of thresholds or points of abrupt change in ecological conditions (Huggett 2005). The threshold concept has been applied in many areas of ecology, but especially in the study of community ecology (Francesco Ficetola and Denoël 2009) and landscape fragmentation (Olden 2007) in the context of temporal changes. Currently, the study of disturbance ecology (e.g. energy development) provides the opportunity to apply the concept of thresholds to spatial disturbance gradients. The presence of a threshold distance, at which a response parameter (e.g. nest survival, species richness or abundance) is no longer affected by a disturbance, is of ecological interest, but is also critical for planning future locations of anthropogenic disturbances (e.g. energy developments, US Fish and Wildlife Service 2012). However, there is a surprising gap in the ecological literature (Andersen et al. 2009) on analytical techniques that can robustly estimate a defendable distance to be used for such planning. For example, how far does a pronounced effect on abundance reach from an energy facility? Or, how close can an energy facility be constructed to critical nesting habitat without causing a disturbance?
Energy development is a global phenomenon with the potential to significantly affect populations of wildlife (Hebblewhite 2008, Smith and Dwyer 2016). Here, we explore the dynamics of disturbance thresholds using a case study of grouse in the Great Plains of North America. Grasslands in this region have recently become targets for rapid development of the wind energy industry because of high potential wind speeds (Fargione et al. 2012). Oil and natural gas extraction also occurs in grasslands, and grassland birds are the most rapidly declining avian group in North America (Vickery and Herkert 2001). All four species of grouse that occur in the grasslands of North America, greater sage-grouse Centrocercus urophasianus, greater prairie-chickens Tympanuchus cupido pinnatus, lesser prairie-chickens T. pallidicinctus, and sharp-tailed grouse T. phasianellus, have been studied to characterize the protective buffer zones to be established around critical habitat in relation to energy development (Connelly et al. 2000, Pruett et al. 2009, Williamson 2009, Winder et al. 2014a). In many cases, best-guess and well-intentioned suggestions for space needed to protect a species of concern (Connelly et al., Pitman et al. 2005, Hagen et al. 2011) have become entrenched in policy guidelines (US Fish and Wildlife Service 2012). Such studies may indeed provide evidence for a general effect of anthropogenic features on movement or survival. However, they were not designed or analyzed in such a fashion to provide robust determinations of distance-based threshold responses that would guide siting policy for the location of energy development or other spatial disturbances.
For example, Pitman et al. (2005) provided an analysis of nest sites of lesser prairie-chickens with regard to anthropogenic features. The study compared the proximity of nests and random points to houses, power lines, and similar features by assessing the mean distance of the closest 10% of nests and random points. The analysis allowed the assessment of general avoidance but did not allow the assessment of a threshold response. So, Pitman et al. (2005) provided two comments in an apparent attempt to provide guidance for buffer distances for development: 1) "We seldom found lesser prairie-chicken nests within 400 m of transmission lines or improved roads, even though sand-sagebrush prairie near these features appeared similar to the surrounding area," and 2) "We concede that the impact of a house may not equal that of the power plant; however, we did not have multiple units of each for analysis. A nonstatistical review of the nest location data suggests that the impact of houses extended to a radius of 0.5 km, whereas that of compressor stations and the power plant extended to over 1 km". There is no doubt that Pitman et al. (2005) provided evidence for avoidance of anthropogenic features, but at what distance? The US Fish and Wildlife Service (2012) provided the following summary of Pitman et al. (2005): "Pitman et al. (2005) found that transmission lines reduced nesting of lesser prairie chicken by 90 percent out to a distance of 0.25 miles [∼400 m], improved roads at a distance of 0.25 miles [∼400 m], a house at 0.3 miles [∼0.5 km], and a power plant at 0.6 miles [∼1 km]". In fact, none of the threshold values cited by US Fish and Wildlife Service (2012) were based on Pitman et al.'s (2005) assessment of reduced nesting probability, and Pitman et al. (2005) never referred to a 90% reduction in nest selection; that reference was apparently a non sequitur derived from the sample design: 90% of the nests and random points were not used in Pitman et al.'s (2005) analysis. Hagen et al. (2011) used the same type of analysis to describe local impacts of disturbance, but the anecdotal comments that described nest location became accepted as a threshold that is now policy lore.
How should we plan studies to assess threshold responses? The traditional study design to evaluate effects of a disturbance has been referred to as before-after-control-impact (BACI; Morrison et al. 2008; Fig. 1), and is useful when a disturbance occurs throughout a landscape, such as forest harvest or prescribed burning (Powell et al. 2000). Control sites that are not impacted by a treatment provide spatial controls, and the before-after design provides temporal controls. However, this design has no potential to provide information on thresholds at which a disturbance effect ameliorates. Further, disturbances such as roads and energy development are linear or point-source in nature and not suitable for the application of a BACI design. The study design that should be used to evaluate the effects of distance from a disturbance is an impact gradient design (IGD; US Fish and Wildlife Service 2012). If the IGD is used before and after the disturbance is created (making it more robust), it is a before-after-gradient (BAG) design (Ellis and Schneider 1997; Fig. 1). The temporal implementation of many disturbances may be unforeseen, and economic situations often result in changes in timing for energy development, which may make a before-after study impractical (US Fish and Wildlife Service 2012). Greater prairie-chickens have recently been studied using the BAG design (McNew et al. 2014, Winder et al. 2014a) and the IGD design (Harrison 2015, Whalen 2015). The evaluation of thresholds based on point disturbances, by definition, involves the identification of a discontinuity or change in trend (Muradian 2001) in a response variable over distance from the disturbance. Studies at wind energy facilities have considered a linear effect of distance-to-turbine in the context of a gradient-type study design, although no evidence for an effect of wind turbines was found in any of the studies.
While a linear response ($g = b_0 + b_1 \times \text{distance}$) of distance from disturbance is a potential hypothesis to consider and could be evaluated relative to other nonlinear models (Harrison 2015), the linear model does not provide the potential to develop a threshold (Francesco Ficetola and Denoël 2009) that could be used by planners to create spatial policy (Ellis and Schneider 1997). Thus, other nonlinear models should be considered to describe underlying processes.
Figure 1. Comparison of before-after-control-impact (BACI), before-after-gradient (BAG) and impact gradient design (IGD) experimental designs in the context of wind energy development; empty ovals indicate the planned site for wind turbines before the treatment occurs. Impact in BACI design is throughout a given area; BAG design considers impact as point-source at the beginning of a gradient (0 to x km). IGD design is used when the temporal control (before) is not possible.
Linear models that describe changes in a probability (e.g. nest survival, lek persistence) along a gradient may appear to take on nonlinear shapes when the probability at the disturbance or away from the disturbance closely approaches values of 1.0 (Fig. 2). The apparent nonlinearity of a linear model arises because the model is fitted with a logit link function (Francesco Ficetola and Denoël 2009, Powell and Gale 2015); although the underlying model is linear at the logit scale, the probability cannot exceed 1.0, so the value asymptotes at the extremes with the back-transformation from the logit scale. Such results provide evidence of lower survival near a disturbance, but the underlying structure of the model does not carry the hypothesis that a threshold exists.
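The back-transformation described above can be illustrated with a minimal sketch. The coefficients `B0` and `B1` are hypothetical values chosen only to push survival close to 1.0 far from the disturbance; they are not estimates from any study discussed here.

```python
import math


def inverse_logit(eta: float) -> float:
    """Back-transform a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def linear_logit_survival(distance_km: float, b0: float, b1: float) -> float:
    """Survival from a model that is linear on the logit scale."""
    return inverse_logit(b0 + b1 * distance_km)


# Hypothetical coefficients (assumption, for illustration only).
B0, B1 = 0.5, 0.4

for d in (0, 5, 10, 15, 20, 25):
    # The linear predictor rises steadily, but the probability flattens near 1.0.
    print(f"{d:>2} km  logit = {B0 + B1 * d:5.2f}  S = {linear_logit_survival(d, B0, B1):.4f}")
```

The flattening at large distances is produced by the link function alone, so a plateau in such a curve should not be read as an estimated threshold.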
Some biologists have attempted to show critical thresholds through the use of discrete (near/far) comparisons of demographic parameters for a sample of animals based on their proximity to energy development. Holloran et al. (2010) used a discrete comparison of annual survival of female sage-grouse in the vicinity of an energy development, and the discrete categories (break point: 950 m) were decided a posteriori based on the distribution (related to expected values) of nest locations along a gradient. In contrast, Lyon and Anderson (2003) used a study design in which they radiomarked female sage-grouse at leks that fell into two a priori distance categories (break point: 3 km); no biological justification for the categories was given, which suggests a hypothesis that disturbance extends from energy development to a 3-km distance. The 3-km distance was subsequently used in policy statements (US Fish and Wildlife Service 2012).
Nests in the ramped threshold scenario were assigned a daily survival of 0.98 ($S_w = 0.98^7 = 0.8681$) for distances beyond 5000 m; nests within 5000 m were assigned a weekly survival probability of $S_w = [0.94 + (0.000008x)]^7$ (Fig. 3J).
We used a four-week nesting period to approximate the length of the nesting period of any of the four species of grouse in the Great Plains of North America. Each week, a random number ($0.0 \le y \le 1.0$) was drawn for each nest. If $y \le S_w$, the nest was successful for that week, whereas the nest failed if $y > S_w$. We created a known-fate capture history (live/dead: LDLDLDLD; White and Burnham 1999) for each nest based on its at-risk status and success/failure status during each time period during the four-week nesting period. Thus, we constructed eight simulated sets of capture histories: one set for each of the four response patterns at contrasting sample size scenarios (n = 500 and n = 50 nests).
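The weekly draw and the resulting capture histories can be sketched in Python (the original simulation was written in SAS/IML, so this is an illustration, not the authors' code). The sketch assumes uniform placement of nests along the gradient, uses a stair step pattern as the example, and encodes each week as an LD pair ("10" survived, "11" failed, "00" no longer at risk), which is our reading of the known-fate format.

```python
import random

WEEKS = 4  # four-week nesting period


def stair_step_weekly_survival(distance_m: float) -> float:
    """Example pattern: daily survival 0.93 near the facility, 0.98 beyond 5000 m.

    Which side includes exactly 5000 m is an assumption.
    """
    daily = 0.98 if distance_m >= 5000 else 0.93
    return daily ** 7


def simulate_history(s_w: float, rng: random.Random, weeks: int = WEEKS) -> str:
    """Build an LDLDLDLD known-fate history for one nest."""
    pairs = []
    at_risk = True
    for _week in range(weeks):
        if not at_risk:
            pairs.append("00")      # nest already failed
        elif rng.random() <= s_w:
            pairs.append("10")      # survived the week
        else:
            pairs.append("11")      # failed this week
            at_risk = False
    return "".join(pairs)


def simulate_nests(n, weekly_survival, seed=1, gradient_m=25_000):
    """Return a list of (distance_m, history) tuples for n simulated nests."""
    rng = random.Random(seed)
    nests = []
    for _nest in range(n):
        distance = rng.uniform(0, gradient_m)  # assumption: uniform placement
        nests.append((distance, simulate_history(weekly_survival(distance), rng)))
    return nests


for distance, history in simulate_nests(5, stair_step_weekly_survival):
    print(f"{distance:8.0f} m  {history}")
```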

Analysis
We proposed five models to represent hypotheses that could be posed during similar analyses of grouse nest survival near wind energy facilities. We used a null model to represent no effect of the disturbance on grouse nest survival and we created a linear model with a distance (x) effect ($\text{logit}(S_w) = b_0 + b_1 x$). We then created three models to detect potential thresholds: a discrete distance effect model based on two categories of distance (near/far; $\text{logit}(S_w) = b_0 + b_1 z$, where $z = 1$ for nests beyond the break point ('far') and $z = 0$ for 'near' nests), an interaction effect of linear distance (x) and distance category (z, as before), and a cubic polynomial effect of distance. Models were proposed to align statistical pattern with biological process. We hypothesized that the discrete model would best describe a stair step threshold, and the cubic or interaction model would best describe the ramped threshold (Table 1).
The use of discrete and interaction models required that we propose a break point to classify nests as 'near' to or 'far' from the wind energy facility. We compared two methods to accomplish this task: 1) visual assessment of raw data, and 2) a priori model comparisons. Both methods were attempts to mimic a real-life situation faced by a biologist with empirical samples, so that our inferences would be applicable to real situations. First, we used a simple, visual assessment of summaries of the raw proportions of nests surviving in each 1-km distance interval to attempt to discern patterns that might suggest a break point. LAP performed the simulations and provided a summary (sensu Fig. 3) of the raw data in blind fashion to MBB who provided a best approximation of the location of a threshold (Fig. 3). We then used these break points to create a near/far covariate for each nest (1 = far, 0 = near) in our data sets. Second, we used break points of 1, 2, 3, …, 8 km to construct competing models. The top-ranked model was moved forward to represent the threshold distance for the discrete or interaction models in the final analysis (Table 2, 3; Buckland et al. 2001).
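The second method, comparing candidate break points for the discrete model, can be sketched as follows. This is not the Program MARK analysis: it assumes that weekly survival within each near/far group is a simple binomial proportion (so the maximum likelihood estimate is available in closed form), that the effective sample size for AICc is the number of nest-weeks at risk, and that exactly 5000 m counts as 'far'. All function names are hypothetical.

```python
import math
import random


def binom_loglik(successes: int, trials: int) -> float:
    """Binomial log-likelihood evaluated at its MLE (successes / trials)."""
    if trials == 0 or successes in (0, trials):
        return 0.0
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


def aicc(loglik: float, k: int, n: int) -> float:
    """AIC with the small-sample correction."""
    return -2 * loglik + 2 * k + (2 * k * (k + 1)) / (n - k - 1)


def discrete_model_aicc(nests, break_m: float) -> float:
    """nests: iterable of (distance_m, weeks_survived, weeks_at_risk)."""
    near, far = [0, 0], [0, 0]
    for distance, survived, at_risk in nests:
        group = far if distance >= break_m else near  # near/far covariate
        group[0] += survived
        group[1] += at_risk
    loglik = binom_loglik(*near) + binom_loglik(*far)
    return aicc(loglik, k=2, n=near[1] + far[1])


def best_break_point(nests, candidates_km=range(1, 9)):
    """Score break points of 1..8 km and return (best_km, scores)."""
    scores = {km: discrete_model_aicc(nests, km * 1000) for km in candidates_km}
    return min(scores, key=scores.get), scores


def simulate_stair_step(n, seed, gradient_m=25_000, weeks=4):
    """Simulated stair step data with a true break point at 5000 m."""
    rng = random.Random(seed)
    nests = []
    for _nest in range(n):
        distance = rng.uniform(0, gradient_m)
        s_w = (0.98 if distance >= 5000 else 0.93) ** 7
        survived = at_risk = 0
        for _week in range(weeks):
            at_risk += 1
            if rng.random() <= s_w:
                survived += 1
            else:
                break
        nests.append((distance, survived, at_risk))
    return nests


best_km, scores = best_break_point(simulate_stair_step(500, seed=42))
print("top-ranked break point (km):", best_km)
```

Because every candidate has the same number of parameters and the same data, ranking by AICc here is equivalent to ranking by likelihood; the AICc scale matters only when the chosen discrete model is later compared with the null, linear, cubic, and interaction models.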
Each of the eight scenarios (four response patterns × two sample sizes) was analyzed using a known fate analysis with two covariates (linear distance to turbine and near/far distance category) in Program MARK (White and Burnham 1999).
For the 3-km break point of Lyon and Anderson (2003), no evidence was provided to rule out the alternative hypothesis that a smaller (e.g. 1, 1.5, 2 or 2.5 km) or slightly larger disturbance effect (e.g. 3.5 or 4 km) was responsible for the variation observed in the movement patterns.
To address the gap in our ability to properly assess these distance thresholds, we developed a framework of patterns to describe biological processes relevant to our case study, nesting grouse, along a gradient from a disturbance. Our objectives were to 1) determine if an appropriate nonlinear model would be selected to describe the threshold inherent in respective sets of simulated data and 2) investigate the potential for spatial patterns to be detected with small sample sizes. We used our results to provide recommendations for future studies to enhance our ability to predict the distance of spatial threshold responses.

Response frameworks and data simulation
We developed four patterns to describe the potential effect of a disturbance on the nest survival of grouse along a gradient from a disturbance. Two patterns were simple and without a threshold: a null response (no effect of distance; Fig. 3A) and a linear response along the gradient (Fig. 3D). Two other patterns incorporated more complex types of thresholds: a discontinuous, stair step response (Fig. 3G), and a ramped threshold (Fig. 3J). In an ecological context, one might expect some types of pollutant disturbance to show a linear response in the ecosystem as the chemical dissipates in air or dilutes in water (Wear and Tanner 2007). A ramped threshold might mimic the reduction in anthropogenic sound along the ground as the acoustic energy dissipates upwards and outwards (Blickley et al. 2012, Whalen 2015). A discontinuous response might be expected in the context of a human village and associated patterns of use of nearby land (Dembélé et al. 2006), or a visual disturbance that stimulates avoidance behavior (Pruett et al. 2009).
We simulated (SAS/IML; SAS ver. 9.22) a sample of grouse nests (n = 500 or n = 50) along a 25 000-m (25-km) disturbance gradient from a hypothetical wind energy facility. Nests were randomly assigned a distance from the facility, and the distance was used to model the nest's daily and weekly probability of nest survival under one of the four scenarios. Under the null response, all nests had a daily survival probability of 0.98 (Matthews et al. 2013), and a weekly survival probability of $S_w = 0.98^7 = 0.8681$ (Fig. 3A). Under the linear response, daily survival ranged from 0.94 ($S_w = 0.6484$; ∼5% decrease in daily survival, ∼25% decrease in weekly survival probability) at the origin to 0.98 at 25 000 m, and the weekly survival ($S_w$) was assigned to nests using the distance (x) from the wind energy facility as $S_w = [0.94 + (0.0000016x)]^7$ (Fig. 3D). Nests in the stair step response were assigned a daily survival of 0.98 for distances beyond 5000 m ($S_w = 0.98^7 = 0.8681$), and 0.93 for distances within 5000 m ($S_w = 0.93^7 = 0.6017$; Fig. 3G). We chose 5000 m as the threshold to mimic a local effect similar to effects anticipated for grouse studies in the context of energy development (US Fish and Wildlife Service 2012).
Estimates of weekly survival under the constant (null) scenarios did not differ from the underlying probability of $S_w = 0.8681$ used to simulate nest survival, although the precision was lower for small sample sizes (n = 50 nests: $\hat{S}_w = 0.8795$, SE = 0.0253, 95% confidence interval: 0.8206-0.9209; n = 500 nests: $\hat{S}_w = 0.8537$, SE = 0.0089, 95% confidence interval: 0.8355-0.8703). Similarly, weekly survival probability in the discrete model was similar to the stair step process used to simulate the data for n = 500 nests (beyond 5000 m: $S_w = 0.8681$, $\hat{S}_w = 0.9028$; within 5000 m: $S_w = 0.6017$, $\hat{S}_w = 0.6250$) and n = 50 nests (beyond 5000 m: $S_w = 0.8681$, $\hat{S}_w = 0.8714$; within 5000 m: $S_w = 0.6017$, $\hat{S}_w = 0.5424$).
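The four response patterns used to generate survival can be written as one small function, sketched here in Python rather than the SAS/IML used for the simulations. The daily survival values and slopes are those stated in the text; the function and pattern names are ours, and the assignment of exactly 5000 m to the 'far' side is an assumption.

```python
def daily_survival(pattern: str, x_m: float) -> float:
    """Daily nest survival at distance x_m (metres) from the facility."""
    if pattern == "null":
        return 0.98
    if pattern == "linear":
        return 0.94 + 0.0000016 * x_m            # 0.94 at 0 m, 0.98 at 25 000 m
    if pattern == "stair_step":
        return 0.98 if x_m >= 5000 else 0.93     # abrupt change at 5000 m
    if pattern == "ramped":
        return 0.98 if x_m >= 5000 else 0.94 + 0.000008 * x_m  # ramp, then plateau
    raise ValueError(f"unknown pattern: {pattern}")


def weekly_survival(pattern: str, x_m: float) -> float:
    """Weekly survival S_w is daily survival raised to the seventh power."""
    return daily_survival(pattern, x_m) ** 7


for pattern in ("null", "linear", "stair_step", "ramped"):
    values = ", ".join(f"{weekly_survival(pattern, x):.4f}" for x in (0, 2500, 5000, 25_000))
    print(f"{pattern:>10}: {values}")
```

The ramped pattern is continuous at 5000 m (0.94 + 0.000008 × 5000 = 0.98), whereas the stair step pattern jumps from 0.93 to 0.98, which is the structural difference the discrete and interaction models are meant to distinguish.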
We used Akaike's information criterion (AICc) corrected for small sample size (Burnham and Anderson 2002) to determine which of the five models best described the variation in nest survival. We assessed model support using model ranks (ΔAICc) and weights (wAICc).
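For readers less familiar with these quantities, the following minimal sketch computes AICc, ΔAICc, and Akaike weights from a set of candidate models. The AICc values in the example are hypothetical and do not come from our tables.

```python
import math


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AIC corrected for small sample size; k parameters, n effective samples."""
    return -2 * log_likelihood + 2 * k + (2 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values: dict) -> dict:
    """Map each model name to (delta_AICc, Akaike weight)."""
    best = min(aicc_values.values())
    deltas = {name: value - best for name, value in aicc_values.items()}
    relative = {name: math.exp(-0.5 * delta) for name, delta in deltas.items()}
    total = sum(relative.values())
    return {name: (deltas[name], relative[name] / total) for name in aicc_values}


# Hypothetical AICc values for the five candidate models (illustration only).
example = {"null": 412.3, "linear": 409.8, "discrete": 405.1,
           "interaction": 407.9, "cubic": 408.6}

for name, (delta, weight) in sorted(akaike_weights(example).items(), key=lambda kv: kv[1][0]):
    print(f"{name:>11}  dAICc = {delta:5.2f}  w = {weight:.3f}")
```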

Results
Weekly survival probability, as estimated by the constant (null) model for each of the constant survival scenarios, did not differ from the underlying probability used to simulate nest survival.
Each of the underlying response patterns was detected with an appropriate model in a model selection framework (wAICc = 0.61–0.75) when the sample size of nests was high (n = 500 nests), and thresholds were identified when present (Table 1, 3). At lower sample sizes (n = 50 nests) that may be typical of short-term empirical sampling schemes, the stair step threshold was detected (wAICc = 0.64), but the more complex, ramped threshold was not detected when using visual assessment to propose break points (Table 1, Fig. 4). Instead, the null model was selected (wAICc = 0.53); the erroneous interpretation (a type II error) of the assessment would be that weekly probability of nest survival was not affected by the disturbance. There were no type I errors when using visual assessment to propose break points; thresholds were never inferred when they did not exist (null or linear models, Table 1). In analyses resulting from a priori model selection to choose break points, the proper response pattern was detected when sample sizes of nests were high. At low sample sizes, the discrete model was selected to represent variation in data simulated under a linear model, which had no discrete threshold (type I error). The discrete model was again selected for the ramped threshold scenario (Table 3) in preference to the proper, but more complex, cubic or interaction models.
Figure 3. Proportion of simulated nests that survived during four-week simulations within each 1-km distance interval for two sample size scenarios, n = 50 nests (B, E, H and K) or 500 nests (C, F, I and L). These data summaries were used in blind fashion by an observer to visually determine where a break point, or threshold (dotted, vertical lines), might exist. For (B), (E) and (F), the observer suggested the existence of a threshold even though a threshold did not exist in reality; for (C), the observer correctly claimed no threshold existed. Once thresholds were determined, the distance was used to create discrete and interaction models for analysis purposes.
When the sample size was large (n = 500 nests), the raw proportions of nests that survived in each distance category gave a good approximation of the underlying pattern in survival used to generate the data (Fig. 3). However, at n = 50 nests, the random distribution of nests resulted in some distance categories without nests, and the pattern in raw survival of nests did not closely match the underlying patterns. The stair step and ramped simulation models were structured with break points at 5 km, and our blind observer selected likely break points (n = 50 nests: 6 km; n = 500 nests: 4 km) that were close to truth for the stair step model. The a priori model selection process selected the proper break points (5 km) in both sample size scenarios (Table 2). Our blind observer had more problems with assessments of break points in data from the ramped simulations (n = 50 nests: 9 km break point; n = 500 nests: 2 km; Fig. 3, 4). The a priori model selection process also put forward models with improper break points for the ramped simulations (n = 50 nests: 2 km; n = 500 nests: 6 km).
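The occurrence of empty distance categories at n = 50 follows from simple probability, as the sketch below shows. It assumes nests are placed independently and uniformly along the 25-km gradient, which is our reading of "randomly assigned"; the function names are hypothetical.

```python
def prob_bin_empty(n_nests: int, n_bins: int = 25) -> float:
    """Probability that a given 1-km interval receives no nests."""
    return (1 - 1 / n_bins) ** n_nests


def expected_empty_bins(n_nests: int, n_bins: int = 25) -> float:
    """Expected number of 1-km intervals with no nests."""
    return n_bins * prob_bin_empty(n_nests, n_bins)


for n in (50, 500):
    print(f"n = {n:>3}: mean nests per km = {n / 25:.0f}, "
          f"P(interval empty) = {prob_bin_empty(n):.3g}, "
          f"expected empty intervals = {expected_empty_bins(n):.3g}")
```

Under these assumptions a sample of 50 nests averages only two nests per 1-km interval, so a few empty intervals are expected in any single simulation, whereas empty intervals are effectively impossible with 500 nests.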
Table 1. Competing weekly nest survival models of simulated grouse nests along a gradient from a wind energy facility under two sample size scenarios (n = 50 or 500 nests) and four underlying patterns of survival. The top two models are shown for the eight analyses with threshold values estimated by visual inspection by MBB. Models are ranked by Akaike's information criterion adjusted for small sample size (AICc). ΔAICc is the difference of each model's AICc value from that of the highest ranked model, and wAICc is the Akaike weight.
Discrete, interaction and cubic models can describe trends in demographic parameters along gradients away from a disturbance. These models describe threshold responses, and the models are simple in structure and easy to implement. The discrete and interaction models require the declaration of a break point, and this determination needs to be based on biological reasoning if a threshold is to be ecologically meaningful.

Spatial disturbance response models
The models we used in our analyses were generally supported when appropriate, and the process in the simulated data was described. We encourage biologists to consider discrete, interaction and cubic models.
Table 3. Competing weekly nest survival models of simulated grouse nests along a gradient from a wind energy facility under two sample size scenarios (n = 50 or 500 nests) and four underlying patterns of survival. The top two models are shown for the eight analyses with threshold distances derived through a priori model selection for discontinuous functions (Table 2). Models are ranked by Akaike's information criterion adjusted for small sample size (AICc). ΔAICc is the difference of each model's AICc value from that of the highest ranked model, and wAICc is the Akaike weight.
a Models under consideration included constant (null), linear, discrete (near/far), interaction, cubic. Discrete and interaction models used were the result of a priori model comparisons to determine break points.
b Interaction function included interaction effect of linear distance and a discrete, near/far distance category.
Figure 4. Comparisons of predicted estimates of weekly survival from five alternative models (constant, linear, discrete, cubic polynomial, and interaction) with the ramped pattern (A and B) of survival used to simulate weekly nest survival data under two sample size scenarios (n = 50 or 500) with visual assessment used to produce break points in discontinuous models. Nests were randomly placed along a gradient of distances from a disturbance. Predicted survival from the top-ranked models are shown on top (A and B) with the ramped, simulated data pattern. Predicted survival from models with no support are shown on the bottom (C and D). See Table 1 for details on model comparisons.
Erickson et al. (2007) provided a slight, yet critical, misinterpretation of Leddy et al.'s (1999) study design, stating: "Densities of grassland birds measured at transects in reference fields and at transects at least 180 m from turbines [emphasis ours] were four times greater than in portions of study plots located near turbines". In fact, Leddy et al. (1999) had no transects beyond 180 m from turbines, and the threshold insinuated by Erickson et al. (2007) through the use of the phrase "at least" was never claimed by Leddy et al. (1999). If the objective of a study is to produce recommendations that include the potential to show a response threshold, an IGD should be used with samples taken at the appropriate scale along the gradient to allow the assessment of a point of change.
Our simulation model uses a 25-km gradient, which was similar in scale to the gradient used by others to study effects of wind turbines on greater prairie-chickens (Harrison 2015, Whalen 2015, Winder et al. 2015). The use of a long gradient, relative to the anticipated effect distance of the disturbance, assures a spatial control region that would be assumed to have no effect. However, as we demonstrated with our simulation scenario at n = 50 nests, a long gradient can also spread a small sample thinly and possibly obscure the underlying effect. Thus, longer gradients require attention to sample size at an appropriate scale throughout the gradient to provide data to inform the shape of the response curve. Alternatively, the use of a shorter gradient increases the potential for a simple linear model to describe the local effect of the disturbance, but the gradient may not reach into areas not affected by the disturbance. Harrison (2015) and Winder et al. (2015) used a secondary focal analysis of a smaller portion of the gradient for a more detailed analysis, which may be beneficial to further define the shape of the effect.
Last, we propose that a priori predictions, or hypotheses, should guide the length of the gradient used in the study. Manville (2004) supposed an 8-km disturbance effect for prairie grouse, so a gradient study should be at least twice that long to test for the exact distance of a break point. Our assessment of sample size confirms the need to include sufficient samples along the gradient and especially in the proximity of the hypothesized break point to provide statistical resolution. Grouse leks and nests cannot be experimentally manipulated in space, and these sites are not always spaced perfectly along gradients (Whalen 2015, Winder et al. 2015), which may affect the potential to describe ecological thresholds. Our assessment, for simplicity of mission, was based on the impact gradient design in the context of exploration of the effects of a hypothetical, existing wind energy facility. When logistics allow, we encourage the use of the before-after-gradient (BAG) design. Most critically, we encourage the use of the entire data set in an analysis similar to those we provided; entrepreneurial focus on a subset of the data (Pitman et al. 2005, Hagen et al. 2011) would seem to sacrifice the advantage of the study design and statistical power to find a threshold response that is the objective of the study.

From ecology to policy
We acknowledge the difference between a statistical threshold (defined as the point at which evidence for an ecological threshold is strong enough to reject a null hypothesis), an ecological threshold, and a policy threshold.
We used a visual assessment to inform the selection of a break point by viewing our summaries of our data a posteriori, similar to Zuur et al.'s (2010) encouragement to visually inspect data in initial stages of analysis. We found our visual assessment to provide comparable results to the a priori model selection process (Buckland et al. 2001); visual assessment avoided type I errors (Table 1), but the a priori model selection generally found the proper break points when they existed (Table 2, 3). Holloran et al. (2010) used a different approach that seemed well-justified to select the break point of a discrete model for annual survival of female sage-grouse by assessing the spatial distribution of nests to search for a signal that might inform the 'reach' of the disturbance. Whalen (2015) used background sound levels at greater prairie-chicken leks to determine which leks had potential to be influenced by noise from wind turbines to inform the development of a discrete model to describe effects of wind turbines on male breeding vocalizations. We encourage similar rigorous thought as discrete models are developed for future studies.
Our analyses provide evidence that various types of nonlinear models have potential to help develop thresholds for siting guidelines. However, we did not consider one class of nonlinear models, generalized additive models (GAMs), in our assessment. McNew et al. (2014) used a GAM approach, although distance to nearest turbine was not found to affect nest habitat selection of greater prairie-chickens. Winder et al. (2015) also used a GAM analysis to assess the numbers of males on leks proximal to a wind energy facility. We acknowledge that GAMs provide flexible, fine-scale descriptions of nonlinear responses (Post van der Burg et al. 2010), so we do not discourage their use. In fact, if proper algorithms are used to guide the smoothing parameters used, the GAM may produce a nicely fitted response that reveals break points (although the exact placement of the threshold must be estimated visually; Francesco Ficetola and Denoël 2009) without a priori considerations of response shape. But, our experience suggests that AIC values for discrete models (used in a GLM framework) cannot be compared with AIC values taken from GAM frameworks (Harrison 2015); thus complete discontinuities cannot be assessed with GAMs. Further, the models we used in our analysis have the advantage that they can be constructed in many software platforms that do not provide functionality in GAMs (e.g. program MARK). The simpler models we propose should fit most biological processes that would cause a threshold response, while avoiding the risks of overfitting a GAM (Winder et al. 2015).

Study design considerations
Certainly, study design is critical to provide data that can be assessed for threshold responses. As policy makers and biologists review literature, it is critical they utilize information that is available from a published study and refrain from using the data beyond its original intent. For example, Leddy et al. (1999) placed survey transects at three distances away from turbines to assess a potential effect of wind energy facilities on abundance of grassland birds. The study was not designed to establish a specific threshold, and the authors made no such claims. However, Erickson et al. (2007) misinterpreted this study design.
A statistical threshold is the point at which evidence for an ecological threshold is of the strength to reject a null hypothesis, an ecological threshold is the presence of a change in state of a system, and a policy threshold is a distance-based policy established to protect a species of concern. As an example, Manville (2004) provided justification for an 8-km buffer (a policy threshold) around critical habitat at a time when no direct information was available to confirm the presence of ecological thresholds. Manville (2004) reviewed a large body of literature with strong evidence for local effects of energy development on grouse species of concern, but not a single study cited by Manville (2004) had published a pattern from a gradient-based study design that could be statistically or visually assessed to determine a point of inflection in the response of grouse to energy disturbance. In this context, perhaps the most insightful phrase to support the policy statement found in Manville (2004) is a reference to a statement made by C. Aldridge (Colorado State University, USA): He indicated that it was in "everybody's best interest to err on the safe side". Currently, two of the four grouse species in the Great Plains, USA (greater sage-grouse and lesser prairie-chicken) are the focus of evaluations to determine if their federal conservation status should be raised to threatened or endangered.
Of course, such a status carries with it enforceable ramifications to the economic activities of private landowners, and biologists can be asked to defend their policy suggestions in legal proceedings. Thus, we suggest it is best if policy thresholds are based on defendable statistical evidence of ecological thresholds (Schultz 2008).
We note, however, that Field et al. (2004) provided guidance for situations in which the cost of failing to detect a real effect on a species of concern (a type II error, as exemplified by our small-sample assessment of a ramped threshold scenario, Table 1) is high. In this scenario, Field et al. (2004) suggest that the statistical "burden of proof" should be lowered to avoid damage to the ecosystem. Hence, the call to "err on the safe side" may be a defendable policy position, although we emphasize that such a position is not a justification to avoid defendable analyses and robustly designed studies of disturbance gradients.

Conclusions
Policy makers need information on ecological thresholds with regard to energy development in grasslands to protect species of grouse that are of conservation concern. Our review of trends in the literature suggests that initial studies provided evidence of local effects but were not designed to test hypotheses regarding ecological thresholds. Our results are applicable to other disturbance scenarios beyond the case of wind energy development, grouse, and nest survival that we present for context. Future studies should strongly consider use of before-after-gradient or impact gradient study designs to evaluate effects in the spatial context of the disturbance. Our analyses show that simple discrete, interaction, and cubic polynomial models can be useful to detect nonlinear patterns in demographic rates along a gradient. We encourage researchers to be cautious about the manner in which threshold responses are assessed and described.