Monitoring animal populations can be difficult. Limited resources often force monitoring programs to rely on unadjusted or smoothed counts as an index of abundance. Smoothing counts is commonly done using a moving-average estimator to dampen sampling variation. These indices are commonly used to inform management decisions, although their reliability is often unknown. We outline a process to evaluate the biological plausibility of annual changes in population counts and indices from a typical monitoring scenario and compare results with a hierarchical Bayesian time series (HBTS) model. We evaluated spring and fall counts, fall indices, and model-based predictions for the Rocky Mountain population (RMP) of Sandhill Cranes (*Antigone canadensis*) by integrating juvenile recruitment, harvest, and survival into a stochastic stage-based population model. We used simulation to evaluate population indices from the HBTS model and the commonly used 3-yr moving average estimator. We found counts of the RMP to exhibit biologically unrealistic annual change, while the fall population index was largely biologically realistic. HBTS model predictions suggested that the RMP changed little over 31 yr of monitoring, but the pattern depended on assumptions about the observational process. The HBTS model fall population predictions were biologically plausible if observed crane harvest mortality was compensatory up to natural mortality, as empirical evidence suggests. Simulations indicated that the predicted mean of the HBTS model was generally a more reliable estimate of the true population than population indices derived using a moving 3-yr average estimator. Practitioners could gain considerable advantages from modeling population counts using a hierarchical Bayesian autoregressive approach. Advantages would include: (1) obtaining measures of uncertainty; (2) incorporating direct knowledge of the observational and population processes; (3) accommodating missing years of data; and (4) forecasting population size.

## INTRODUCTION

A central topic in applied ecology is an understanding of population abundance to inform the conservation and management of species (Mills 2013, Nichols 2014). Foundational laws for the conservation of species (e.g., Australia: Environment Protection and Biodiversity Conservation Act 1999; European Union: Habitats and Birds Directive of 1992 and 2009; United States: Endangered Species Act 1973), as well as criteria used by international conservation agencies (e.g., International Union for Conservation of Nature), often reference the population status of species, leading to the common use of abundance and distribution as performance measures in recovery plans and delisting rules (Mace et al. 2008, Neel et al. 2012). Monitoring programs often focus on population status or population dynamics and the estimated effect of management decisions (Nichols 2014).

Knowledge of population size is especially important for evaluating decisions about anthropogenic take, either intentional (e.g., sport harvest, control of overabundant species) or incidental (e.g., fishery bycatch, wind turbine strikes). However, population monitoring is fraught with challenges (McComb et al. 2010). Gaining reliable estimates of population size requires accounting for 2 aspects of sampling: the spatial extent of a population (either through exhaustive coverage or through an appropriate sampling design, e.g., probabilistic design; Thompson 2002), and imperfect detection and/or availability (Pollock et al. 2002, Nichols and Williams 2006, Gorresen et al. 2016, Rivera-Milán et al. 2016). However, most surveys result in only an index of population size, which is typically biased low due to a lack of appropriate spatial coverage or adjustment for detection and/or availability. In addition, many of these surveys involve unadjusted counts that lack a means of assessing precision.

Due to the logistical and financial challenges of surveying most species, it is perhaps rare for long-term monitoring programs to obtain estimates of true population size (Pollock et al. 2002, Johnson 2008). Instead, monitoring programs often rely on counts or some function of those counts as an index of the population. Typically, indices are standardized periodic counts (e.g., roadside bird point counts), which often cannot be converted to an estimate of population size (e.g., because the survey area is not representative of the full range of the population), or are derived from surveys that attempt to count or account for the entire population, but are suspected to be incomplete. Potential influencing factors include: availability bias (see Martin et al. 2015), wherein a proportion of the target population could be unavailable for counting (e.g., due to variable migration timing); visibility bias (see Smith 1995), in which a subset of individuals present during a survey are not detected by observers; and counting bias (see Benning et al. 1997), wherein an observer miscounts or approximates the number of individuals in large aggregations. The first 2 sources of bias are inherently negative, whereas counting bias can be negative or positive.

A primary consideration when using an index as a surrogate for abundance is its relationship with true abundance; an index will be most useful when there is a constant and proportional (i.e. linear) relationship (Johnson 2008). If the relationship is nearly constant, the index could be suitable to investigate population dynamics. However, this assumption may often be inappropriate due to availability, detectability, or countability, or their variance, across space or time (Link and Sauer 1997, Anderson 2001, Pollock et al. 2002). Survey protocol standardization minimizes sampling variability, which is always a good idea, but is unlikely to eliminate the variation beyond the control of the researcher; thus, sampling variation will still be of concern (Thompson 2002).

Avian studies overwhelming rely on indices to make inferences about populations (Rosenstock et al. 2002). For harvested migratory bird populations in the United States, there is often an operational taxon-focused monitoring program that directly informs annual harvest decisions using population indices, such as for American Woodcock (*Scolopax minor*; Cooper and Rau 2014), Mourning Dove (*Zenaida macroura*; Seamans and Sanders 2014), and Band-tailed Pigeon (*Patagioenas fasciata*; Sanders 2014). Some monitoring programs attempt to estimate abundance, but recognize that, because the spatial extent of sampling is not truly exhaustive or because observations are not perfect, counts will reflect an index to the population; a few examples include monitoring of Tundra Swans (*Cygnus columbianus*; Pacific Flyway Council 2001), ducks (U.S. Fish and Wildlife Service 2014b), and Sandhill Cranes (*Antigone canadensis*; Kruse et al. 2014).

A common population index smooths sequential counts via a moving 3-yr average (MTYA) estimator:

where*C*is the raw count (with no basis for estimating sampling variance) in the most current year. This estimator is often applied to monitoring migratory birds (e.g., several species of geese: U.S. Fish and Wildlife Service 2014b; Tundra Swans: Pacific Flyway Council 2001; Wood Storks [

_{T}*Mycteria americana*]: U.S. Fish and Wildlife Service 2014a; and Sandhill Cranes: Kruse et al. 2014), but is also more generally used in wildlife monitoring when relating the status of a population to an objective, such as with sea otter (

*Enhydra lutris*) recovery (U.S. Fish and Wildlife Service 2003), Utah prairie dog (

*Cynomys parvidens*) recovery (Utah Division of Wildlife Resources 2015), and furbearer harvest management (Lovallo and Hardisky 2010). Despite the frequent use of the MTYA population index across taxa, to our knowledge there has yet to be a critical evaluation of its utility or a comparison with alternative model-based approaches.

Migratory populations of Sandhill Cranes are challenging to monitor and are annually harvested. The management strategy for the Rocky Mountain population (RMP) of Greater Sandhill Cranes (*A. c. tabida*) includes an annual population survey, which has been ongoing since 1984. The survey represents an attempt to find all cranes and count them; it includes an aerial portion over a set collection of known crane staging areas, combined with ground counts in other parts of the migration corridor to account for cranes that have migrated early. Counts are not corrected for visibility or counting biases. The survey produces a population count, in which the MTYA population index (referred to below as the ‘population index') is used to inform the allocation of allowable harvest in each year (Subcommittee on Rocky Mountain Greater Sandhill Cranes 2007). As with many monitoring programs, the reliability of these counts and resultant index is unknown. Since no additional data are collected to allow for adjustments for availability, visibility, or countability, there is no basis to adjust for biases or to estimate sampling variability from the data collected in a given year. We used the 31 yr of counts of the RMP as an example to outline a thorough process to critically evaluate such population counts and indices. We explored the general reliability of using the MTYA estimator for deriving a population index and compared this estimator with a formal statistical modeling approach that explicitly partitioned the observational and population processes; we did so empirically and via simulation.

Our specific objectives were to evaluate to what extent annual changes in the spring migratory (1984–1996) and fall staging area (1997–2014) RMP Sandhill Crane counts (referred to below as ‘counts') and fall staging area population index were biologically realistic. We did so by integrating the counts with available data on RMP juvenile recruitment, harvest, and survival into stochastic stage-based population models. We used these models in a Bayesian framework to estimate values of annual sampling variation due to latent observational processes involving availability, visibility, or countability. Finally, we used a hierarchical Bayesian time series (HBTS) modeling approach to estimate RMP population abundance under different types of observational processes and prior information. We considered the biological plausibility of the predicted fall population from the HBTS model and compared the HBTS model with the MTYA estimator using simulations. Our goal was to attain a better understanding of the realized (i.e. biological and sampling) components of population counts that directly inform important management decisions, such as harvest levels and population recovery. We hope that our methodology will encourage more monitoring programs to critically evaluate their population indices.

## METHODS

### Rocky Mountain Population

The RMP of Greater Sandhill Cranes breeds in palustrine and riparian wetlands throughout the central Rocky Mountains, including in Colorado, Wyoming, Utah, Montana, and Idaho, USA (Drewien and Bizeau 1974, Gerber et al. 2014). During spring and fall migration, cranes stop over in the San Luis Valley in central-southern Colorado and winter throughout New Mexico and Arizona, USA, and Mexico, but are primarily located in the middle Rio Grande Valley of New Mexico. Legal hunting of the RMP began in 1981. To help inform managers about harvest decisions, a population survey was started in 1984, in which an aerial spring count (via transects) was conducted in the San Luis Valley. The count was adjusted for counting bias (via photographic correction) and the proportion of Lesser Sandhill Cranes (*A. c. canadensis*) of the Mid-Continent population, which also stops over in the San Luis Valley (Benning et al. 1997, Kruse et al. 2014). However, because of sampling biases due to imperfect availability, the presence of an increasing number of Mid-Continent cranes over the years, and questions about the reliability of assigning cranes to subspecies and thus population via track measurements, the spring survey was discontinued in 1996. Its replacement was a premigratory fall staging area survey, which became fully operational in 1997 and has continued since (Kruse et al. 2014).

The fall survey is coordinated across federal and state agencies and includes aerial and ground counts at 81 known staging sites throughout the breeding area states. There is no additional information collected to allow adjustment of the fall counts due to sampling variation. However, it is recognized that a ‘poor' count can occur (Subcommittee on Rocky Mountain Greater Sandhill Cranes 2007), for example, due to survey conditions that impede conducting the survey; in these cases, the annual count may not be subsequently used to determine harvest allocation. The second cause of a ‘poor' count is sampling variation caused by annual changes in the availability of cranes to be counted, detection probability, and counting biases, such as overcounting a flock or counting the same flock multiple times because of movement. These variations could ultimately lead to under- or over-counting the population. The MTYA estimator of the most recent and reliable counts (i.e. not ‘poor' due to survey conditions) is used to reduce this variation and derive the population index, which is then used for allocating allowable RMP crane harvest in the following year (Subcommittee on Rocky Mountain Greater Sandhill Cranes 2007). The current population objective of the RMP is to maintain a population index between 17,000 and 21,000 (Subcommittee on Rocky Mountain Greater Sandhill Cranes 2007).

### Modeling Overview

We present our modeling in 4 main sections. The first section describes a process to evaluate the biological plausibility of the annual changes between RMP counts and the population index. We do this by comparing observed counts and the index with predictions of counts or the index using a population dynamics model. We include predictions from 2 extremes, one in which no cranes die between years and one in which an unrealistically high proportion of the population dies. These extreme predictions can be considered boundaries of what is plausible, but not necessarily realistic. We also include realistic predictions using empirical survival estimates; when observed counts correspond to realistic predictions, we can have some confidence that the annual change between sequential counts or the index represents real population dynamics. The second modeling section assumes that all demographic processes are known exactly, allowing us to model RMP counts and estimate annual sampling variation. Assuming that we identify an appropriate population model, estimates represent the observational process that gave rise to our data. The third section outlines an entirely different population modeling framework; we use a hierarchical Bayesian time series model (HBTS) to fit the RMP count data, to simultaneously account for both observational and population process variation. We consider models in which the observed RMP counts are assumed to vary symmetrically around the true population mean and in which counts are always less than the true population (e.g., due to counting, visibility, and availability biases being strictly negative). We evaluate these models by comparing them with projections from the realistic population model of section one. The fourth section outlines a simulation study to evaluate the conditions (e.g., population decline or increase) under which the HBTS model estimates or the MTYA estimator better captures the true population mean and overall trend of the population trajectory.

### Unrealistic Annual Variation in Population Counts and the Fall Index?

We evaluated whether the annual change in the RMP counts and fall population index were biologically plausible by developing stochastic stage-based population models using the spring and fall time series of counts, separately. Our aim was to develop models that were constructed using available data and did not require assumptions about age classes for which we did not have empirical estimates (see Gerber and Kendall 2016). We used the observed count data in year *t* − 1 (*C _{t}*

_{−1}) to predict 3 empirical distributions of the count in year

*t*(“low,” “realistic,” and “high”). Each model incorporated data on annual juvenile recruitment (<1 yr old), juvenile and adult survival, and harvest. Juvenile recruitment (i.e. the proportion of juveniles in the population) has been estimated since 1972 from an annual survey in the fall at the San Luis Valley (Gerber et al. 2015); the survey covers all areas of the San Luis Valley that cranes are known to use, and entails substantial survey effort, with >4,000 annual observations of cranes made using a systematic design. Harvest is estimated each year from hunter surveys (Kruse et al. 2014). We considered different survival parameters for each scenario: (1) unrealistically low survival, with harvest mortality being completely additive (low); (2) empirical survival from a 23-yr mark–resight study and harvest that was compensatory up to natural mortality (R. C. Drewien personal communication; realistic); and (3) an upper maximum survival wherein no death occurred (high).

We investigated the biological plausibility of annual change in the counts by estimating the absolute difference between the realistic expected count and the observed *C _{t}* and whether

*C*fell within the realistic empirical distribution or between the low and high distributions. When an observed count was either less than the low empirical distribution or greater than the high empirical distribution, we suggest that the change in the count was unlikely to represent true population change. If the observed count fell between these distributions it was plausible, but not realistic unless it also fell within the realistic empirical distribution. Realistic changes indicate a constant proportional relationship to abundance, whereas unrealistic changes suggest temporal variability in the observational process. We constructed models in R (R Core Team 2015) and simulated predicted counts from each model 50,000 times for each sequential comparison to obtain empirical distributions (low, realistic, high).

_{t}### Spring Population Model

For each sequential comparison, we observed a spring (*SP*), adult (*A*) population count in year *t* − 1 (*C _{SP,A,t}*

_{−1}) that we considered had been observed perfectly. These adults survived with some probability () to the fall (

*F*,

*N*

_{F,A,t}_{−1}). During the fall, we observed the proportion of juveniles in the population [

*P*

_{t}_{−1}=

*N*

_{F,J,t}_{−1}/ (

*N*

_{F,J,t}_{−1}+

*N*

_{F,A,t}_{−1})] that we used to derive the juvenile population (

*N*

_{F,J,t}_{−1}). The spring population in year

*t*was a combination of adults and juveniles (

*J*) in the fall (

*N*

_{F,A,t}_{−1}and

*N*

_{F,J,t}_{−1}, respectively) that survived to spring with some probability ( and , respectively, and a portion of which was removed by harvest [

*f*(

*H*

_{t}_{−1})], where

*f*() is a function to indicate additive or compensatory mortality, depending on the scenario). We considered that juveniles became adults after the fall because they are no longer distinguishable in the following year. Lastly, we assumed no counting error to then predict the spring population count in year

*t*(

*C*). The only parameters that varied stochastically were survival probabilities, which we defined as beta distributed and time invariant (, ∼ Beta(α

_{SP,A,t}_{1},β

_{1}), ∼ Beta(α

_{2},β

_{2})). The complete spring stochastic population model was structured as follows (Supplemental Material Figure S1):

*P*;

_{t}For the high scenario, we assumed that all survival parameters were exactly 1 and no harvest occurred. For the realistic scenario, we used an annual adult survival probability of 0.956 with a process variance of 0.025 and a 6-mo juvenile survival probability of 0.848 with a process variance of 0.073. We rescaled the adult survival to two 6-mo periods and estimated the new variance using the delta method. We then used moment matching to obtain appropriate parameters for the beta distribution (e.g., ∼ Beta(α_{2},β_{2}))). Harvest was considered to be compensatory up to natural mortality, for which there is empirical evidence for most age classes (R. C. Drewien personal communication). For the low scenario, we used a mean adult and juvenile survival for spring and fall of 0.90 and 0.70, respectively, and harvest was additive to natural mortality; the process variances from the realistic scenario were used.

### Fall Population Model

The fall count included both the adults and juveniles (*C _{F,AJ,t}*

_{−1}) on the premigratory staging grounds. To make an age-structured model by incorporating juvenile recruitment, we had to assume that the recruitment (juvenile) and fall count surveys occurred simultaneously. In actuality, there was, on average, a 1-mo difference in the timing of these surveys. This may have caused bias if the differential survival between age classes from the fall staging grounds to the San Luis Valley was largely different than for other months. We again assumed that the population was counted perfectly. Here, we derived the adult and juvenile population in the fall using

*P*

_{t}_{−1}. These fall adult and juvenile populations survived to the next year with some probability ( and , respectively), with some being harvested. To add the new juveniles of year

*t*, we used the recruitment survey again (

*P*) and, lastly, assumed perfect detection to predict the observed count in the year

_{t}*t*(

*C*). The complete fall stochastic population model was structured as follows (Supplemental Material Figure S2):

_{F,AJ,t}Because we added juveniles into the predicted fall count based on *N _{F,A,t}*, rather than

*N*, this will have slightly underestimated the population of juveniles in the fall. We could have used

_{F,AJ,t}*C*as a substitute for

_{F,AJ,t}*N*, but this could have increased bias with uncertain direction. To investigate the biological reliability of annual changes in the fall population index (i.e. MTYA), we substituted counts with the fall population index.

_{F,AJ,t}### Deriving Annual Observational Variation

To estimate the sampling variation in annual counts, we used the spring- and fall-based population models, assuming the same survival as the realistic scenario and uncertainty in the observation process (θ* _{t}*) such that population size did not equal the counts. For the spring model,

*N*

_{SP}_{,}

_{A}_{,}

_{t}_{–1}∼ Poisson and

*C*

_{SP}_{,}

_{A}_{,}

*∼ Poisson(*

_{t}*N*

_{SP}_{,}

_{A}_{,}

*× θ*

_{t}*), while for the fall model,*

_{t}*N*

_{F}_{,}

_{AJ}_{,}

_{t}_{–1}∼ Poisson and

*C*

_{F}_{,}

_{AJ}_{,}

*∼ Poisson(*

_{t}*N*

_{F}_{,}

_{AJ}_{,}

*× θ*

_{t}*). The only unknown parameters related to the annual observation process, for which we used relatively informative prior distributions of θ*

_{t}*∼ Unifom(0.5, 1.5). The joint posterior distribution of the population model was defined as (square brackets indicate a probability distribution):*

_{t}However, the model was unidentifiable without more information as to the certainty of the data or stronger prior information. To remedy this, we fully defined the observation process for the first count of each time series, assuming a distribution with a standard deviation of 0.1 and a mean of either 0.8, 1.0, or 1.2. We fit the model using Markov chain Monte Carlo (MCMC) by sampling from full-conditional distributions using R (R Core Team 2015); 100,000 MCMC samples were used with a burn-in of 10,000 samples. Posterior convergence was assessed visually.

### Hierarchical Bayesian Time Series Population Model

Fitting a population model without direct empirical information on components of the observational process (i.e. availability, visibility, and counting bias) is challenging. We considered a hierarchical population model that included first-order autoregressive Gompertz population growth, which separated population process and observational variability into distinct components, but pooled components of the observational process. Hierarchical models that partition processes explicitly into separate state-spaces are growing in their development and application in applied ecology (e.g., Dennis et al. 2006, Newman et al. 2006). Our model parameters included an intrinsic population growth rate (β_{0}), a first-order autoregressive term that can be interpreted as a measure of density dependence on population growth (β_{1}), process variance (), and observational uncertainty (). This type of observational process allows for symmetrical under- and over-counting around the true population (“symmetrical”). Negative density dependence would be indicated if β_{1} < −1; thus, the previous year's abundance would negatively affect current abundance. We fit this model to spring and fall counts, separately. Parameters defined using subscripted periods (e.g., β_{.}) indicate the same distributional structure for all such parameters:

We also considered a similar model by including relatively strong prior information on the observational and population process variances (fitting spring counts: σ* _{C}* ∼ Uniform(0.0, 0.4), σ

*∼ Uniform(0.00, 0.05); fitting fall counts: σ*

_{N}*∼ Uniform(0.0, 0.2), σ*

_{C}*∼ Uniform(0.00, 0.05)). Prior distributions for the population process were fairly small, to indicate limited variability in crane dynamics, and prior observational processes were fairly large, to suggest the likelihood of high variation (Supplemental Material Figure S3); the uniform distribution was preferred for informative priors because we could place constant probability support over a range of likely values.*

_{N}Given that availability and visibility biases are negative, and counting bias tends to be negative, it is also reasonable to assume that counts occurred only at or below the true population. We considered this type of model by defining the observational process such that counts were below the lower bound of the true population process (“under”; via a truncated normal distribution). We fit this model to the fall counts with relatively uninformative prior distributions (as shown above), as well as using informative prior distributions on the variance parameters (σ* _{C}* ∼ Uniform(0.0, 0.2) and σ

*∼ Uniform(0.00, 0.05)).*

_{N}We evaluated the predicted population from the HBTS model using the fall population model. We initialized the fall population model using the posterior distribution of the predicted counts from the first year and projected the model with the survival parameters from the realistic scenario. We considered harvest within the fall population model to be either completely additive to natural mortality or compensatory up to natural mortality.

We fit all models using the R package rjags (Plummer 2013), which interfaces with software JAGS (Just Another Gibbs Sampler; http://mcmc-jags.sourceforge.net/), in which MCMC is used to simulate samples from the full conditional distributions of unknown parameters of the statistical model. We initialized each model and obtained 1,000,000 MCMC samples, discarding the first 100,000 and thinning to every 10^{th} iteration. Each model was checked as to its adequacy for fitting the data using a posterior predictive check by calculating a Bayesian *P*-value (Gelman et al. 2004); values that are not extremely low (near 0) or large (near 1) suggest that the model is able to give rise to new observations that resemble the original data. Residual temporal dependence was also checked by examining partial and full autocorrelations of model residuals (Box et al. 1994).

### Evaluating the HBTS model vs. the MTYA estimator

We investigated the reliability of estimating population size using the HBTS model compared with the MTYA estimator. We simulated a discrete-time exponential growth model that incorporated the estimated fall population process variance (via posterior distribution) and either considered the observational process to be symmetrical, thus using the estimated fall observational process variance (via posterior distribution) or undercounted, where we assumed a random uniform process between a low undercount of 0.70 and a high undercount of 0.95. We initialized each population at 19,000 and projected the population for 20 yr (similar time span as the current fall counts) for 1,000 projections under an average annual population change of 5.0% decline, 2.5% decline, no change (stable population), 2.5% growth, 5.0% growth, and 5.0% growth for 10 yr followed by 10 yr of 5.0% decline. For each simulation, we calculated the MTYA and fit the HBTS model where the observational process was correctly and incorrectly assumed (observations were undercounts and we used the model with a symmetrical observational process, or observations were symmetrical and we used the model with an undercounted observational process). We compared the mean population predictions of the HBTS model with the MTYA estimator by deriving empirical distributions of annual bias and correlation.

## RESULTS

### Rocky Mountain Population Surveys

The spring population count (SD(*C*) = 3,200) was more variable than the fall population count (SD(*C*) = 1,939; Figure 1). The fall population counts and index generally stayed within the population objective of 17,000–21,000. Fall counts in 2007, 2008, and 2010 exceeded the upper population objective, which pushed the index out of the objective in 2008–2009, primarily due to the largest fall count of 22,822 in 2007. The largest annual difference between sequential years in the full time series (spring and fall counts) occurred between 1985 and 1986, with a drop in the counts of 7,227. The second-largest difference in the full time series and the largest in the fall count occurred between 2012 and 2013, with an increase of 4,077. For the spring survey, 92% of counts indicated at least ±10% change, while for the fall counts only 44% demonstrated that much change. The fall population index had no changes greater than ±10%, with 77% of changes less than ±5%.

### Annual Spring and Fall Observational Variation

The annual changes in the spring counts were extreme in most years (Figure 1, Table 1, Supplemental Material Figure S4). The absolute difference between the expected count from the realistic scenario and the observed count ranged from 839 to 8,014. Out of 12 comparisons, no observed counts were within the bounds of the realistic expected distribution and only 3 were between the most extreme low and high quantiles of the low and high empirical distributions, respectively.

## TABLE 1.

Evaluation of the biological plausibility of the spring counts of the Sandhill Crane Rocky Mountain population. We considered different survival parameters for 3 scenarios to calculate expected counts: (1) unrealistically low survival, with harvest mortality being completely additive (low); (2) empirical survival from a 23-yr mark–resight study and harvest that was compensatory up to natural mortality (realistic); and (3) an upper maximum survival wherein no death occurred (high). We investigated the biological plausibility of annual change in the counts by estimating the absolute difference between the realistic expected count and the observed count and whether the observed count fell within the realistic empirical distribution or between the low and high distributions. No expected counts were generated for the first year of the survey because there was no prior data available with which to make predictions. Survey conditions were assessed as Poor or Good depending upon whether surveys were impeded by logistical constraints, such as those caused by poor weather.

We found annual changes in the fall counts to be less extreme than those in the spring counts, but largely still not biologically realistic (Table 2, Supplemental Material Figure S5). The absolute difference between the expected count from the realistic expected distribution and the observed count ranged from 98 to 3,807. Out of 14 comparisons, 3 were within the bounds of the realistic expected distribution, and 7 were between the most extreme low and high quantiles of the low and high empirical distributions, respectively. We did find the annual changes in the fall population index to be generally biologically reasonable (Table 3, Supplemental Material Figure S6). The absolute difference between the expected population index from the realistic scenario and the observed index ranged from 20 to 1,655. Out of 16 comparisons, 12 fell within the bounds of the realistic expected distribution and all were between the most extreme low and high quantiles of the low and high empirical distributions, respectively.

## TABLE 2.

Evaluation of the biological plausibility of the premigratory fall count of the Sandhill Crane Rocky Mountain population. See Table 1 for explanations of low, realistic, and high expected counts.

## TABLE 3.

Evaluation of the biological plausibility of the premigratory fall 3-yr moving average population index of the Rocky Mountain population of Sandhill Cranes. See Table 1 for explanations of low, realistic, and high expected counts used to generate the corresponding indices.

### Estimates of Annual Observation Variation

The assumption of the bias associated with the initial count (i.e. underdetecting, no mean detection error, and overdetecting) had a significant influence on the posterior distribution of the subsequent detection for both the spring and fall models. If we assumed that the 1984 spring count was unbiased, the subsequent posterior counting variation mostly indicated that the counts overestimated population size in most years (Figure 2). Also, if we assumed that the 1997 fall count was unbiased, the subsequent posterior counting variation included mostly underestimates of population size, with 3 large underdetections occurring in 2001, 2011, and 2012 (Figure 2).

### Hierarchical Bayesian Time Series Population Model

We found no evidence that the HBTS models did not fit our data (0.4 < all Bayesian *P*-values < 0.8) or indicated additional temporal dependency (Supplemental Material Figure S7). When we considered the symmetrical observation process with relatively uninformative priors, we found no evidence of mean population change over the duration of the spring or fall surveys, although there was considerable uncertainty (Figure 3). We also found no evidence of negative density dependence (Spring: β_{1} mode = −1.16, 95% highest point density interval (HPDI) = −1.96 to −0.11; Fall: β_{1} mode = −0.73, 95% HPDI = −1.56 to −0.10). The observational variance was greater in the spring counts than in the fall counts, with 80% of the probability density of the spring observational process variance greater than that of the fall observational process variance (Spring: mode = 0.16, 95% HPDI = 0.00–0.25; Fall: mode = 0.09, 95% HPDI = 0.00–0.12). We did not find a difference in the process variance, with 57% of the probability density of the spring process variance greater than that of the fall process variance (Spring: mode = 0.001, 95% HPDI = 0.000–0.240; Fall: mode = 0.001, 95% HPDI = 0.000–0.140). There was minor evidence for the spring observational variance being greater than the process variance (66% of the probability density of the observational variance was greater than that of the process variance), and no evidence that this was true for the fall (49% of the probability density of the observational variance was greater than that of the process variance).

Spring and fall population indices were inconsistent with the HBTS predicted population mean, with the indices being much more variable (Figure 3). There was a high probability that the RMP had stayed within the population objective since the start of the fall population survey (assuming a symmetrical observation process), with the probability density of the annual fall population prediction between 17,000 and 21,000 ranging from 87% to 100%. There was less certainty for the population over the duration of the spring survey, with probability densities between 17,000 and 21,000 ranging from 52% to 78%. Fitting the population model with informed prior knowledge of the variance parameters produced moderate dampening of the predicted population mean and associated uncertainty (Supplemental Material Figure S8). When we assumed that the population could only be underdetected (the “under” scenario), we found that the population mean exceeded the RMP objective to a small degree, but with a large amount of uncertainty that exceeded the population objective (Figure 4).

The predicted population from the HBTS model using the fall counts was biologically reasonable using the fall population model, but only when harvest mortality was compensatory up to natural mortality (Figure 5). Results were more consistent with the fall population predictions when observations were considered to be symmetrical around the population. Harvest that was completely additive to mortality caused a declining population, which was not the case for the predicted population using the fall counts.

### Evaluating the HBTS Model vs. the MTYA Estimator

In our simulation comparison, using the symmetrical observation process we found that the range of expected bias for the predicted mean population across scenarios was −37 to 471 from the HBTS model and −2,140 to 599 for the MTYA estimator (Supplemental Material Figure S9); the HBTS predicted mean was less biased on average and had a higher expected correlation with the true population for all scenarios of population growth, decline, and mixed combinations. The expected bias of the MTYA was positive when the true population was increasing and negative when the population was decreasing. When the true population was stable, there was a small expected bias for the HBTS model and the MTYA estimator, but the variation was slightly less extreme for the HBTS model. When we assumed that counts were underdetected, the expected bias from the HBTS model was considerably less than that from the MTYA estimator for all scenarios (range = −554 to 816 and −6,164 to −1,050, respectively), while the correlation was similar between the 2 methods (Supplemental Material Figure S10). When the counts were only underdetecting the population (the “under” observational process), but the observational process of the HBTS model was symmetrical, we found that both the HBTS model and the MTYA estimator were strongly biased low (range = −4,557 to 1,788 and −5,494 to −1,036, respectively), with the HBTS model having a higher correlation with the true population (Supplemental Material Figure S11). The expected bias was slightly lower for the MTYA estimator when the population was decreasing and slightly higher when the population was increasing. When the counts were symmetrical but the observational process of the HBTS model underdetected the population, we found the MTYA estimator to have less expected bias (range = 1,889 to 3,199 and −1,242 to 717, respectively) while the correlation was mostly similar but varied by population change (Supplemental Material Figure S12).

## DISCUSSION

Monitoring animal populations is challenging for a number of reasons. Spatial and temporal variability in animal movement, coupled with complex life histories, produce logistical challenges that are not easily overcome with limited budgets; these issues often impede the adoption of appropriate statistical designs for making inferences about true population size. Even with considerable effort devoted to monitoring a population, including attempts to adjust for availability, visibility, or countability, at least one of these sources of variability will likely remain latent and thus results will still be an index of the population (e.g., availability for Mallards [*Anas platyrhynchos*]; U.S. Fish and Wildlife Service 2014b, Johnson et al. 2015). Monitoring programs that use unadjusted counts of animals or counts smoothed over recent years as a population index may find a more formal time series modeling approach much to their benefit.

In comparing the MTYA estimator with the HBTS model, we found that the HBTS model demonstrated overall better performance, as long as the correct observational process was used. Understanding this process is clearly important, as RMP predictions under symmetrical and underdetection observational processes were noticeably different, which has implications for the annual probability of meeting the management objective. Many monitoring programs are unlikely to equally overcount as undercount in each year, making the use of the MTYA estimator illogical. If monitoring were to constantly undercount the population, we would expect the MTYA to be lower than the current count during increasing counts and higher than the current count during decreasing counts. For threatened species, overestimation may be much worse than cautious underestimation. Underestimation of a harvested species could be of concern, depending on the relative risks of losing harvest potential and maintaining a population objective. Practitioners should seek to identify the most influential factors affecting their observational process (see Strobel and Butler 2014).

A significant advantage of the HBTS model is that it allows knowledge of population dynamics and how the counts vary to directly inform the model, and thus the model helps to specify more realistic population predictions. While we provided a limited exploration of using prior knowledge to inform both the observational and population processes, more rigorous deliberation about prior specifications among crane researchers and managers could lead to significant improvements when applied, which could, in turn, reduce the large uncertainties in population predictions seen in our results. Another benefit of the HBTS model is that it provides measures of uncertainty. In contrast, there is often no measure of precision associated with the MTYA when applied (U.S. Fish and Wildlife Service 2003, Utah Division of Wildlife Resources 2015), thus ignoring an important source of uncertainty when making management decisions. Recognizing the many sources of uncertainty in ecological modeling and decision-making is paramount for accurately conveying the state of knowledge of a system and providing robust predictions useful for making management decisions (Regan et al. 2002). Uncertainty can be directly tied to management decision-making through the definition of a management objective; for example, through recasting the population objective for a species recovery plan as meeting a lower confidence bound of a population estimate. This type of objective would faithfully apply the cautionary principle, by not changing management practices until there was a high degree of certainty that the population had recovered to the desired level.

Additional strengths of the HBTS model are that it makes use of information about the entire time series to make inference about parameters, it can easily handle missing years of data, and it can be used to forecast the population. Being able to forecast the population is especially useful, depending on the timing of data availability relative to decision making (e.g., U.S. Fish and Wildlife Service 2013.). We should recognize, however, that time-series modeling approaches (e.g., MTYA, HBTS) are simplistic, nonmechanistic approaches that do not contain real population parameters (e.g., age-specific survival and reproduction). As such, they are less desirable than monitoring vital rate dynamics, but are usually much more financially and logistically feasible and can still produce biologically reasonable population predictions, as we have demonstrated.

There are many alternative approaches to estimating population trends (Humbert et al. 2009, Hosack et al. 2012). The usefulness of each will to some degree depend on what is known about sampling variability and how this is translated into modeling the observational process. One approach that has recently shown promise is using the flexible normal inverse Gaussian distribution to describe observations, which, combined with prior knowledge, could help to specify a useful population model (Hosack et al. 2012); preliminary investigations fitting the RMP crane counts did not produce largely different population predictions from those of the HBTS model (data not shown). Another approach is to exploit spatial replication of some surveys within years, such that detection probability can be estimated via an *N*-mixture model that is then linked to a population model (e.g., Hostetler and Chandler 2015); preliminary investigations fitting the RMP crane counts using this approach indicated high sensitivity to specification of priors and necessitated unrealistic assumptions about the detection process across space and time in order to produce realistic predictions (data not shown).

When monitoring is directly linked to an explicit objective, such as in a structured decision or adaptive resource management framework, whether an index is an appropriate measure will depend on the objective(s). Here, we have implicitly taken the view that an accurate and precise estimate of population size is best when monitoring a population. For investigating population dynamics, this will almost universally be true. But for decision-making, one does not necessarily require perfect information in order to achieve a good decision. Simply, the decision needs to be robust to uncertainties, one of which may be the state of the population. A test could be to question whether a decision made using an index would be the same, similar, or completely different if the population state were known with certainty (Kendall and Moore 2012).

### Sandhill Crane Populations

The objective for the RMP is to maintain a MTYA fall premigratory staging area index between 17,000 and 21,000 (Subcommittee on Rocky Mountain Greater Sandhill Cranes 2007). This objective is intended to allow recreational opportunities and population growth while minimizing major agricultural crop depredation; the balance between these factors is thought to be achieved when the population is within 10% of 19,000 cranes. As such, the implicit population objective is to maintain 19,000 cranes ± 10%. However, because of monitoring uncertainties, the population is understood to be imperfectly observed and the fall counts are thought to be “minimum” estimates of population size (Subcommittee on Rocky Mountain Greater Sandhill Cranes 2007). The reason for this is not explicit in the management plan and, while our results demonstrate biologically unfeasible changes in spring and fall counts, the main sources of variability are unknown.

There are many reasons why the RMP counts might not reflect realistic population change. First, the counts rely on surveys of traditional staging areas, the use of which may vary annually due to migration phenology from breeding area to staging area and staging area to migration stopover in the San Luis Valley. Also, distributional shifts in staging areas are likely to occur because of shifting agricultural practices, increasing development pressure near staging areas, and timing of hunting seasons (Drewien et al. 1996). The use of coordinated ground counts along the flyway attempts to mitigate the potential mismatch in timing of the staging area counts and migration phenology, but birds that don't use the nontraditional staging grounds are still likely to be missed. Fall counts may always be an underestimate of population size if some proportion of the population does not use the fall staging areas and thus is never available to be counted. Moreover, if this counting availability changes annually due to environmental conditions, it might cause significant variation that could obscure small to moderate changes in population dynamics; it could also reduce the reliability of the trend of the population index. Lastly, counting large flocks accurately can be difficult, and many small flocks can be easily obscured by topography or vegetation. Little empirical evidence is currently available on the movements of RMP cranes leaving the breeding and staging areas; this knowledge could improve our understanding of the above-mentioned sources of sampling variation. One study has been undertaken that followed RMP cranes fitted with radio-transmitters, and this research did identify variation among individuals and years in the timing of movements from the breeding to the staging areas (Drewien et al. 1999). A telemetry study that is currently underway (Collins et al. 2016) could help our understanding of movement patterns in association with the fall survey, similar to what has been done for the Mid-Continent Population of Sandhill Cranes (Pearse et al. 2015).

We found that the spring and fall counts varied beyond what was biologically realistic, such that their use in exploring spatial or temporal dynamics should not be considered without some form of adjustment for annual sampling variation. Interestingly, we found that annual variation in the fall population index was largely biologically realistic, as well as predictions of the HBTS model when harvest was compensatory, suggesting that simple time series approaches may be useful for characterizing population dynamics. As expected, we found that the variance of the spring observational process exceeded that of the fall, suggesting that transitioning from the spring to the fall survey did produce some improvement in population estimates. Despite uncertainties about the relationship between the fall population index and true abundance, there is good reason to believe that the current RMP objective is being met.

Ultimately, the costs of estimating true abundance may exceed the decrease in the risks of not meeting the population objective for the RMP. Whether these costs are worthwhile depends on the risks of not knowing true abundance or its relationship with the population index. Understanding the risks depends largely on the magnitude and variability of the sources of variation in the counts and thus what the population index actually represents, as well as the tolerance of managers to the probability of failing to meet the population objective in a given year. Future research is needed to clarify the risks of not meeting annual population objectives using the current decision framework and possibly alternative frameworks, such as adaptive resource management.

## ACKNOWLEDGMENTS

We are grateful to M. Hooten, Y. Wei, P. Doherty, and 2 anonymous reviewers for providing highly beneficial editorial comments on a previous version of this manuscript.

**Funding statement:** Funding was provided by the U.S. Fish and Wildlife Service Mountain-Prairie Region Migratory Bird Program and Webless Migratory Game Bird Research Program. None of the funders had any input into the content of the manuscript, nor required their approval of the manuscript before submission or publication. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

**Ethics statement:** There was no contact with animals for this study.

**Author contributions:** B.D.G. performed the experiments (collected data, conducted the research), developed or designed the methods, and analyzed the data; and B.D.G. and W.L.K. conceived the idea, design, and experiment, wrote the paper, and contributed substantial materials, resources, or funding.

## LITERATURE CITED

*Antigone canadensis*). InThe Birds of North America ( P. G. Rodewald, Editor). Cornell Lab of Ornithology, Ithaca, NY, USA. 10.2173/bna.31 Google Scholar

*Enhydra lutris nereis*). U.S. Fish and Wildlife Service, Portland, OR, USA. Google Scholar