Many models in population ecology, including spatial capture–recapture (SCR) models, assume that individuals are distributed and detected independently of one another. In reality, this is rarely the case – both antagonistic and gregarious relationships lead to non-independent spatial configurations, with territorial exclusion at one end of the spectrum and group-living at the other. Previous simulation studies suggest that grouping has limited impact on the outcome of SCR analyses. However, group associations entail not only spatial clustering of activity centers but also coordinated space use by group members, potentially impacting both ecological and observation processes underlying SCR analysis. We simulated SCR scenarios with different strengths of aggregation (clustering of individuals into groups with shared activity centers) and cohesion (synchronization of detection patterns of members of a group). We then fit SCR models to the simulated data sets and evaluated the effect of aggregation and cohesion on parameter estimates. Low to moderate aggregation and cohesion did not impact the bias and precision of estimates of density and the scale parameter of the detection function. However, non-independence between individuals led to high levels of overdispersion. Overdispersion strongly decreased the coverage of confidence intervals around parameter estimates, thereby increasing the probability of erroneous predictions. Our results indicate that SCR models are robust to moderate levels of aggregation and cohesion. Nonetheless, spatial dependence between individuals can lead to false inference. We recommend that practitioners 1) test for the presence of overdispersion in SCR data caused by aggregation and cohesion, and, if necessary, 2) correct their variance estimates using the overdispersion factor ĉ. Approaches for doing both are described in this paper.
We also urge the development of SCR models that incorporate spatial associations between individuals not only to account for overdispersion but also to obtain quantitative information about social aspects of study populations.
The majority of models in population ecology assume that individuals are independently distributed and detected. Although this assumption is rarely mentioned, it applies to commonly used models like Poisson GLMs, capture–recapture (CR) models, distance sampling methods and other hierarchical approaches for estimating population level parameters such as abundance and density (Knape et al. 2011).
Spatially explicit capture–recapture (SCR) techniques were introduced in 2004 (Efford 2004) and their use has since expanded rapidly (Borchers and Efford 2008, Royle et al. 2013). SCR models enable investigators to utilize the information contained in the spatial configuration of individual detections and non-detections to generate spatially explicit estimates of density and abundance (Efford and Fewster 2013, Royle et al. 2013). Importantly, SCR provides an approach to scale up ecological processes at the individual level to populations and landscapes (Chandler and Clark 2014, Royle et al. 2018).
Like conventional CR, SCR models assume independence among individuals (Borchers and Efford 2008, Royle et al. 2013). In practice, this assumption is often violated by interactions in the spatial configuration of individuals, such as territoriality, group-living or periodic aggregations (e.g. feeding and breeding). Non-independence of individuals, such as clustering into groups, is being recognized as a potential source of bias and inflated precision in capture–recapture analysis and other hierarchical methods (Anderson et al. 1994, Pradel et al. 2005, Royle 2008, Reich and Gardner 2014, Muneza et al. 2017, Hickey and Sollmann 2018). Because the spatial attribution of individuals and of sampling are explicit in SCR, violations of independence could be of particular concern for these models, where the ecological process may be impacted by the non-independence in the distribution of individuals and the observation process by the non-independence of individual detections.
Despite these concerns, SCR models are being applied to species with varying degrees of sociality, including group-living ones, such as some cetaceans (Marques et al. 2012), canids (López-Bao et al. 2018), ungulates (Muneza et al. 2017) and primates (Granjon et al. 2017). Even species not considered group-living can temporarily form close associations between two or more individuals, e.g. during calving, breeding or when raising offspring (Bonenfant et al. 2004, Bischof et al. 2017).
Here we use simulations to study the consequences of ignoring the spatial configuration of individuals into groups in SCR models. Recently, López-Bao et al. (2018) used simulations to explore the effect of spatial aggregation of individuals in SCR models and concluded that it led to negligible bias in abundance estimates, at least when all of the available habitat was sampled. However, in this and another investigation involving SCR (Russell et al. 2012), grouping was limited to a spatial clustering of activity centers without consideration for dependent space use at the home range scale and thus correlated detection patterns of group members. Our study expands upon this work by addressing non-independence in both the spatial distribution of home ranges and in individual space use within home ranges and thus detection patterns.
For the purposes of our investigation, we define two terms to describe group-association as it relates to the spatial distribution of individuals: aggregation and cohesion. Aggregation denotes the degree to which individuals in the population coalesce into groups, with all individuals solitary at one end of the spectrum (Fig. 1a) and with all individuals aggregated into a single group at the other end (Fig. 1b). For simplicity, we assume that all members of a group share identical activity center (AC) locations and the same home range size, but deviations from these assumptions are readily accommodated. Cohesion refers to the degree of correspondence in the pattern of home range utilization – and consequently detection patterns – of group members. At one extreme, all group members use the shared home range independently (Fig. 1c). At the other extreme, individuals move through and use the shared home range in unison, being detected at the same sites synchronously (Fig. 1d). While still relatively simple, this representation of group-association as a function of both aggregation within the population and cohesion among group members may be more apt to describe the relevant processes that could influence the outcome of SCR analyses. This is a reasonable premise, because SCR analyses model both the spatial distribution of home ranges (latent AC locations) and individual probability of detection within home ranges (implicit in the detection function), while assuming independence between individuals in both processes.
Non-independence between individuals leads to overdispersed observation data and consequently underestimated sampling variance (Fletcher 2012). This problem has been explored extensively in the capture–recapture context, where it can arise from a variety of processes (Lebreton et al. 1992, Pradel et al. 2005, Choquet et al. 2009), including group association (Anderson et al. 1994). Treating non-independence between individuals as a source of overdispersion may help to better understand the consequences of grouping in SCR analyses as well. Furthermore, this perspective offers a potential solution for dealing with overdispersion via the application of a variance inflation factor (Schmidt and Anholt 1999, Cam et al. 2004). We simulated populations for different combinations of aggregation and cohesion to explore the effect of grouping on the main parameter estimates of interest in SCR analyses: density and the scale parameter of the detection function.
Material and methods
Our objective was to explore the consequences of group-association for parameters estimated with SCR models. To this end, we simulated populations with different levels of aggregation, cohesion and density and exposed them to a virtual detection process. This approach yielded datasets that mimicked observations from real-life populations with varying degrees of independence between individuals. We then fitted a simple SCR model that assumed independent distribution and detection of individuals to each simulated data set. Finally, we quantified overdispersion in the data and evaluated the performance of each model in terms of precision, bias and coverage of the 95% confidence intervals associated with key parameter estimates.
Habitat and detector grid
The habitat was defined as a square of 20 × 20 distance units (du). A square 12 × 12 detector grid (1 du detector spacing) was centered on the habitat, leaving a 4.5 du wide habitat buffer around the detector grid (Fig. 2).
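The layout described above can be checked with a few lines of code. The following is a minimal Python sketch, not the authors' R code; the function name `make_detector_grid` and the constants are hypothetical and only the dimensions come from the text. It shows that a 12 × 12 grid with 1 du spacing spans 11 du, which leaves a 4.5 du buffer on each side of a 20 du habitat.

```python
# Illustrative sketch only: names are hypothetical, dimensions follow the text.
HABITAT_SIDE = 20.0   # habitat is a 20 x 20 du square
N_PER_SIDE = 12       # 12 x 12 detectors
SPACING = 1.0         # 1 du between neighbouring detectors


def make_detector_grid(habitat_side=HABITAT_SIDE, n_per_side=N_PER_SIDE,
                       spacing=SPACING):
    """Return (x, y) coordinates of a square detector grid centered on the habitat."""
    grid_extent = (n_per_side - 1) * spacing      # distance from first to last detector
    offset = (habitat_side - grid_extent) / 2.0   # width of the unsampled buffer
    return [(offset + i * spacing, offset + j * spacing)
            for i in range(n_per_side)
            for j in range(n_per_side)]


detectors = make_detector_grid()
buffer_width = min(x for x, _ in detectors)       # distance from habitat edge to grid
```

Setting `n_per_side` and `spacing` so that the grid spans the whole habitat would correspond to a configuration without an unsampled buffer.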
Aggregation and AC placement
Spatial aggregation of individuals was simulated as a clustering of individual activity centers into groups of increasing size α (and therefore decreasing number of groups, Fig. 2). Given a total population size N, α can take values between 1 (all-solitary) and N (aggregation of the entire population into a single group; Fig. 2). Although α typically varies between groups in real populations, in our example we assume that populations are made up of groups of identical size. All members of a group shared the same AC location during simulations. Group ACs were drawn randomly from the square habitat. We assume that N is fixed, and thus the model represents a binomial point process based on a uniform intensity surface (Illian et al. 2008).
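The placement of activity centers under aggregation could be sketched as follows. This is an illustrative Python sketch under the assumptions stated in the text (fixed N, identical group sizes, uniform placement in the square habitat); the function `place_activity_centers` and its arguments are hypothetical names, not part of the authors' code.

```python
import random


def place_activity_centers(n_total, group_size, habitat_side=20.0, rng=None):
    """Draw one uniform AC per group and assign every individual to a group.

    Returns (group_acs, group_id): group_acs[g] is the (x, y) AC shared by all
    members of group g, and group_id[i] is the group of individual i.
    """
    rng = rng or random.Random()
    if n_total % group_size != 0:
        raise ValueError("group_size must divide n_total (groups of identical size)")
    n_groups = n_total // group_size
    group_acs = [(rng.uniform(0.0, habitat_side), rng.uniform(0.0, habitat_side))
                 for _ in range(n_groups)]
    group_id = [g for g in range(n_groups) for _ in range(group_size)]
    return group_acs, group_id


# Example: N = 128 individuals aggregated into groups of 8 gives 16 shared ACs.
group_acs, group_id = place_activity_centers(128, 8, rng=random.Random(1))
```

With `group_size=1` every individual receives its own AC (the all-solitary case), and with `group_size=n_total` the whole population shares a single AC.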
Cohesion and detection
The basic SCR model assumes a direct link between the probability (or frequency) of detecting individual i at detector j and the distance dij from this detector to the individual's AC. The diminishing detection probability with increasing distance from the AC is often modeled using the half-normal detection function:
$$p_{ij} = p_0 \exp\left(-\frac{d_{ij}^2}{2\sigma^2}\right) \tag{1}$$
where p0 and σ are the magnitude and scale parameter of the detection function, respectively. Observations (detection: y = 1 and non-detection: y = 0) are then realized as the outcomes of a Bernoulli process such that
$$y_{ij} \sim \mathrm{Bernoulli}(p_{ij}) \tag{2}$$
This formulation of the observation process encompasses both the variation in detection probability across space due to individual home range utilization and the efficiency of the detection process (e.g. search effort). Although SCR analyses, like CR analyses, often model multiple detection occasions, this is not indispensable: replication of detection opportunities is provided through the spatial dimension (multiple detectors), which makes detection probability identifiable from a single occasion (Efford et al. 2009). We opted for a single-occasion model with binary detection in this study, as this most closely resembles data we are working with in a current project – non-invasive genetic monitoring of large carnivores in Scandinavia – where we have limited information about the temporal structure of searches or sample deposition by individuals (Milleret et al. 2018).
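The single-occasion, binary observation process described above can be illustrated with a short Python sketch. It assumes the standard half-normal form, p = p0 · exp(−d² / (2σ²)), and uses the simulation values σ = 1.5 du and p0 = 0.1 as defaults; the function names are hypothetical and this is not the authors' implementation.

```python
import math
import random


def halfnormal_p(distance, p0=0.1, sigma=1.5):
    """Half-normal detection probability at a given AC-to-detector distance."""
    return p0 * math.exp(-distance ** 2 / (2.0 * sigma ** 2))


def simulate_detection_history(ac, detectors, p0=0.1, sigma=1.5, rng=None):
    """Draw one Bernoulli outcome (1 = detected, 0 = not detected) per detector."""
    rng = rng or random.Random()
    history = []
    for det_x, det_y in detectors:
        d = math.hypot(ac[0] - det_x, ac[1] - det_y)   # Euclidean distance to the AC
        history.append(1 if rng.random() < halfnormal_p(d, p0, sigma) else 0)
    return history


# Example: one individual with its AC at (10, 10) and three detectors.
example = simulate_detection_history(
    (10.0, 10.0), [(9.5, 9.5), (10.5, 9.5), (15.5, 15.5)], rng=random.Random(42))
```

Detection probability equals p0 directly at the AC and has dropped to about 61% of p0 at a distance of one σ.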
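We modeled cohesion (Fig. 1c–d) as a probabilistic mixture of two extreme detection patterns for each individual: an independent detection pattern yij (but still conditional on its group-specific AC) and a group-specific detection pattern ygj. For each group g (g in 1…G), the group-specific detection history was modelled as:
$$p_{gj} = p_0 \exp\left(-\frac{d_{gj}^2}{2\sigma^2}\right) \tag{3}$$
$$y_{gj} \sim \mathrm{Bernoulli}(p_{gj}) \tag{4}$$
Then, each member of the group randomly follows the group-specific pattern or makes an independent choice with probability γ (cohesion) and 1–γ, respectively:
$$z_i \sim \mathrm{Bernoulli}(\gamma) \tag{5}$$
$$y^{*}_{ij} = z_i \, y_{gj} + (1 - z_i) \, y_{ij} \tag{6}$$
where dgj is the distance from detector j to the AC of group g, zi indicates whether individual i follows the pattern of its group, and y*ij is the realized detection of individual i at detector j.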
This process induces dependency – in terms of adherence to a common detection pattern – between members of a group, ranging from full independence (γ = 0) to an identical pattern of detections (γ = 1).
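The cohesion mechanism can be sketched in a few lines of Python. The sketch assumes that the choice between the group pattern and the independent pattern is made once per individual for its whole detection history, which is one reading of the description above; `apply_cohesion` and the toy histories are hypothetical illustrations, not the authors' code.

```python
import random


def apply_cohesion(group_history, own_histories, gamma, rng=None):
    """Return realized detection histories for all members of one group.

    Each member adopts the group-specific history with probability gamma and
    keeps its own independently simulated history with probability 1 - gamma.
    """
    rng = rng or random.Random()
    realized = []
    for own in own_histories:
        follows_group = rng.random() < gamma        # Bernoulli(gamma) draw
        realized.append(list(group_history) if follows_group else list(own))
    return realized


# Toy example: one group of three individuals and four detectors.
group_history = [1, 0, 0, 1]
own_histories = [[0, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
independent = apply_cohesion(group_history, own_histories, gamma=0.0,
                             rng=random.Random(1))
cloned = apply_cohesion(group_history, own_histories, gamma=1.0,
                        rng=random.Random(1))
```

At γ = 0 every member keeps its own history, and at γ = 1 all members carry identical copies of the group history; intermediate values give a mixture of the two.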
Equations 3–6 generate cohesion by cloning individual detection patterns of group members, which is also a way overdispersion has been introduced in the non-spatial CR literature (Anderson et al. 1994). The mechanism behind such correlation in detection patterns among group members is their spatio–temporal synchronization of space use (shared movements and home ranges) and thus exposure to detectors. If an individual has been detected at a given detector, it makes detection of its group members at that detector more likely, but not certain. Even in cases where spatial association is nearly perfect, detection patterns of group members may not be identical. The case representing γ = 1 is thus unlikely to be encountered in studies of real populations.
For the simulations, we set σ = 1.5 du and the baseline detection probability p0 = 0.1. The total number of individuals within the habitat was kept constant (N = 128) across all simulations, which translates into an overall density of 0.32 individual ACs per du². We generated 40 different scenarios from different combinations of values of cohesion γ (0, 0.25, 0.5, 0.75, 1) and aggregation α represented by group sizes of 1 (all solitary), 2, 4, 8, 16, 32, 64 and 128 (entire population in one large group, Fig. 2). We ran 1000 simulations for each scenario, resulting in 40 000 simulated data sets.
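As a quick check of the design arithmetic, the scenario grid can be enumerated directly. This Python sketch only restates the numbers given above; the variable names are hypothetical.

```python
from itertools import product

COHESION = [0, 0.25, 0.5, 0.75, 1]            # gamma values
GROUP_SIZES = [1, 2, 4, 8, 16, 32, 64, 128]   # alpha values
N_TOTAL = 128                                 # individuals in the habitat
HABITAT_AREA = 20 * 20                        # du^2
SIMS_PER_SCENARIO = 1000

# Every combination of group size and cohesion is one scenario.
scenarios = list(product(GROUP_SIZES, COHESION))
n_datasets = len(scenarios) * SIMS_PER_SCENARIO
density = N_TOTAL / HABITAT_AREA              # individual ACs per du^2

# Number of groups implied by each group size (decreases as alpha increases).
groups_per_size = {alpha: N_TOTAL // alpha for alpha in GROUP_SIZES}
```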
López-Bao et al. (2018) reported a slight negative bias in SCR-based abundance estimates from simulations using an unsampled buffer area around the detector grid, but not when the entire available habitat was sampled. In order to test for the role of a buffer in modulating the impact of aggregation and cohesion on inferences, we repeated the analysis with an equivalent habitat size (20 × 20 du), population density, σ and p0, but with a detector grid (same spacing) that covered the entire available habitat. Finally, to explore the impact of a changing spatial extent (and thus population size, detections and the proportion of the total population constituted by a group), we halved the habitat square together with a corresponding reduction in the detector grid, without changing the density of ACs. This led to a simulated population of 64 individuals occupying a 10 × 20 du habitat. Given the lower population size in this third state–space configuration, we used group sizes of 1, 2, 4, 8, 16, 32 and 64 individuals and generated 35 000 simulated datasets.
SCR model fitting and evaluation of model performance
We fitted a basic SCR model which accounted for neither aggregation nor cohesion (Supplementary material Appendix 3) to each simulated dataset. Models were fitted in R (<www.r-project.org>) using function secr.fit from package ‘secr’ ver. 3.2.0 (Efford 2015). R code for implementing simulations and fitting an SCR model to simulated data is provided in the Supplementary material Appendix 1.
We used relative bias, coefficient of variation and coverage to evaluate the effect of aggregation and cohesion on estimates of density D and the scale parameter σ of the detection function.
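Relative bias (RB) was calculated as:
$$\mathrm{RB}(\hat{\theta}) = \frac{\hat{\theta} - \theta}{\theta} \tag{7}$$
where θ̂ is the estimate and θ the true (simulated) value of the parameter.
The precision of each parameter estimate was assessed through its coefficient of variation (CV; Walther and Moore 2005):
$$\mathrm{CV}(\hat{\theta}) = \frac{\mathrm{SD}(\hat{\theta})}{\hat{\theta}} \tag{8}$$
where SD(θ̂) is the standard deviation of the estimate. We also calculated the coverage of the 95% confidence interval (‘coverage'), i.e. the probability that the 95% confidence interval of the parameter estimate contains the true value of that parameter.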
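The three performance measures can be written as small helper functions. The Python sketch below assumes the usual definitions (relative bias as the difference between estimate and truth divided by the truth, CV as the standard deviation of the estimate divided by the estimate, and coverage as the proportion of intervals containing the truth); the function names and example numbers are hypothetical.

```python
def relative_bias(estimate, truth):
    """Relative bias of one estimate with respect to the simulated true value."""
    return (estimate - truth) / truth


def coefficient_of_variation(estimate, sd):
    """CV of one estimate: its standard deviation divided by the estimate."""
    return sd / estimate


def coverage(intervals, truth):
    """Proportion of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    return hits / len(intervals)


# Toy example with a true density of 0.32 ACs per du^2.
rb = relative_bias(0.40, 0.32)
cv = coefficient_of_variation(0.40, 0.05)
cov = coverage([(0.25, 0.40), (0.33, 0.45), (0.30, 0.36), (0.20, 0.31)], 0.32)
```

In a simulation study, `coverage` would be applied to the intervals from all successfully fitted models within one scenario and compared with the nominal 95%.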
Non-independence between individuals (i.e. aggregation) might lead to overdispersion in CR datasets (Anderson et al. 1994); we therefore calculated the overdispersion factor ĉ from counts of unique animals per detector (Fletcher 2012):
$$\hat{c} = \frac{X^2 / (k - \rho)}{1 + \bar{s}} \tag{9}$$
where μ̂j and v̂j are the estimated mean and variance respectively, the yj are the independent random variables (i.e. the counts of unique animals at the j = 1…k detectors), ρ is the number of parameters in the model (i.e. 1 in our case as we assume homogeneous density D) and where $X^2 = \sum_{j=1}^{k} (y_j - \hat{\mu}_j)^2 / \hat{v}_j$ and $\bar{s} = \frac{1}{k} \sum_{j=1}^{k} (y_j - \hat{\mu}_j) / \hat{\mu}_j$.
An overdispersion factor ĉ > 1 signifies overdispersed data, i.e. the variance in the data exceeds that predicted by the statistical model. We compared ĉ to the relative variance (RV) of θ̂, which we calculated as the ratio of the empirical variance among simulations var(θ̂) for any given simulation scenario and the variance var0(θ̂) associated with the independent scenario (α = 1, γ = 0):
$$\mathrm{RV}(\hat{\theta}) = \frac{\mathrm{var}(\hat{\theta})}{\mathrm{var}_0(\hat{\theta})} \tag{10}$$
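For readers who want to experiment with the overdispersion measures, the following Python sketch computes a Fletcher-type ĉ from per-detector counts and the relative variance from two sets of estimates. It assumes a Poisson-type variance function (estimated variance equal to the estimated mean at each detector), which is an assumption of this sketch rather than a statement about the secr implementation; `fletcher_chat` and `relative_variance` are hypothetical names.

```python
from statistics import variance


def fletcher_chat(counts, expected, n_params=1):
    """Fletcher-type overdispersion estimate from observed and expected counts.

    Assumes the model-based variance of each count equals its expected value.
    """
    k = len(counts)
    pearson_x2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, expected))
    s_bar = sum((y - mu) / mu for y, mu in zip(counts, expected)) / k
    return (pearson_x2 / (k - n_params)) / (1.0 + s_bar)


def relative_variance(estimates, baseline_estimates):
    """Empirical variance of estimates relative to the independent scenario."""
    return variance(estimates) / variance(baseline_estimates)


# Toy example: four detectors, each expected to record two unique animals.
chat = fletcher_chat([0, 4, 0, 4], [2.0, 2.0, 2.0, 2.0])
rv = relative_variance([1.0, 3.0], [1.0, 2.0])
```

In the toy example the counts are more variable than a Poisson model with mean 2 would predict, so ĉ exceeds 1.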
We did not attempt to fit models to simulated data if no individuals were detected or if the average number of detections per detected individual (i.e. the number of detectors at which an individual was detected) was equal to 1. These constraints were implemented as they represented conditions that would lead to model fitting failure. We also removed from further analysis additional simulations that, despite meeting aforementioned criteria, resulted nonetheless in failure during attempted fitting (e.g. inability to estimate key parameters) due to sparse data.
Results
Results presented in this section, unless otherwise indicated, refer to the simulated cases with a 12 × 12 detector grid surrounded by a 4.5 du unsampled habitat buffer, amounting to a total available habitat of 20 × 20 du. Additional results for the other two state–space configurations (20 × 20 du and 10 × 20 du without unsampled habitat buffer) are provided in the Supplementary material Appendix 1. Models were successfully fitted to all simulated datasets with low and moderate levels of aggregation (α ≤ 8), regardless of the level of cohesion. Increasing non-independence – both aggregation and cohesion – was associated with a growing proportion of failed models. At maximum aggregation (α = 128) and cohesion (γ = 1), 88% of models failed (Supplementary material Appendix 1 Fig. A2, Table A1).
Effects on bias and precision
We detected little systematic bias (<2%) in estimates of density (D) and the scale parameter of the detection function (σ) at low and moderate levels of aggregation (α ≤ 8) and cohesion (γ ≤ 0.25, Fig. 3a, c, Supplementary material Appendix 1 Table A2). A positive bias in estimates of D was noticeable at high levels of aggregation, an effect that was amplified by cohesion (e.g. at α = 64, median RB = 23% when cohesion = 0 and RB = 37% when cohesion = 1; Fig. 3a, Supplementary material Appendix 1 Table A2). The coefficient of variation (CV) of both D and σ decreased at high levels of aggregation (α ≥ 32) and cohesion (γ = 1, Fig. 3d). However, even at high levels of aggregation and cohesion, the 95% quantiles around estimates of overall RB and CV in most cases overlapped the nominal values of these measures (Supplementary material Appendix 1 Table A2).
Simulations with the entire habitat sampled were characterized by a similar absence of bias and changes in precision at low and moderate levels of aggregation, regardless of the level of cohesion (Supplementary material Appendix 1 Fig. A3, A4, Table A4, A6). Patterns in bias and CV at high levels of aggregation and cohesion were also qualitatively similar to those in simulations with an unsampled habitat buffer. Results from the half-sized (10 × 20 du) and full-sized (20 × 20 du) habitat without buffer were nearly identical to each other qualitatively and quantitatively, but with lower among-simulation variation in key parameters in the latter, due to the larger sample size (larger population size and more detectors, therefore more detections, Supplementary material Appendix 1 Fig. A3, A4, Table A4, A6).
Overdispersion and coverage
Overdispersion increased with both cohesion and aggregation (Fig. 4). At maximum cohesion (γ = 1), overdispersion was equal to group size and predictive of the inflation in empirical variance of estimates of D and σ among simulations (Fig. 4). Reducing cohesion also reduced overdispersion at a given level of aggregation. Overdispersion manifested as overstated precision; coverage of the 95% confidence intervals of both D and σ was therefore drastically affected by aggregation (Fig. 5). Coverage was nominal (95%, Fig. 5a) in simulated populations composed entirely of solitary (independent) individuals but dropped rapidly as aggregation increased. Decreasing cohesion mitigated the negative impact of aggregation on coverage, an effect that was more pronounced for σ than for D (Fig. 5), at least in simulations that included an unsampled habitat buffer around the detector grid (but see Supplementary material Appendix 1 Fig. A5, A6). With the exception of extreme dependence (maximum aggregation and cohesion), aggregation and cohesion had diminished effects on coverage in simulations where the detector grid covered the entire available habitat compared with simulations that included an unsampled habitat buffer (Fig. 5, Supplementary material Appendix 1 Fig. A5, A6, Table A4, A6). In all three state–space configurations, coverage of D and σ returned to near their nominal values for most combinations of aggregation and cohesion once we corrected the variance estimates using the inflation factor ĉ (corrected var(θ̂) = ĉ × var(θ̂); Fig. 5, Supplementary material Appendix 1 Fig. A5, A6). One exception was coverage of D in the configuration with a habitat buffer: here, coverage remained comparatively poor at high aggregation and low cohesion (Fig. 5b).
Discussion
Our simulation study revealed that spatial capture–recapture analyses are robust to low and moderate levels of spatial dependence between individuals, at least in terms of bias. However, both aggregation and cohesion led to overdispersed data and decreased the coverage of key parameter estimates. Thus, spatial association among individuals in a population, if ignored, can lead to unreliable inferences. Understanding aggregation and cohesion as a source of overdispersion could provide access to a framework for goodness-of-fit testing (Pradel et al. 2005) and model selection (Anderson et al. 1994) and, as we have done here (Fig. 5), correct underestimated sampling variance and poor coverage of confidence intervals (Lebreton et al. 1992).
A growing number of studies and monitoring projects employ SCR to estimate density and other ecological parameters (Chandler and Clark 2014, Royle et al. 2018). Many of the species targeted exhibit various degrees of group association, from temporary affiliations during breeding or offspring care to long-term congregation into large groups. What do our findings mean in terms of aggregation and cohesion in real life populations that may be studied using SCR? Investigators that use SCR to estimate population density or abundance for solitary species or species with small group sizes (such as family groups of cougars consisting of a female and her dependent young; Russell et al. 2012) can likely trust their inferences even if they ignore grouping. However, there is indication that coverage starts dropping already here, especially if cohesion is high, as is most likely the case for dependent offspring (Fig. 5). As both aggregation and cohesion increase, estimates of D and σ become less reliable (lower coverage). For example, inferences will more likely be flawed for a population of ungulates that consists of small and cohesive herds than for a species of marmots, where individuals live in family groups of similarly small size but where group members perform mostly independent movements (Armitage 2014). At very high levels of aggregation, such as communally roosting bats with independent nightly hunting forays (low cohesion; Audet 1990) or populations of fish that form large schools (high cohesion), inferences might be severely impacted when grouping is ignored.
In general, we recommend that investigations into the reliability of ecological models in the face of violated assumptions consider coverage, as bias and precision alone may provide a more optimistic assessment than is warranted. As aggregation and cohesion increase, investigators run a growing risk of obtaining erroneous estimates of key parameters (i.e. with 95% CI limits that do not include the true parameter value). For example, in our simulations of a population composed of 128 individuals, even moderate aggregation into groups of eight individuals led to noticeably reduced coverage of D (67%) at cohesion γ = 0.75 (Fig. 5). In the habitat/detector grid configuration that included an unsampled habitat buffer, the effect of aggregation on coverage was most pronounced for estimates of density, where increasing group sizes led to a rapid decrease in coverage, even at low levels of cohesion (Fig. 5, Supplementary material Appendix 1 Table A2).
Poor coverage at high levels of aggregation and cohesion results from overdispersion in the data. In general, we found that aggregation and cohesion had stronger positive effects on overdispersion (and thus negative effects on coverage) in simulations with a habitat buffer around the detector grid (which is the case in most SCR studies) than in simulations where the entire habitat was sampled. Consistent with these findings, López-Bao et al. (2018) reported lower coverage of population size in simulations that included an unsampled habitat buffer around the detector grid. Using Fletcher's (2012) method to estimate the overdispersion factor (Fletcher's ĉ) from count data, we were able to correct our precision estimates of the parameters of interest and recover nominal levels of coverage in most aggregation and cohesion situations. However, we also found that the relationship between the empirical overdispersion (as measured by RV) and Fletcher's ĉ changes at very high levels of aggregation (Fig. 4b–c). Thus, at high levels of aggregation, the mitigating effect of the correction factor is diminished (Fig. 5b, d, Supplementary material Appendix 1 Fig. A5, A6).
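The variance correction amounts to multiplying the estimated sampling variance by ĉ, i.e. multiplying the standard error by the square root of ĉ. The Python sketch below illustrates the effect on a symmetric Wald-type interval on the natural scale; this interval form is an assumption made for simplicity (intervals from SCR software may be constructed on a link scale instead), and `corrected_interval` is a hypothetical helper.

```python
import math


def corrected_interval(estimate, se, chat, z=1.96):
    """Wald-type interval after inflating the sampling variance by chat.

    corrected variance = chat * se**2, so the corrected SE is se * sqrt(chat).
    """
    se_corrected = se * math.sqrt(chat)
    return estimate - z * se_corrected, estimate + z * se_corrected


# Toy example: density estimate 0.32 with SE 0.04 and an overdispersion factor of 4.
lower, upper = corrected_interval(0.32, 0.04, 4.0)
naive_lower, naive_upper = corrected_interval(0.32, 0.04, 1.0)
```

With ĉ = 4 the interval is twice as wide as the uncorrected one, which is how the correction restores coverage when the model-based variance is too small.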
The apparent non-monotonic relationship between group size and other measures such as RV and coverage appears to be an artifact of the removal of failed simulations (Supplementary material Appendix 1 Fig. A2, Table A1, A3, A5). Similarly, filtering of simulations at extreme levels of aggregation and cohesion is the likely culprit behind deviations of RB and CV from their nominal values. With increasing cohesion and increasing group sizes, stochastic events are shared by many or all individuals and their influence on the data can become overwhelming. In the extreme case, a population composed of a single group of 128 individuals that all share one identical detection pattern, either all individuals are detected or none (simulation failure). When successfully fitted, estimates of model parameters will be based on clones of a single capture history. However, we reiterate that this scenario is unrealistic: most real-life populations will have average group sizes that are fractions of total population size and detection patterns will differ between group members even if their space use is highly synchronized.
Our approach for simulating non-independence between individuals distinguishes between non-independence of home ranges (shared activity centers) and non-independence due to cohesion of space use by individuals. This advances the conceptual formulation of models beyond previous work (Russell et al. 2012, Reich and Gardner 2014, López-Bao et al. 2018) and improves the diagnosis of areas of the grouping parameter space where current SCR models have a low predictive power and lead to problematic inferences. We picked generic settings for our simulations and tested three different state–space configurations. Qualitatively, we expect the patterns observed in this study to be similar regardless of the specific parameters of the simulated population (N, σ, size and shape of the habitat) and the detection process (p0, density and configuration of detectors). However, the magnitude of the effects of aggregation and cohesion on parameter estimates is liable to vary as the context provided by the ecological and observation processes changes. It would therefore be useful to explore the impact of group-living for a wider range of biological scenarios and with greater realism in future studies. Such investigations could target the potential effects of within-population variation in group size (e.g. through a Poisson cluster process, Diggle 2003) and group structure (e.g. different behaviors according to social status). Finally, in our simulations, cohesion was treated as a property of detection histories rather than movements. The use of mechanistic models of space use and group association would help shed light on which levels of cohesion to expect in practice.
Spatial capture–recapture methods are evolving rapidly. Already, they have been expanded to allow for modelling open populations (Chandler and Clark 2014, Bischof et al. 2016) and to estimate demographic rates (Ergon and Gardner 2014, Chandler et al. 2018), non-Euclidean distances (Sutherland et al. 2015), barrier effects (Bischof et al. 2018), connectivity (Morin et al. 2017) and epidemiology (Muneza et al. 2017). In addition to further research to develop an improved correction for overdispersion due to non-independence in SCR models, we encourage development of approaches that explicitly model and thus estimate aggregation and cohesion (see also Reich and Gardner 2014). This would not only provide another solution to problems associated with overdispersion and inflated precision revealed here, but also allow for estimation of biologically relevant quantities, such as measures of the strength of association and group sizes, and perhaps even help assign detected individuals to groups by treating group membership as a latent variable. Currently there are no SCR models that can do so, but the conceptual formulation of grouping as a combination of aggregation and cohesion used here may help guide the development of more general SCR models in this area of active research.
Aggregation and cohesion result in overdispersed data and decrease the reliability of predictions generated by SCR models by over-stating precision. As the scope of SCR analyses expands to include an increasing number of species and systems, so does the range of social structures that may influence how individuals are configured in space. Investigators can mitigate the effect of non-independence between individuals by correcting variance estimates for overdispersion. Finally, we see the explicit incorporation of grouping into SCR models as a desirable development. Doing so would not only help mitigate potential bias and imprecision arising from non-independence in the spatial configuration of individuals, but could also provide a means for obtaining additional important quantitative information about the ecology of the studied system. While to our knowledge there are currently no SCR models that incorporate group association, this has already been incorporated into other abundance estimation approaches (Clement et al. 2017, Hickey and Sollmann 2018), which could serve to stimulate developments in SCR. Meanwhile, our simulation framework can be used as a tool for exploring model sensitivity to violations of independence, in cases where grouping occurs but is ignored.
No empirical data were used in this analysis. R code for generating simulated data is provided in the Supplementary material Appendix 2.
Acknowledgements – We thank M. Efford for constructive criticism on an earlier version of the manuscript, for guidance regarding overdispersion, and for providing some of the R code used in the analysis.
Funding – This work was funded by the Norwegian Environment Agency (Miljødirektoratet), the Swedish Environmental Protection Agency (Naturvårdsverket) and the Research Council of Norway (NFR grant 286886). Computation was performed on computer cluster ‘Orion’, administered by the Centre for Integrative Genetics at the Norwegian University of Life Sciences. Any use of trade, firm or product names is for descriptive purposes only and does not imply endorsement by the US Government.
Conflict of interest – The authors declare no conflicts of interest.
Author contributions – RB conceived and designed the study, with input from AR, CM, JC and PD. RB conducted the analysis with contributions from all authors. RB wrote the first draft of the manuscript; all authors contributed to subsequent drafts and gave final approval for publication.
Supplementary material (available online as Appendix wlb-00649 at <www.wildlifebiology.org/appendix/wlb-00649>). Appendix 1–2.