Spring spotlight counts provide reliable indices to track changes in population size of mountain-dwelling red deer Cervus elaphus

Monitoring changes in animal abundance is a central issue in conservation biology. Population indices may be a valuable support to wildlife managers in coarse-scale survey programs, as they are normally more intuitive and less expensive monitoring tools than absolute estimates. Reliable indices of relative abundance, however, require validation against some known standards. We used mark-resight estimates to investigate the performance of indices derived from spring spotlight surveys to track changes in a mountain-dwelling population of red deer Cervus elaphus within the Stelvio National Park, central Italian Alps. Every spring between 2008 and 2015 we conducted four sessions of roadside counts using spotlights, recording all sightings of marked and unmarked individuals; the zero-truncated Poisson log-normal estimator was applied in a robust-design fashion to return absolute estimates of spring abundance. We then compared the mark-resight estimates with two indices of abundance, the maximum number (MNC) and the average number (ANC) of deer counted every spring in the four sampling occasions, using linear models on log-transformed data. Both the MNC and the ANC proved reliable indices of relative abundance, as their relationships with mark-resight estimates were positive and highly significant, and the beta coefficients of linear models were not significantly different from 1. The same analysis conducted on subsets of secondary sampling occasions suggested that at least 3 repeated counts every spring are necessary to consistently track changes in deer population size. The reliability of spotlight-based indices for monitoring deer population changes has been widely debated, possibly owing to inconsistent performance of the method in different landscapes. For mountain-dwelling deer populations living in similar habitats, our results suggest that spring spotlight surveys represent valuable tools in support of wildlife managers for long-term, large-scale monitoring programs; furthermore, they can provide appropriate indices for estimating population growth rates and thus modelling deer population dynamics.

Throughout a large part of their distribution range, ungulates (and cervids in particular) take on great importance from the scientific, conservation and economic standpoints, for both consumptive and non-consumptive reasons (Putman et al. 2011a). Wildlife managers and researchers have thus attempted a number of different methods to monitor deer population abundance (Putman et al. 2011b). Estimates of absolute density include the use of methods such as distance sampling (Focardi et al. 2002, Koenen et al. 2002), mark-resight (Bowden and Kufeld 1995, Gould et al. 2005, Skalski et al. 2005) or pellet group counts (Bailey and Putman 1981, Fattorini et al. 2011), while abundance indices may include, among others, the kilometric index (Vincent et al. 1991), the size of deer groups (Vincent et al. 1995) or the number of deer seen in given periods (Mysterud et al. 2007). Recently, Morellet et al. (2007) proposed the use of different ecological indicators (e.g. hind foot length, fawn body mass) that, under the paradigm of density dependence, may account for the interaction between deer density and environmental features.
One of the most popular field methods to collect data on deer abundance relies on the use of spotlights during night roadside counts (Focardi et al. 2001, Collier et al. 2007, Acevedo et al. 2008, Garel et al. 2010). Spotlight surveys may provide the basis for the investigation of absolute abundance through the application of mark-resight (if marked individuals are available: Garel et al. 2010), distance sampling (Larue et al. 2007, Morelle et al. 2012) or double observer methods (Collier et al. 2013). Likewise, they may provide information on relative deer abundance, either represented by the total number of animals seen in a given count (Belant and Seamans 2000), or expressed through kilometric indices using individual deer counts or group counts (Garel et al. 2010, Amos et al. 2014). Spotlight-based surveys have found application in several taxa other than cervids, for example in lagomorphs (Aubry et al. 2012), carnivores (Gehrt 2002) as well as in crocodiles (Salem 2010) and fish (Hickey and Closs 2006).
The use of spotlight surveys to obtain indices of relative abundance of deer populations, however, has been widely debated. Progulske and Duerre (1964), for example, suggested that the number of white-tailed deer Odocoileus virginianus and mule deer Odocoileus hemionus counted using spotlighting can be a means of studying population trends. Similarly, Garel et al. (2010) showed that spotlight-based abundance indices are suitable for monitoring the temporal variations of red deer Cervus elaphus populations, although they do not recommend their use when modelling population dynamics. In contrast, Collier et al. (2013) argued that spotlight surveys are unsuitable for providing reliable information on the relative abundance of white-tailed deer, given the great variation in detection probabilities. In fact, because any index of relative abundance (c) assumes a direct relationship with the real population size (N), i.e. c = p × N, where p is the unknown detection probability, one of the main issues in the use of abundance indices relates to the difficulty of assuming constant detection probability, as this parameter may change according to a number of variables such as habitat types, observers or meteorological conditions (Anderson 2001, Morellet et al. 2007). Therefore, before using any index of relative abundance to track changes in population size, wildlife managers ought to validate it against some known standards (Morellet et al. 2007), for example by comparing it to estimates of absolute abundance (Garel et al. 2010).
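The role of the detection probability in the index relationship can be illustrated with a minimal Python sketch. All numbers are hypothetical and are not taken from this study, and the function name `expected_index` is introduced here only for illustration. The sketch shows that the ratio of two index values equals the ratio of the underlying population sizes only if p is the same in both years.

```python
def expected_index(n_true: float, p_detect: float) -> float:
    """Expected count c under the index model c = p * N."""
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("detection probability must lie in [0, 1]")
    return p_detect * n_true


# Hypothetical population growing by 20% between two years.
n_year1, n_year2 = 1000.0, 1200.0
true_change = n_year2 / n_year1

# Constant detection probability: the index ratio equals the true ratio.
constant_p = expected_index(n_year2, 0.5) / expected_index(n_year1, 0.5)

# Detection drops from 0.5 to 0.4: the index suggests a decline (480 / 500)
# although the population actually increased.
varying_p = expected_index(n_year2, 0.4) / expected_index(n_year1, 0.5)

print(f"true change:           {true_change:.2f}")
print(f"index, constant p:     {constant_p:.2f}")
print(f"index, p drops to 0.4: {varying_p:.2f}")
```

In the second case the index would wrongly indicate a shrinking population, which is why an index needs to be validated against estimates of absolute abundance before it is used to track trends.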
Red deer are often considered ecosystem engineers because of their ability to affect vegetation community patterns and ecosystem functioning (Côté et al. 2004). The management of red deer has become a priority for the Stelvio National Park (central Italian Alps), as the large population increase that occurred over the last few decades severely impacted forest management, agricultural activities and ecosystem biodiversity (Carmignola 2009). Following the debate concerning the effects of deer on human-related activities and on the ecosystems, and in an attempt to plan appropriate management strategies, the Park administration, in agreement with different stakeholders, started a program to quantify red deer impacts. One of the priorities highlighted during the decision-making process was the need to obtain accurate information about deer population abundance and trend. Several methods to estimate population size (mark-resight, distance sampling using pellet groups or infrared thermal imaging) were thus conducted in parallel to the spring spotlight counts used in the standard management practice (Pedrotti et al. 2013).
Taking advantage of a sample of marked individuals, in this paper we investigate the reliability of abundance indices derived from spring spotlight counts to track annual changes in deer population size, using mark-resight estimates as a benchmark. Specifically, we aim to test: 1) if abundance indices and mark-resight estimates indicate the same trend over time; 2) if abundance indices and mark-resight estimates change at the same rate over time; 3) how many repeated counts are needed every year to provide indices that can reliably track changes in absolute abundance.

Study area and population
The red deer study population inhabits the northwestern part of the Stelvio National Park, within the Province of Sondrio, central Italian Alps (46°27′N, 10°25′E). Year-round movements of individually marked deer and landscape features (ridges, valleys) helped define the boundaries of the population management unit (PMU), which extends over 27 900 ha between 1200 and 3850 m a.s.l. (Fig. 1a). The climate of the PMU is alpine continental, with mean temperatures ranging between 15.7°C in July and -2.8°C in January and yearly precipitation of about 765 mm. Forests are dominated by spruce Picea abies and larch Larix decidua, interrupted by mesic meadows (Trisetetum flavescentis) and other xeric associations. Above the treeline, Alpine grasslands of Carex curvula, Festuca halleri and Carex firma are the prevalent vegetation facies. The spatial distribution of red deer within the PMU is widest in summer, with minimum densities of about 6.9 ind. km⁻², while in winter and early spring animals move to lower elevations, where minimum density can reach values up to 30.8 ind. km⁻² (spring spotlight-count data) (Fig. 1a). The study area 'Valfurva' (4975 ha) is the portion of the PMU enclosing the wintering site of the red deer population, between 1200 and 2400 m a.s.l. (Fig. 1a, b).
Between 2007 and 2015, in autumn and in winter, n = 140 deer were captured by the Park personnel. A capturing site (corral) baited with hay was placed within each survey area and trapped deer (n = 119) were darted using 2-3 ml of Vienna or Hellabrunner mix. In addition, n = 21 free-ranging deer were ground-darted using the same mix of chemicals. Throughout the study period, capturing effort was evenly distributed across the four areas (A, B, C, D).
After sedation, each animal was assigned to a given sex and age-class (calves, yearlings, adults) and equipped with an individually recognizable GPS (Vectronic Aerospace GmbH) collar (n = 13 males, n = 15 females) or a coloured belt plus two ear tags (n = 42 males, n = 70 females). All the collars, belts and ear tags had unique patterns of colours; in addition, they were fitted with different combinations of coloured reflectors to facilitate individual recognition during spotlight counts. These methods are in accordance with the Italian law, as captures were made with the assistance of a veterinarian and after receiving authorization from ISPRA (the national Institute for Environmental Protection and Research). Spatial data retrieved from individuals equipped with GPS collars suggest that, in spring, deer tend to consistently inhabit the same survey area (A, B, C or D) within and over the years and only 2.1% of fixes were collected off the study site (Fig. 2). The consistent resighting of marked individuals in the same survey area over different years in springtime (own data) supports this suggestion.

Spotlight counts
Spotlight counts were conducted on the entire study area between 2008 and 2015. Each year (i.e. each primary sampling occasion j), over two weeks between April and May (starting dates varied depending on vegetation green-up), four counts (i.e. four secondary sampling occasions, hereafter occ1, occ2, occ3, occ4) were conducted on different days, separated on average by four days (± 2 SD); each count lasted about 4 h between 11 p.m. and 3 a.m. A total of four routes (one for each survey area: A, B, C, D; Fig. 1b), consisting of a set of non-overlapping and homogeneously distributed roads, were simultaneously sampled using cars and each route was assigned to a group of three operators: the driver, passenger A who searched for deer using spotlights and passenger B who recorded the size and composition of deer groups. Specifically, passenger B was in charge of recording the number and sex of unmarked individuals, the number of unidentified marked individuals and the number and identity of individually recognized deer. Although the MR estimator used in this study allows for sampling with replacement within secondary occasions, every effort was made to avoid double sightings of the same individuals within each count. During spotlight counts, the survey roads (i.e. the number of km driven) were kept constant throughout the study period (area A: 24.4 km; area B: 17.3 km; area C: 16.7 km; area D: 34.3 km) and counts were performed only in conditions of good visibility (although weather conditions should not represent a limiting factor during spotlight surveys: Fafarman and DeYoung 1986).
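As an illustration of how the four secondary occasions of one spring translate into the two indices analysed in this paper, the maximum number (MNC) and the average number (ANC) of deer counted, a minimal Python sketch is given below. The counts are hypothetical, the function name `spotlight_indices` is introduced here only for illustration, and the coefficient of variation is computed as the sample standard deviation divided by the mean, which is an assumption rather than a documented detail of the analysis.

```python
from statistics import mean, stdev


def spotlight_indices(counts):
    """Return (MNC, ANC, CV) for the repeated counts of one spring.

    MNC is the maximum and ANC the average number of deer counted.
    CV = sample SD / mean is an assumed definition, used for illustration.
    """
    if len(counts) < 2:
        raise ValueError("at least two secondary occasions are required")
    mnc = max(counts)
    anc = mean(counts)
    cv = stdev(counts) / anc
    return mnc, anc, cv


# Hypothetical totals (marked + unmarked deer) for occ1..occ4 of one spring.
spring_counts = [820, 900, 760, 880]
mnc, anc, cv = spotlight_indices(spring_counts)
print(f"MNC = {mnc}, ANC = {anc}, CV = {cv:.1%}")
```

Because the number of km driven was identical in every count, dividing these values by the total route length would only rescale them by a constant and would not change the temporal pattern of the index.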

Mark-resight analysis
A common issue in the long-term monitoring of marked populations of long-lived ungulates is that, after the initial capture, the number of marked individuals in the population is often unknown owing to undetected deaths or emigrations. The Poisson log-normal estimator (PNE, McClintock et al. 2009) is the most general of the mark-resight models implemented in program MARK (White and Burnham 1999) and offers the advantage of estimating population size in the absence of knowledge about the exact number of marked individuals available in the study area during sighting trials. When the exact number of marked animals is not known, the PNE switches to a zero-truncated Poisson log-normal estimator (ZPNE, McClintock et al. 2009). (Z)PNE allows for simple random sampling with replacement within resighting occasions for each closed interval (hence it does not require distinction between secondary sampling occasions), for individual heterogeneity and for the inclusion of unknown marked individuals in the estimate. Furthermore, in (Z)PNE both the resighting probability α and the individual heterogeneity σ can be modelled as a function of individual covariates and σ can be also set to zero (no individual heterogeneity). (Z)PNE requires that: 1) the population remains numerically stable within primary occasions (closure assumption); 2) marks are not lost within primary occasions; 3) there are no errors in distinguishing marked and unmarked individuals; 4) resighting probabilities of animals are independently and identically distributed among sighting trials (McClintock et al. 2009). Finally, (Z)PNE shares the basic assumption of all analyses based on capture-recapture sampling designs, namely that the sample of marks must be representative of the population (Lindberg and Rexstad 2002).
Because in our study site the number of marked individuals available was known only for the first two years, to estimate the annual population size we analysed spotlight data using ZPNE in a robust-design fashion, setting the number of available marks to zero (i.e. unknown) for the remaining six years. As we may expect different resighting probabilities α between males and females and among survey areas (e.g. owing to different proportions of open and closed habitats), one approach could be to estimate sex- and area-specific population sizes using sex and area as group covariates. However, because of some inconsistency in the possibility to properly distinguish between unmarked males and females in spring, and because issues of model convergence occurred when grouping for survey areas, to model resighting probability α we performed mark-resight estimation on the entire deer population using sex and survey area as individual covariates, rather than as group covariates. This approach assumes that the ratio between males and females and the ratio among individuals inhabiting different survey areas are similar between marked and unmarked animals, so that the same mean resighting probability provided by the mark-resight estimator can be applied to both marked and unmarked deer. We thus fitted a set of 10 ZPNE models of increasing complexity using sex as a binary covariate (males = 1, females = 0) and survey area as an ordinal covariate (D = 1, A = 2, C = 3, B = 4, following increasing values of ratio between open and closed habitats) to model the resighting probability α of marked individuals. For the simplest model we assumed constant resighting probability α, while for the full model we modelled resighting probability α as a function of year plus the interaction between sex and survey area. For all models, we allowed for annual variations in parameter U (number of unmarked individuals in the population over primary occasions), while individual heterogeneity σ during primary occasion j, survival probability φ and transition probabilities γ′ and γ″ between primary occasions j and j + 1 were kept constant. Models were ranked according to their values of Akaike information criterion corrected for small samples (AICc) and a cut-off value of ΔAICc < 4 was adopted to select competing models (Burnham and Anderson 2002). Mark-resight analysis was performed with the software MARK (White and Burnham 1999) building ZPNE models using the package RMark (Laake 2013) with R ver. 3.1.3 (<www.r-project.org>) in R Studio 0.99.446 (RStudio 2015).

Reliability of spotlight-based indices
We used the point estimates obtained with mark-resight (MR) to investigate the performance of indices derived from spring spotlighting. Because several indices of relative abundance can be extracted from spotlight counts (Aubry et al. 2012), we first need to define them explicitly. As the number of km driven per count was kept constant over the years, counts were always conducted with good visibility conditions and because of potential inconsistencies in the definition of 'group' (e.g. different operators may have recorded differently the number of groups when deer were unevenly dispersed on a meadow), we chose not to use the kilometric indices of Garel et al. (2010) corrected for transect length and visibility conditions. Instead, we used the maximum number of deer (marked and unmarked) counted during one of the four secondary occasions within each year (MNC) and the average number of deer counted (marked and unmarked) during all the four secondary occasions within each year (ANC). Provided counts are carried out consistently over time, we believe that MNC and ANC represent fairly straightforward and intuitive indices for coarse-scale management programs.
To answer question 1), i.e. to check for trend similarities between MNC or ANC and MR estimates, following Loison et al. (2006) and Garel et al. (2010) we first log-transformed yearly values of MNC, ANC and MR. After checking for normality of data by means of a Kolmogorov-Smirnov test, we initially ran a Pearson's correlation test between ln(MNC) and ln(MR) and between ln(ANC) and ln(MR). We then used ln(MNC) or ln(ANC) as response variables and ran two separate linear models using ln(MR) as a predictor, to test the significance of the regression slopes. To answer question 2), i.e. if abundance indices and mark-resight estimates change at the same rate over time (slopes < 1, for example, suggest a saturation effect: Caughley 1977), we used the Z-test to check if beta values of the linear models were significantly different from 1. To answer question 3), i.e. to investigate the minimum number of counts needed every spring to reliably track population size variations, we repeated the same analyses (except for the Pearson's correlation test) using different values of ln(MNC) and ln(ANC) calculated with 10 different subsets of secondary sampling occasions (i.e. four combinations of three secondary occasions and six combinations of two secondary occasions), obtained by consistently dropping the same sampling occasion(s), i.e. occ1, occ2, occ3 or occ4, over different years. For all the fitted linear models, the Kolmogorov-Smirnov normality test on the residuals was used to check model fitting. Significance level was set to p = 0.05. All analyses were performed using R ver. 3.1.3 in R Studio 0.99.446.

Results
Table 1 reports the results of the model selection conducted using the zero-truncated Poisson log-normal estimator: only one of the 10 fitted models, with resighting probability α as a function of the interaction between sex and survey area, had ΔAICc < 4 and was thus retained as the best model. The estimated mean resighting probability α for this model was 0.51. Annual estimates of absolute abundance are reported in Fig. 3, together with the maximum and average number of deer counted during the four secondary sampling occasions. Table 2 shows a summary of the results obtained during spotlight counts. The ANC obtained using all the four secondary sampling occasions proved a fairly precise index, with a median CV of 10%, ranging between 6% and 14%.
The correlation between ln(MNC) and ln(MR) proved positive and highly significant (r = 0.89; p = 0.003). The linear model confirmed the strong positive relationship between the two variables (β = 1.123; SE = 0.234; t-value = 4.792; p = 0.003) (Fig. 4a) and the two-sided Z-test proved that the slope of the regression was not different to 1 (p = 0.600); the Kolmogorov-Smirnov test on the residuals had p = 0.528, thus confirming the goodness-of-fit of the model. Similarly, the correlation between ln(ANC) and ln(MR) was positive and highly significant (r = 0.95; p < 0.001) and the linear model showed a strong positive relationship between the two variables (β = 1.048; SE = 0.147; t-value = 7.131; p < 0.001) (Fig. 4b). The Kolmogorov-Smirnov test on the residuals suggested adequate model fitting (p = 0.071; although the p-value is close to the significance level, linear regression is fairly robust to slight deviations from normality), and the slope of the regression was not different to 1 (two-sided Z-test: p = 0.744).

Table 1. Results of the model selection on the 10 mark-resight models fitted to investigate population size of red deer in the study site Valfurva, within the Stelvio National Park, using spring spotlight data between 2008 and 2015. For each model, the table reports values of Akaike's information criterion corrected for small sample size (AICc), differences in AICc (ΔAICc) between each model and the model with the lowest AICc, the Akaike's weights (Weight) and number of parameters (Num. par.). Selected models are shown in bold. For each model, '+' and '×' indicate additive and interactive effects, respectively.
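The validation of an index against mark-resight estimates (a linear model of the log-transformed index on the log-transformed MR estimate, followed by a two-sided Z-test of the slope against 1, repeated over subsets of secondary occasions) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions and not the R code used for the analyses: the function names are introduced here, the eight-year series is hypothetical, and only the two slope and standard-error pairs fed to `slope_vs_one_p` are the values reported in the Results.

```python
from itertools import combinations
from math import log, sqrt
from statistics import NormalDist, mean


def occasion_subsets(n_occasions=4):
    """Subsets of three and of two secondary occasions (4 + 6 = 10 for four)."""
    occasions = range(1, n_occasions + 1)
    return list(combinations(occasions, 3)) + list(combinations(occasions, 2))


def slope_vs_one_p(slope, se):
    """Two-sided Z-test p-value for the null hypothesis slope = 1."""
    z = (slope - 1.0) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def loglog_fit(index_values, mr_estimates):
    """Least-squares fit of ln(index) on ln(MR); returns (slope, SE of slope)."""
    if len(index_values) != len(mr_estimates) or len(index_values) < 3:
        raise ValueError("need two series of equal length with >= 3 years")
    x = [log(v) for v in mr_estimates]
    y = [log(v) for v in index_values]
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)
    return slope, se


# Slopes and standard errors reported in the Results for MNC and ANC.
p_mnc = slope_vs_one_p(1.123, 0.234)
p_anc = slope_vs_one_p(1.048, 0.147)

# Hypothetical eight-year series (not the study data).
mr_hypothetical = [1000, 1100, 1250, 1400, 1300, 1500, 1650, 1800]
anc_hypothetical = [520, 540, 640, 690, 660, 760, 810, 910]
slope, se = loglog_fit(anc_hypothetical, mr_hypothetical)
p_slope = slope_vs_one_p(slope, se)

subsets = occasion_subsets()
print(f"Z-test p (reported MNC slope): {p_mnc:.3f}")
print(f"Z-test p (reported ANC slope): {p_anc:.3f}")
print(f"hypothetical fit: slope = {slope:.3f}, SE = {se:.3f}, p = {p_slope:.3f}")
print(f"{len(subsets)} subsets of occasions: {subsets}")
```

With the rounded slopes and standard errors, the Z-test gives p-values close to the reported 0.600 and 0.744. A slope that is not distinguishable from 1 on the log-log scale means that a given proportional change in abundance is matched by the same proportional change in the index.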

All the linear models relating ln(MR) estimates and ln(MNC) or ln(ANC) values obtained using different combinations of three secondary sampling occasions showed strong positive relationships between the variables, and slopes not significantly different from 1 (Table 3). In contrast, at least one of the ln(MNC) and ln(ANC) values obtained using two secondary sampling occasions did not show a significant relationship with ln(MR) estimates (Table 3).

Discussion
Our results suggest that both the maximum and the average number of deer counted during spring spotlight surveys can be considered as reliable indices to detect temporal variations in abundance in our study population, provided that at least three repeated counts are carried out in each annual survey. Furthermore, the indices derived from spotlight counts may be used to estimate population growth rates and thus investigate the responsiveness of the study population to variations of environmental factors.
When adopting probabilistic methods to account for non-perfect detection, it is essential to carefully evaluate the fulfilment of the assumptions underlying the estimator used. We believe that the closure assumption within each year was met in our study, as each primary sampling occasion j was conducted over a relatively short timeframe (about two weeks) during which events of mortality, immigration or emigration could be considered negligible. The absence of recorded mortality events in marked individuals and the spatial data shown in Fig. 2 support this assumption. Likewise, we believe that the short timeframe and the use of collars made it possible to avoid mark loss within primary occasions. The presence of reflectors on collars and ear tags made it possible to avoid errors in distinguishing between marked and unmarked individuals after the animals were spotted. At the same time, we believe that the use of reflectors did not alter the sightability of marked individuals after capture: typically, during spotlight counts, deer were first detected thanks to the eyeshine effect of the tapetum lucidum, while the presence of reflectors was detected only at a later point in time (normally a few seconds). The unaltered chance of being sighted after marking and the even distribution of capturing effort between the two sexes and among survey areas support the assumption that the sample of marks was representative of the study population. Red deer show a tendency to form large groups in springtime, hence the assumption of independently and identically distributed resighting probabilities was likely violated in space, possibly leading to a contagion amongst resightings and, in turn, to narrower confidence intervals in the estimates (Fattorini et al. 2007). Nonetheless, in this study we were primarily interested in the over-time variation in size estimates, and as long as the bias induced by non-independence of resightings can be considered consistent over time, this should not represent a major issue in the investigation of the robustness of indices derived from spotlight surveys in detecting the temporal trend of population size. Finally, the choice of a sampling design that provides the highest resighting probability is desirable in capture-recapture studies (Lindberg 2012), and we are confident that conducting spotlight counts during the green-up period ensured the maximization of encounter rates of red deer in the study site.
Our results are in sharp contrast with those of Collier et al. (2013), who found spotlight counts an unreliable alternative to monitor over-time variations in population size of white-tailed deer. On the contrary, we support the suggestion of Garel et al. (2010) that spotlighting may be a valuable method to detect the population trend of red deer. Because spotlight counts are carried out along roads, Collier et al. (2013) suggested that the performance of their surveys was hampered by the lack of a random placement of transects, which eventually led to low and highly variable detection probabilities. Indeed, animals' response to roads is widely recognized (Yost and Wright 2001, Marsh and Beckman 2004) and it may cause bias in the estimation of animal abundance (Marques et al. 2010). Although in principle our sampling design violated the assumption of random distribution of transects, we suggest that landscape features, especially in terms of distribution of food resources, may largely impact on this assumption and thus on the possibility to use spotlight counts to provide reliable indices of relative abundance. While Collier et al. (2013) conducted their investigation in plain, highly forested (ca 93%) coastal areas, our study was carried out in a highly heterogeneous mountainous landscape. In temperate mountainous landscapes the vegetation green-up largely depends upon the interaction between elevation and season (Pettorelli et al. 2007) and attractive food resources in springtime are typically constrained to open areas at low elevations. This patchy distribution tends to exert a spatial forcing on red deer, whose spring movements are unconstrained by the presence of roads (the peak of deer-vehicle accidents observed in this season supports our suggestion, cf. Steiner et al. 2014), eventually removing the negative bias induced by the opportunistic placement of transects and thus leading to increased and fairly consistent detection probabilities during roadside counts. In fact, the mean resighting probability in our population was higher than that found by Collier et al. (2013) (0.51 versus 0.41) and the model selection procedure suggested constant detection probabilities over time. The role of landscape heterogeneity appears supported by the similar results obtained by Garel et al. (2010), who also conducted their surveys in a mountainous landscape, albeit at much lower elevations.
From the management standpoint we suggest that, for red deer populations living in similar habitats, both the maximum and the average number of deer counted during at least three occasions of spring spotlight counts can be used as reliable indices to track changes in deer abundance. Our results are not surprising: recently, Ahrestani et al. (2013) showed that observation errors in Cervus are generally lower than in other ungulates, and that repeated ground-based counts usually have the lowest process error. However, because high and fairly consistent detection probabilities are crucial for the reliability of these population indices (cf. Collier et al. 2013), wildlife managers are expected to appropriately schedule spotlight counts following the start of the green-up period and complete multiple occasions of spotlighting in a short timeframe (e.g. two weeks) to ensure high detection probabilities and meet the closure assumption. Further, the sampling protocol should be kept as consistent as possible in terms of sampled routes, operators and visibility conditions to favour the assumption of constant detection probability, although in practice this rarely occurs (Pollock et al. 2002); the lack of standardization may be controlled for with the inclusion of covariates in the analysis (Garel et al. 2010). In addition, the size and boundaries of survey areas ought to be carefully evaluated based on the knowledge of the seasonal distribution of deer at the population level.
Our results suggest that the maximum and the average number of deer counted during spring spotlight surveys can also be used as proxies to model temporal variations in population growth rates, thus allowing the investigation of deer population dynamics in similar habitats. The lack of saturation effect, compared with what was found by Garel et al. (2010), might be due to higher detectability rates in our study site, possibly owing to reduced forest cover. While both the MNC and the ANC based on three or four secondary sampling occasions showed significant positive relationships with the mark-resight estimates, the standard errors in the linear models suggested consistently higher precision for the beta values using ANC, likely because average values are less subject to sampling variability compared to maximum values. Furthermore, the beta estimates for MNC and ANC obtained with three secondary sampling occasions showed larger standard errors and greater saturation effect, compared to the indices obtained with four occasions. In turn we suggest that the use of ANC derived from four sessions of spring spotlight counts is better suited to investigate the dynamics of red deer populations inhabiting similar habitats.
The reliability of population indices to monitor trends in abundance has been widely debated (Anderson 2001, Engeman 2003), and spotlight-derived indices are no exception (Collier et al. 2007, 2013). When the basic assumption of constant detection probability is satisfied, roadside spotlight counts may provide appropriate indices for the study and management of wildlife populations. While this assumption may be difficult to meet in reality, our data suggest that carrying out standardized spotlight counts in mountainous landscapes in appropriate periods may provide valuable tools for the management of mountain-dwelling red deer. While the MNC provides indications as to the minimum number of individuals alive in the population, the ANC provides a measure of precision of the index, a highly desirable property for a monitoring tool. It should be noted, however, that while four spotlight occasions proved a valid sampling strategy in this specific case study, wildlife managers and researchers aiming to use spotlight counts to track changes in deer population size ought to make some tests to find out the optimal number of occasions specific to their study area.