How many feces should be sampled from latrines? Spatial sampling biases affecting the dietary analysis of island raccoon dogs

Biased sampling can distort the results of fecal analysis of animal food habits through pseudoreplication (repeated sampling of the same animal) and the spatial heterogeneity of food distribution, yet sampling methods are seldom discussed in such studies. We investigated the effects of sample size, collection site and the surrounding environment on fecal analysis using the point frame (%PF) and the frequency of occurrence (%FO) methods in island raccoon dogs Nyctereutes procyonoides, which are opportunistic in food habits and defecate at fixed latrines. Our analyses showed a significant bias when the sample size was <30 in %PF and <50 in %FO. When fecal sampling was restricted to the inland area, a significant bias occurred when the sample size was <50 in %PF and <70 in %FO. When sampling was restricted to a specific latrine or to the seashore, a significant bias in the dietary analysis could not be eliminated even when the sample size was artificially increased to 100. To avoid biases, spatially biased sampling that collects many feces from a specific latrine should be avoided. It seems necessary to collect ≥30 and ≥40 fecal samples in %PF and %FO, respectively, from different latrines.

Studies on food habits provide the most basic and fundamental information on various biological aspects of a target species, offering insights into community stability (Prugh 2005) and disease infection risk (Tsukada et al. 2014) and aiding the establishment of appropriate conservation strategies (Murakami 2003, Woodroffe et al. 2005, De Azevedo 2008). In dietary studies of carnivores, fecal analysis is one of the most popular methods for identifying the food items that were consumed (Putman 1984, Fukue et al. 2011, Klare et al. 2011). Fecal samples can be acquired easily and noninvasively in numbers sufficient for statistical analysis (Litvaitis 2000, Nilsen et al. 2012). However, fecal analysis can be biased by various factors. Previous studies have reported that the outputs of dietary analysis are affected by sampling biases from specific sites, such as kills (Marucco et al. 2008), pup-rearing home sites and clusters of GPS locations (Gable et al. 2017), as well as by laboratory and data processing (Corbett 1989, Reynolds and Aebischer 1991, Fedriani and Travaini 2000, Zabala and Zuberogoitia 2003, Migli et al. 2005, Klare et al. 2011, Takatsuki et al. 2015).
Sampling methods are seldom discussed in studies on fecal dietary analysis. Some studies have revealed that spatially biased sampling of feces can affect the result of dietary analysis in wolves Canis lupus (Marucco et al. 2008, Steenweg et al. 2015, Gable et al. 2017). Ideally, fecal samples should represent the entire fecal contents of the population in the study area using a randomized sampling design. Particularly in carnivore species that defecate only at latrines, such as the raccoon dog Nyctereutes procyonoides (Ikeda 1984, Yamamoto 1984) and several badger species Meles spp. (Kruuk 1978, Yamamoto 1991, Zagainova and Markov 2011), the feces sampling design is important to avoid such spatially skewed bias in the dietary analysis. However, the possibility of such sampling bias has been overlooked in previous studies on food habits among carnivore species that form latrines. In raccoon dogs, latrines consist of accumulated feces and are used communally by several individuals (Ikeda 1984, Koizumi et al. 2017). Therefore, if several latrines are found, many fecal samples can be easily acquired by collecting multiple feces from each latrine. This differs from other animal species, which defecate a single scat at each site. On the other hand, there is an increased risk of pseudoreplication, in which the feces of the same raccoon dog are collected repeatedly from the latrines that the particular individual prefers to use. One way to overcome this problem is to use the recently developed microsatellite analysis of fecal DNA for individual identification (Matsuki et al. 2006, Saito et al. 2016). However, the preservation condition of the feces varies in the field because environmental conditions such as sunshine and dryness vary, and the feces can be decomposed by rain and dung beetles. As a result, the preservation state of the DNA in the feces also varies (Masuda et al. 2009). Therefore, depending on the condition of the feces, DNA extraction may not be possible: the reported DNA extraction success rate is 87.7% at best (n = 8; Matsuki et al. 2006) and 67.3% at worst (n = 101; Saito et al. 2016). Furthermore, such analyses are laborious and costly and are currently unsuitable for processing many fecal samples.
The effect of pseudoreplication on the collection of feces from latrines is more problematic when food habits differ vastly among individual raccoon dogs or when the number of raccoon dogs sharing a latrine varies significantly among latrines. With regard to the former, individual differences in food habits are known in several carnivore species (Gese et al. 1996, Molsher et al. 2000, Estes et al. 2003, Prugh et al. 2008), although they are not known in the raccoon dog. With regard to the latter, the use of latrines by raccoon dogs varies among latrines and can be influenced by season (Ikeda 1984, Teduka and Endo 2005, Koizumi et al. 2017, Tsunoda et al. 2019) and human disturbance (Tsunoda et al. 2019). In an actual field situation, fecal analysis could therefore be influenced by the number of fecal samples collected from each latrine and by the choice of latrines from which the samples are collected. Furthermore, the sample size of the collected feces itself affects the fecal dietary analysis (Windberg and Mitchell 1990, Reynolds and Aebischer 1991, Mukherjee et al. 1994, Trites and Joy 2005), so it is necessary to consider the effects of spatial bias and sample size simultaneously to avoid pseudoreplication. In this study, we investigated the effect of fecal sample size, fecal collection sites and their contrasting surrounding environments (seashore area and inland area) on the results of fecal dietary analysis of raccoon dogs, which defecate at several latrines.

Study area
This study was conducted at Izushima, an isolated island with an area of 2.63 km², a coastline length of 14 km, and a maximum elevation of 87.6 m. The island is located at a minimum distance of 300 m off the coast of Oshika Peninsula, Miyagi Prefecture, Japan and has approximately 100 households and 164 residents. On the island, there are two harbors where the fishing of sea urchins, inshore fishes and octopuses as well as the cultivation of sea squirts, scallops, oysters and salmon is conducted. The dominant vegetation of the island includes Pinus densiflora, Quercus serrata, Castanea crenata, Cryptomeria japonica and Chamaecyparis obtusa plantations, together with Euonymus japonicus and Pittosporum tobira in the coastal area and Illicium anisatum and Abies firma in the northern and shaded areas. Additionally, there are areas of Machilus thunbergii vegetation and logging areas as well as Camellia japonica and Eurya japonica growing on the ruins of residential buildings. Despite the small area, a relatively large number of carnivores inhabit this island: raccoon dogs, red foxes Vulpes vulpes, Japanese martens Martes melampus, Japanese weasels Mustela itatsi, masked palm civets Paguma larvata and feral cats Felis catus (Ono 1992, Tsukada et al. unpubl.). In addition, small mammals such as the Japanese squirrel Sciurus lis and the large Japanese field mouse Apodemus speciosus, various birds such as Hypsipetes amaurotis, Syrmaticus soemmerringii and Falco peregrinus, reptiles, amphibians and insects such as Coleoptera, Orthoptera, Hemiptera and Hymenoptera inhabit the island.

Fecal sampling
Fecal samples of raccoon dogs were collected during late April and early May in 2018 from 43 of the 62 latrines that had been found previously. Samples were collected only from latrines containing new scats, and each new scat was collected separately and regarded as one fecal sample. If there were many new scats in one latrine, as many fecal samples as possible were collected. The collected samples were placed in 50-ml plastic bottles or resealable plastic bags and stored at −20°C for dietary analysis.

Dietary analysis
Fecal analysis was conducted using the point frame (PF) method (Takatsuki et al. 2007). The frozen fecal samples were thawed and washed with tap water on a 0.5-mm mesh sieve, and the undigested food items remaining on the sieve were stored in 70% ethanol and subjected to the following analysis. Following the method of Takatsuki et al. (2007), water was placed on a glass slide with a 20 × 5-mm metal frame and 1-mm grid, the residues were spread on the slide, and the number of grid intersection points covered by the material was counted under a microscope up to a total of 200 points for each sample. When there were undigested items larger than a glass slide, such as seeds and artifacts, the following procedure was performed. The items were spread on a petri dish with 5-mm-spaced grids, and the number of intersection points on the petri dish covered by the items was counted as a 'pre-count'. Then, the rest of the undigested items were analyzed on the glass slide in the same manner as samples without a pre-count. The undigested items were classified first into 35 small food categories and were then summarized into seven large food categories (Supplementary material Appendix 1 SD 1). To compare the food composition, the frequency of occurrence (%FO) and the proportion of the PF value (%PF) of each item were calculated. %PF was calculated using the following formula: %PF = Σ PF_i / n, where PF_i is the PF value of the food item i divided by 200 and n is the number of fecal samples examined. When pre-counting was performed, the following correction equation was used: PF_j′ = PC_j / (PC_j + PC_z) + (PF_j / 200) × PC_z / (PC_j + PC_z), where PC_j is the pre-counting value of the food item j and PC_z is the pre-counting value of the rest of the undigested items excluding j.
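The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary layout and the example counts are hypothetical, and expressing %PF on a 0-100 scale (multiplying the mean proportion by 100) is an assumption.

```python
from collections import defaultdict

POINTS_PER_SAMPLE = 200  # grid points counted per fecal sample (from the text)


def corrected_pf(pf_points, pc_item, pc_rest):
    """Pre-count correction: PC_j/(PC_j+PC_z) + (PF_j/200) * PC_z/(PC_j+PC_z)."""
    total = pc_item + pc_rest
    return pc_item / total + (pf_points / POINTS_PER_SAMPLE) * pc_rest / total


def summarize(samples):
    """Return ({item: %PF}, {item: %FO}) for a list of {item: grid points} dicts."""
    n = len(samples)
    pf_sum = defaultdict(float)
    occurrences = defaultdict(int)
    for sample in samples:
        for item, points in sample.items():
            if points > 0:
                pf_sum[item] += points / POINTS_PER_SAMPLE
                occurrences[item] += 1
    # Assumption: both indices are reported as percentages of the n samples.
    pct_pf = {item: 100 * total / n for item, total in pf_sum.items()}
    pct_fo = {item: 100 * count / n for item, count in occurrences.items()}
    return pct_pf, pct_fo


# Hypothetical counts for three scats (not data from the study).
example = [
    {"insects": 120, "plant matter": 80},
    {"insects": 50, "plant matter": 100, "marine organisms": 50},
    {"plant matter": 200},
]
pct_pf, pct_fo = summarize(example)
```

In this toy example plant matter occurs in every scat, so its %FO is 100, while its %PF is lower because other items share the grid points within each scat.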

Estimating sampling method
To assess the effect of three fecal sampling methods on the results of the dietary analysis, three types of artificial datasets with different fecal sampling strategies were prepared. All collected fecal samples were assumed to reflect the food habits of raccoon dogs in the study area and were regarded as the standard dataset, and the result of the fecal analysis of the standard dataset was used as the benchmark. The occupancy rate (%PF) and the %FO of each food category were calculated. Then, artificial datasets were created in the following three ways and compared to the standard dataset.
The first sampling strategy was the collection of samples of various sizes by randomized sampling from the whole set of fecal samples. Trites and Joy (2005) examined the effect of sample size on fecal dietary analysis using Monte Carlo simulation analysis and recommended the collection of >94 samples to compare diets with moderate statistical power over time or between areas. To assess the effect of sample size on the results of the fecal analysis, the results from sub-datasets with various sample sizes were compared to that of the standard dataset.
The second sampling strategy was collection of fecal samples from spatially skewed locations where specific food resources are localized. If sampling is performed without considering the distribution of food resources, such spatially biased samples can skew the result of dietary analysis. As the present study area is an island, maritime food resources are available along the seashore. Therefore, differences in available food resources between the seashore and the inland area can be compared.
The third sampling strategy was collection of fecal samples from specific latrines. The distribution of the home range of raccoon dogs is relatively stable, mutually overlapping between individuals, although the size varies depending on the environment and the season (Koizumi et al. 2017). A raccoon dog has several latrines shared with other individuals, and several scats accumulate in each latrine (Ikeda 1984). We created a sub-dataset in which only one scat was collected from each latrine and other sub-datasets in which fecal samples with various sizes were resampled from each specific latrine. The results of the fecal dietary analysis of these subdatasets were compared with that of the standard dataset.
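The two latrine-based sub-datasets described above can be sketched as follows. This is an illustrative Python sketch rather than the authors' procedure: the function names, latrine IDs and scat labels are hypothetical, and drawing with replacement from a single latrine is an assumption made so that a sub-dataset can be larger than the number of scats in that latrine.

```python
import random


def one_scat_per_latrine(scats_by_latrine, rng):
    """Build a sub-dataset with exactly one randomly chosen scat from every latrine."""
    return [rng.choice(scats) for scats in scats_by_latrine.values()]


def resample_from_latrine(scats_by_latrine, latrine_id, size, rng):
    """Draw `size` scats from a single latrine, with replacement (an assumption)."""
    return rng.choices(scats_by_latrine[latrine_id], k=size)


# Hypothetical latrine IDs and scat labels, for illustration only.
scats_by_latrine = {
    "A": ["A-1", "A-2", "A-3", "A-4", "A-5", "A-6"],
    "B": ["B-1", "B-2"],
    "C": ["C-1"],
}
rng = random.Random(42)
spread = one_scat_per_latrine(scats_by_latrine, rng)
clumped = resample_from_latrine(scats_by_latrine, "A", 100, rng)
```

The first sub-dataset spreads the sampling effort across latrines, whereas the second repeats the same few scats many times, which is the situation in which pseudoreplication is expected to matter most.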
The details of creating and analyzing the artificial sub-datasets by the different sampling methods are as follows.
Sub-dataset with various sample sizes. Resampling without replacement was performed 100 000 times from the standard data using the function 'sample_n' of the package 'dplyr' of R (<www.r-project.org>). A random number from 1 to 100 was generated, and a total of 100 datasets of each specific sample size (2-10, 15, 20, 30, 40, 50, 60, 70, 80, 90 and 100) were extracted from the 100 000 sub-datasets, starting at the position given by the random number. These analyses were performed using MS Excel. The occupancy rate (%PF) of the seven dietary categories by the PF method and the frequency of occurrence (%FO) of these dietary categories were calculated for each of the 100 sub-datasets of each sample size, and a χ²-test against the standard dataset was performed at the 5% significance level. The rate of statistical significance (the number of statistically significant cases out of 100 sub-datasets) was calculated for each sample size and defined as the detection rate of the statistical significance. These processes were repeated 50 times, and the average value of the detection rate of the statistical significance and its 97.5 and 2.5 percentile values were calculated. The minimum sample size required to avoid the risk of an obvious error was estimated as the sample size at which the 97.5 percentile value no longer exceeded 5%.
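The resampling procedure for the detection rate can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' R and MS Excel workflow: sub-datasets are drawn directly rather than extracted from a pool of 100 000, the χ² statistic is computed on the per-category percentage profile against the standard dataset (the exact layout of the authors' test is not given in the text), and the critical value 12.592 corresponds to 6 degrees of freedom for seven categories. All function names and the synthetic data are hypothetical.

```python
import math
import random

# 5% critical value of the chi-square distribution with 6 degrees of freedom
# (seven food categories minus one).
CHI2_CRIT_DF6 = 12.592


def mean_profile(samples, categories):
    """Mean percentage per category; each sample maps category -> proportion (0-1)."""
    n = len(samples)
    return [100 * sum(s.get(c, 0.0) for s in samples) / n for c in categories]


def chi_square_stat(observed, expected):
    """Pearson chi-square statistic, skipping categories with zero expectation."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def detection_rate(standard, categories, size, n_subsets=100, rng=None):
    """Fraction of sub-datasets, drawn without replacement, that differ at the 5% level."""
    rng = rng or random.Random()
    expected = mean_profile(standard, categories)
    hits = 0
    for _ in range(n_subsets):
        subset = rng.sample(standard, size)
        if chi_square_stat(mean_profile(subset, categories), expected) > CHI2_CRIT_DF6:
            hits += 1
    return hits / n_subsets


def percentile(values, q):
    """Nearest-rank percentile of a list, with q given in percent."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_detection(standard, categories, size, repeats=50, seed=0):
    """Mean, 2.5 and 97.5 percentile of the detection rate over repeated runs."""
    rng = random.Random(seed)
    rates = [detection_rate(standard, categories, size, rng=rng) for _ in range(repeats)]
    return sum(rates) / repeats, percentile(rates, 2.5), percentile(rates, 97.5)


# Synthetic standard dataset of 128 samples with seven unnamed categories.
CATEGORIES = [f"category_{i}" for i in range(1, 8)]
demo_rng = random.Random(1)
standard = []
for _ in range(128):
    weights = [demo_rng.random() for _ in CATEGORIES]
    total = sum(weights)
    standard.append({c: w / total for c, w in zip(CATEGORIES, weights)})

mean_rate, p2_5, p97_5 = summarize_detection(standard, CATEGORIES, size=30, repeats=5)
```

Under the criterion used in the text, a sample size would be judged sufficient when the 97.5 percentile of the detection rate is below 0.05.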
Sub-datasets of latrines along the seashore and in inland areas. Using ArcGIS ver. 10.6.1, a 100-m-wide buffer strip along the seashore was generated, and further buffer strips were created up to 300 m from the seashore. Based on these buffers, two divisions were defined, 1) within 100 m from the seashore and 2) ≥300 m inland from the seashore, and the latrines belonging to divisions 1) and 2) were regarded as 'seashore' and 'inland', respectively (Fig. 1). In the same manner as for the sub-datasets with various sample sizes, 100 sub-datasets of different sample sizes were created, and the average value and the 97.5 and 2.5 percentile values of the detection rate of the statistical significance against the standard dataset were calculated.
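The division rule above can be written as a small Python sketch. The authors derived the divisions from ArcGIS buffers; the function below is only an illustration that assumes a precomputed distance from each latrine to the shoreline, and its name and argument are hypothetical.

```python
def classify_latrine(distance_to_shore_m):
    """Return 'seashore', 'inland', or None for the intermediate zone."""
    if distance_to_shore_m <= 100:
        return "seashore"
    if distance_to_shore_m >= 300:
        return "inland"
    # Latrines between 100 m and 300 m belong to neither sub-dataset.
    return None
```

The sketch makes explicit that latrines in the intermediate zone are excluded from both sub-datasets, so the 'seashore' and 'inland' samples together need not add up to the standard dataset.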
Sub-datasets of specific latrines and of all latrines. To assess the impact of biased sampling from specific latrines, two kinds of sub-datasets were prepared: 1) a sub-dataset consisting of one random fecal sample from each latrine, and 2) five sub-datasets, one for each of the top five latrines with the most fecal samples (all fresh fecal samples were collected, resulting in 6-8 samples in each latrine), in which fecal samples were resampled from that latrine alone. In the same manner as for the sub-datasets with various sample sizes, 100 sub-datasets of different sample sizes were created, and the average value and the 97.5 and 2.5 percentile values of the detection rate of the statistical significance against the standard dataset were calculated. The locations of the top five latrines are shown in Fig. 1. The distances between these latrines calculated using ArcGIS are shown in Table 1. The average home range of raccoon dogs in Japan has been estimated at 10-610 ha by the minimum convex polygon (MCP) method (Mitsuhashi et al. 2018) and at 2.79 ha (up to 4.3 ha) by the bait-marking method on Takashima Islet (18.7 ha in size), Kyushu (Ikeda et al. 1979). Kubo et al. (2019) analyzed the microsatellite regions of the fecal DNA of the raccoon dog and estimated that there were 35 individuals and 9 pairs of raccoon dogs inhabiting the study area. Based on this estimation, assuming that the average home range of a pair of raccoon dogs is circular with an area of 29.6 ha, its diameter is about 615.7 m, and the average distance between the five latrines (755.4 ± 317.3 m SD) was similar to this distance. Therefore, although the home ranges of raccoon dogs have not yet been surveyed in the study area, it is likely that these five latrines were used by different pairs or families.
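The distance summary quoted above can be checked from the ten pairwise distances listed in Table 1, and the diameter follows from the area of a circle, d = 2√(A/π). The short Python sketch below is illustrative only; the variable names are hypothetical, and treating the reported SD as a sample standard deviation is an assumption.

```python
import math
import statistics

# Pairwise distances (m) between the top five latrines, as listed in Table 1.
distances_m = [865, 554, 649, 1418, 550, 745, 690, 216, 902, 965]

mean_distance = statistics.mean(distances_m)  # 755.4
sd_distance = statistics.stdev(distances_m)   # sample SD, about 317.3

# Diameter of a circle whose area equals the assumed 29.6 ha home range.
area_m2 = 29.6 * 10_000
diameter_m = 2 * math.sqrt(area_m2 / math.pi)
```

With these inputs the computed diameter is roughly 614 m, close to the value reported in the text; the small difference may reflect rounding of the area.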

Statistical analyses
To compare the results of the fecal analysis of the sub-datasets created by the three sampling methods to that of the standard dataset, a multiple comparison test by the Steel method was performed on the %PF of each food item (Aoki 2004), and Fisher's exact test was performed on the %FO of each food item. All these statistical analyses were conducted using R ver. 3.4.3 (<www.r-project.org>).
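As an illustration of the %FO comparison, a two-sided Fisher's exact test for a 2 × 2 table (occurrence versus non-occurrence of a food item in two sets of feces) can be sketched in plain Python. The authors ran their tests in R; the function name and table layout here are hypothetical, and the Steel method used for %PF is not reproduced.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x occurrences in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Example: a food item found in 8 of 10 feces in one set and 1 of 6 in another
# (hypothetical counts, not data from the study).
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

A p-value above 0.05 would be read, as in the text, as no significant difference in %FO between a sub-dataset and the standard dataset.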

Results
A total of 128 feces of raccoon dogs were collected from 43 latrines as the standard dataset (Table 2). The results of the fecal analysis of the standard dataset calculated by %PF and %FO of each food item are shown in Table 3. In %PF, insects accounted for the highest proportion followed by plant matter, others, marine organisms and fruits. In %FO, plant matter appeared in all samples and showed the highest value followed by insects, others, fruits, marine organisms and birds. The average value of the detection rate of the statistical significance and its 97.5 and 2.5 percentile values were calculated with various sample sizes (2-10, 15, 20, 30, 40, 50, 60, 70, 80, 90 and 100) and are shown in Fig. 2. When the sample size of the sub-dataset was ≥30 in %PF and ≥50 in %FO, respectively, the 97.5 percentile value of the detection rate of the statistically significant difference was <5%.

Figure 1. The study island is located at a minimum of 300 m off the coast of Oshika Peninsula, Miyagi Prefecture, Japan. Dark and light shaded areas indicate the seashore area with a 100-m wide inner strip along the seashore line and the inland area with >300 m apart from the seashore, respectively. Small black filled dots indicate the latrines of raccoon dogs. Circled latrines indicate the top five latrines, including the maximum number of fecal samples collected (all fresh fecal samples were collected, with 6-8 in each latrine). Each number indicates latrine no. in Table 2.

Table 1. The distance (m) between top 5 latrines with many fecal samples per latrine in Izushima island, Miyagi Prefecture, Japan.
              L7     L21    L40    L64    LDousoshin
L7            -      865    554    649    1418
L21           -      -      550    745    690
L40           -      -      -      216    902
L64           -      -      -      -      965
LDousoshin    -      -      -      -      -

The sub-datasets 'seashore' and 'inland' included 24 fecal samples from 9 latrines and 25 fecal samples from 12 latrines, respectively. The %PF and %FO of each food item in each sub-dataset are shown in Table 4. In the 'seashore' sub-dataset, both the %PF (15.8%) and the %FO (79.2%) of marine organisms were higher than those in the 'inland' sub-dataset (%PF = 10.0% and %FO = 60.0%), whereas the %PF of plants was lower in the 'seashore' sub-dataset (18.8%) than in the 'inland' sub-dataset (23.4%). However, the %PF of marine organisms and other food items in each sub-dataset was not significantly different from that in the standard dataset by multiple comparison test (Steel method, p > 0.05). Furthermore, the %FO of all food items in each sub-dataset was also not significantly different from that in the standard dataset (Fisher's exact test, p > 0.05). The average value of the detection rate of the statistical significance and its 97.5 and 2.5 percentile values in each sub-dataset were calculated with various sample sizes (2-10, 15, 20, 30, 40, 50, 60, 70, 80, 90 and 100) and are shown in Fig. 3A-B. In the 'seashore' sub-dataset, both the 97.5 and 2.5 percentile values of the detection rate of the statistically significant difference were always ≥5% in both %PF and %FO even when the sample size of the sub-dataset was increased to 100 (Fig. 3A). In the 'inland' sub-dataset, the 97.5 percentile value of the detection rate of the statistically significant difference was <5% when the sample size of the sub-dataset was ≥50 and ≥70 in %PF and %FO, respectively (Fig. 3B).
Figure 2. Average detection rate of the statistical significance in the fecal dietary analysis in %PF and %FO between the sub-dataset and the standard dataset and its 97.5 and 2.5 percentile values with various sample sizes of sub-datasets. The x- and y-axes show the sample size of the sub-dataset and the average detection rate of the statistical significance between the sub-dataset and standard dataset, respectively. Error bars indicate the 97.5 and 2.5 percentile values of the sub-dataset. The arrow indicates the detection rate of the statistical significance in %PF and %FO, respectively, between the sub-dataset and the standard dataset <5%.

The %PF and %FO of each food item of a sub-dataset of one random fecal sample from all latrines and five sub-datasets from each top five latrines with many fecal samples per latrine are shown in Table 5. Although the %PF and %FO of feces of marine organisms and birds varied among the sub-datasets, no statistically significant difference was found in the %PF between these sub-datasets and the standard dataset (multiple comparison test by the Steel method, p > 0.05). Furthermore, %FO in each food item was not significantly different between these sub-datasets and the standard dataset (Fisher's exact test, p > 0.05). The average value of the detection rate of the statistical significance and its 97.5 and 2.5 percentile values in each sub-dataset were calculated with various sample sizes (2-10, 15, 20, 30, 40, 50, 60, 70, 80, 90 and 100) and are shown in Fig. 4A-F. In the sub-dataset of one random fecal sample from all latrines, the 97.5 percentile value of the detection rate of the statistically significant difference was <5% when the sample size of the sub-dataset was ≥30 and ≥40 in %PF and %FO, respectively (Fig. 4A).
In contrast, in four of the five sub-datasets from the top five latrines with many fecal samples per latrine, both the 97.5 and 2.5 percentile values of the detection rate of the statistically significant difference were always ≥5% even when the sample size was increased to 100 (Fig. 4B-E). In the remaining sub-dataset from the top five latrines, the 97.5 percentile value of the detection rate of the statistically significant difference was <5% when the sample size of the sub-dataset was ≥20 (Fig. 4F).

Discussion
The comparison of sub-datasets created by different sampling strategies revealed that the bias of the fecal dietary analysis relative to the standard dataset was the largest when the feces were sampled only from a specific latrine. In four of the top five latrines with many fecal samples per latrine, the statistically significant bias of the fecal dietary analysis could not be removed even when the sample size was artificially increased up to 100, suggesting that spatially skewed sampling has a strong influence on the fecal analysis. Moreover, in the case of fecal sampling from a specific latrine, individual differences in the food habits of raccoon dogs may have affected the results of the fecal analysis because several scats of a particular individual would be repeatedly sampled and regarded as independent samples. Although individual differences in food habits have not been reported in raccoon dogs to our knowledge, they are known in other carnivores, such as coyotes, red foxes and sea otters (Gese et al. 1996, Molsher et al. 2000, Estes et al. 2003, Prugh et al. 2008). In particular, Prugh et al. (2008) reported that fecal genotyping can determine individual dietary differences among coyotes, and such a technique will be useful to examine the effects of individual differences on the dietary analysis of raccoon dogs in the future. When one fecal sample was sampled from each of the latrines, a statistically significant bias was not confirmed unless the total sample size was reduced to <30 and <40 in %PF and %FO, respectively. Therefore, as a practical method to avoid sampling bias in dietary analysis, we recommend sampling several scats evenly from multiple latrines of raccoon dogs.
This study clearly shows that when fecal sampling is spatially restricted to areas where specific food resources are available, such as seashores, this spatial bias in sampling affects the results of the dietary analysis when the sample size is limited. The raccoon dog is omnivorous and opportunistic in food habits (Saeki 2008, Sutor et al. 2010, Mulder 2012). Therefore, if available food resources have a skewed spatial distribution, the food habits of each raccoon dog would also differ depending on the location of its home range. Sampling of feces near the seashore would influence the result of the dietary analysis because the sample can include many feces of raccoon dogs that consume marine organisms. In fact, the food habits of raccoon dogs near the seashore in this study were characterized by the presence of marine organisms, with 15.8% and 79.2% in %PF and %FO, respectively. Previous studies in coastal environments have also confirmed that raccoon dogs consume marine organisms (Ikeda et al. 1979, Kauhala and Ihalainen 2014). In addition, it has been reported that the diets of raccoon dogs can vary flexibly depending on the habitat conditions (Kauhala and Auniola 2001, Matsuyama et al. 2006, Sidorovich et al. 2008, Sutor et al. 2010).

Table 4. The results of the fecal dietary analysis in the sub-datasets of 'Seashore' and 'Inland' of raccoon dogs in Izushima island, Miyagi Prefecture, Japan.
In the case of sampling feces only from the inland area, a significant bias of fecal dietary analysis was found only when the sample size was <50 and <70 in %PF and %FO, respectively. These impacts of sample size on the dietary analyses were milder than when sampling was restricted to a certain latrine. In the 'inland' sub-dataset, fecal samples were collected from up to 12 different latrines containing feces excreted by several raccoon dogs, and with a sufficient sample size the results did not differ from those of the standard dataset. The latrines of raccoon dogs are shared by several individuals (Ikeda 1984). Therefore, the finding that sampling feces from the inland area, which seems to be the representative habitat in the study area, was less spatially skewed than sampling feces from a specific latrine could be ascribed to the mitigation of the bias caused by repeated counting of a particular raccoon dog's feces.
In contrast, the influence of the sample size on the result of the fecal dietary analysis was not as strong as that caused by the spatially biased sampling. In the case of feces randomly sampled, a significant bias in the results of the dietary analysis was observed only when the sample size was <30 and <50 in %PF and %FO, respectively. In addition, even when only one fecal sample was collected from each latrine, a significant bias in the result of the fecal dietary analysis was not confirmed unless the total sample size was reduced to <30 and <40 in %PF and %FO, respectively. These results suggest that the effects of spatially biased sampling on the results of the fecal dietary analysis can be largely avoided by preventing sampling from a specific latrine, but that ≥30 and ≥50 samples in %PF and %FO, respectively, are still necessary to avoid bias. Although a formal estimation of the sample size required for unbiased estimation, such as a power analysis, was not conducted in this study, poor estimates affected by these biases can be avoided by collecting ≥30 and ≥40 fecal samples from different latrines in %PF and %FO, respectively. To determine an effective sample size, Trites and Joy (2005) conducted a simulation-based analysis in Steller sea lions Eumetopias jubatus and showed that 59 fecal samples were required to identify a major food resource in ≥5% feces and an additional 94 feces were required in a more heterogeneous environment. In addition, in a study examining the food habits of leopard Panthera pardus, >80 fecal samples were required for sufficient analysis (Mukherjee et al. 1994). Furthermore, a study of coyotes showed that it is desirable to collect at least 50 fecal samples for an adequate estimation of their diets (Windberg and Mitchell 1990). Although these previous studies recommended a slightly larger sample size than the present study, their results are broadly comparable to ours given that the present study was conducted for a closed population on a small island where environmental variations are limited, in a species forming latrines that are spatially clumped but shared by several individuals, and within a short sampling season, i.e. spring. Therefore, when applying our results to regions other than islands and to multiple seasons, it would be safer to collect more samples than the recommended sample size.

Conclusion/recommendations
To perform unbiased fecal dietary analysis, spatially biased sampling should be avoided; in particular, sampling feces from a specific latrine could lead to a seriously biased fecal dietary analysis. As a practical criterion, spatially independent sampling from <10 latrines may carry a high risk of bias, whereas sampling ≥30 and ≥50 feces in %PF and %FO, respectively, from ≥12 latrines can be considered safe. It is desirable to collect ≥30 and ≥40 fecal samples from different latrines in %PF and %FO, respectively.