Ontario black bear genotypes and locations (2017-2019)
Data files
Apr 21, 2023 version files 5.35 MB
-
All_bears_genetic_profiles_March_25_2020_Oikos.csv
558.15 KB
-
bears_genotype_trap_locations_April_28_2020_Oikos.csv
4.78 MB
-
Model_Covariate_Data_Oikos.csv
7.10 KB
-
README.md
2.92 KB
Abstract
Characterizing patterns and drivers of dispersal is fundamental to our understanding of animal ecology and ultimately informing species conservation and management strategies. In this study, we used microsatellite data from 3,941 individual black bears, Ursus americanus, occupying 73 spatially distinct sampling areas across a large heterogeneous landscape to characterize dispersal via gene flow directionality. We fit spatial models to quantified gene flow to test hypotheses regarding drivers of putative dispersal patterns. Specifically, we tested the relative influence of food productivity gradients, bear density, and bear harvest on dispersal. We also evaluated differences in gene flow patterns within and between sexes to assess sex-biased dispersal. We found evidence suggestive of positive density-dependent, male-biased dispersal. Our data show evidence of a relationship between dispersal and broad food productivity gradients. Specifically, male bears displayed preferential dispersal towards mixed deciduous forests with higher food productivity relative to less productive boreal forests. Given the dense sampling scheme across a continuous population, occupying a large heterogeneous landscape, these results provide key insight as to the likely drivers of dispersal patterns in a wide-ranging mammal.
Sample Collection
Genetic samples were collected as a part of the Ontario Ministry of Natural Resources, and Forestry’s (MNRF) ongoing black bear monitoring program that monitors black bear populations across the province using non-invasive genetic sampling (Howe et al., 2013, 2022). Samples were collected from 73 arrays of approximately 40 unique sample locations (n = 2,952 total sample locations). Sample locations were selected by local MNRF staff to be representative of the landscape in the broader wildlife management unit (WMU; areas used to manage bears in Ontario; Howe et al., 2022). Individual sampling arrays were separated by a minimum of 25km. Although annual black bear ranges can vary from five to 250km2 (Tri, 2013), it was unlikely that male bears would be detected at arrays greater than 25km apart during our sampling period in May and early June. During this time, black bear ranges are typically smaller (e.g., <25km; Humm et al., 2017; Noyce & Garshelis, 2011) before they expand in the summer and fall when bears undertake seasonal food forays (Humm et al., 2017, Noyce & Garshelis, 2011; Obbard et al., 2017). Although it is possible male black bears were making greater distance movements during our sampling period, we found that <0.01% of genotypes were detected at multiple arrays. Individual sample locations were spaced roughly 1.5km apart along secondary roads (for accessibility). These locations consisted of barbed wire hair corrals (Woods et al., 1999) where corrals were constructed with a single strand of barbed wire strung around a series of trees at 50cm above the ground creating a corral (Howe et al., 2013). Wire height was considered sufficient to exclude sampling bears < 2 years of age that are mostly shorter than this height based on morphometric data from this region (MNRF unpublished data). Although it is possible a small number of bears < 2 years of age were sampled, it is unlikely these samples would influence subsequent analyses at the scale they were conducted. A tree, central within the corral (< 1.5m from the barbed wire), was used to hang 3 cans of sardines to serve as bait; traps were rebaited weekly during sample collection. Black bears step over or crawl under barbed wire, snagging hair that is subsequently used for DNA extraction and individual identification. Sampling was conducted over 5 weekly sampling occasions in late May and early June in 2017, 2018, and 2019. Hairs that were snagged on barbed wire were collected into paper envelopes and air-dried after collection. Barbs were burned using a lighter to remove any remaining hair. Samples were stored at room temperature in-lab prior to processing.
DNA Extraction and Microsatellite Amplification
DNA extraction and microsatellite amplification was performed by the MNRF Wildlife Research and Monitoring Genetics Lab. A minimum of 5 (average 20) hairs per sample were used for DNA extraction. DNA extraction followed previous, noninvasive, black bear DNA sampling projects (Howe et al., 2022; Pelletier et al., 2017; see Supporting Information). Samples were genotyped at 15 microsatellite loci and amelogenin locus (for sex determination) previously used by Pelletier et al. (2017) and Howe et al. (2022) with modifications to optimize genotyping. Two microliters (uL) of stock DNA was used to amplify all loci using the Qiagen Multiplex PCR Kit in two 12µL multiplex reactions. Cycling conditions were as follows: 95°C 15min; 30-32 cycles of 94°C for 30s, TA for 90s, 72°C for 60s; and a final extension of 60°C for 45min. Reaction-specific cycling conditions are included in Supporting Information.
Broad Scale Genetic Clustering
We first assessed sex-biased dispersal by analyzing separate datasets containing all sampled individuals, only males, and only females. We assessed broad scale genetic structure using STRUCTURE v2.3.4 (Pritchard et al., 2000) and CLUMPAK (Kopelman et al., 2015). STRUCTURE is a Bayesian genetic clustering program used to determine the most likely number of genetic clusters (K), based on provided genotype data (Pritchard et al., 2000). STRUCTURE v2.3.4 implements an aspatial method that considers admixture – one genotype existing in multiple genetic clusters. STRUCTURE then calculates membership proportions (q) of each individual genotype to each inferred cluster (Falush et al., 2003).
We ran STRUCTURE using StrAuto and visualized results using STRUCTURE HARVESTER (Chhatre & Emerson, 2017). We ran 10 iterations of STRUCTURE at each of kmax (k being a possible value of K = 1-10 (100 total runs), with a burn in of 200 000 iterations, followed by 500 000 sampled MCMC iterations. We then used the ‘Best K’ feature of CLUMPAK to estimate the most likely number of genetic clusters (K), also known as ∆K described by Evanno et al. (2005). In instances where K=2 (all datasets), we performed post-hoc hierarchical STRUCTURE and ∆K analyses to assess if additional genetic structure existed within clusters identified in K=2 models. This is considered a best practice as ∆K analyses tend to identify K=2 as the most likely number of genetic clusters, failing to capture true genetic sub-structure. (Janes et al., 2017). After the most likely number of genetic clusters (K) was determined, we proceeded with STRUCTURE results of that K value using the main CLUMPAK pipeline to visualize population structure using DISTRUCT (Rosenberg, 2004). STRUCTURE plots were organized by geographic location (southeast to northwest). We then visualized STRUCTURE results spatially in ArcMap 10.7.1 using the interpolation tool (IDW). Data used for interpolation were membership proportion values (q) for the northwest genetic cluster for respective nodes.
To account for the influence of geographic location when assessing genetic admixture levels, we used the program tess3r, within the R statistical software (R Core Team, 2022), which assumes spatially proximate individuals are more genetically similar than individuals located farther away (Chen et al. 2006). Tess3r identifies genetic discontinuities in continuous populations and allows for visualization of genetic clusters that may be overestimated when using Bayesian clustering programs such as STRUCTURE (Corander & Marttinen, 2006). Tess3r implements a new version of the program TESS (Chen et al, 2006), based on geographically constrained matrix factorization and quadratic programming techniques, and is several orders faster than the Monte-Carlo algorithms implemented in previous versions (Caye et al., 2018). We ran tess3r as a post-hoc analysis for ancestral population numbers ranging from K=1 to K=5 using their projected least squares algorithm.
Characterizing and modelling dispersal patterns
To infer dispersal direction, we calculated genetic flux – a metric developed by Draheim et al. (2016) to measure net gene flow into, or out of, a particular area; in our case, arrays of traps. As dispersal is a required process preceding gene flow, calculating gene flow directionality in the form of genetic flux is a valuable means of assessing dispersal. This metric utilizes a combination of population genetic data and network theory. Networks are structures comprised of nodes and edges. Nodes are individual elements of the network (in this case single arrays), while edges represent the relationship between nodes (Minor & Urban, 2008). The concept of using networks has been applied in various studies to demonstrate population connectivity (Dyer et al., 2010; Koen et al., 2012; Rayfield et al., 2011). In these approaches, nodes are represented by mean pairwise genetic relatedness of individuals within a node, and edges are represented by mean pairwise relatedness among nodes. Although this approach is useful to assess population connectivity, it is limited in that does not provide directionality to observed gene flow. To address this, an extension of this approach was developed by Draheim et al. (2016) to calculate genetic flux, which provides information on the directionality of gene flow between nodes. Broadly, this approach assesses genetic relatedness within, relative to among, locations to make inference on underlying patterns of dispersal that are required for successful gene flow. In general, this approach assumes that nodes with more closely related individuals have higher outflow than nodes with relatively less closely related individuals, representing the idea that high levels of dispersal from external areas will lead to reduced overall average relatedness.
We followed the general approach of Draheim et al. (2016) and defined each array of traps as a node. We selected a trap near the middle of the array (trap 20) as the node centroid (Fig. 1). We measured the net flux into, and out of, each of these nodes. Net flux was calculated using the formula:
Fnet = Fin i – Fout i (1)
Where Fin i represents the flux into node i and Fout i represents flux out of node i
Fin i = SUM(fji) (2.1)
Fout i = SUM(fij) (2.2)
Where SUM(fij) represents flux from node i to node j and SUM(fji) represents flux from node j to node i
fij = qi x pij (3.1)
fji = qi x pij (3.2)
Where pij represents dispersal probability between node i and node j, and qi represents the quality metric (i.e. an indicator of habitat quality) for node i. Mean within-node relatedness was used as the quality metric for a given node, as per Draheim et al. (2016). To calculate qi, we used the maximum likelihood estimator within the program ML-Relate (Kalinowski et al., 2006) to calculate pairwise relatedness within each node. The mean pairwise relatedness value served as a quality metric for its respective node. Probability of dispersal was calculated as
pij = exp(k x gij) (5)
where qi is the genetic distance (a metric assessing genetic differentiation) between node i and node j, calculated in R using gstudio (Dyer, 2012) and k is a distance decay coefficient that determines the steepness of the decline in the probability of dispersing between two nodes as distance between them increases (Urban & Keitt, 2001) and is calculated as
k=log(0.05)/d (6)
Where d is equal to maximum dispersal distance. Maximum dispersal distance in black bears is typically greater in northern ranges, and smaller in areas of high food productivity (Moyer et al., 2007). Moore et al. (2014) determined that black bears on the Michigan peninsula, a similar habitat to Ontario, had a maximum dispersal distance of 187.2km for females, and 251.2km for males. Based on this information, we calculated the decay coefficient (k) using four different maximum dispersal distances (d; 150km, 200km, 250km, and 300km) and carried each resulting value of k through genetic flux calculations to assess the sensitivity of our results to these distances. After net genetic flux was calculated for each node (equation 1), these values were used as data points to visualize patterns of dispersal in ArcMap 10.7.1 using the IDW (inverse distance weighting) interpolation tool.
We next sought to assess drivers of gene flow by fitting statistical models to calculated genetic flux values (response variable), treating each node’s net flux as a data point in a linear regression. We fit a multivariate linear regression model with the covariates of bear density (measured at the node level; Fig. 1), forest zone (serving as a surrogate for food availability and measured at the WMU level; Fig. 1), and harvest density (number of black bears harvested in a given WMU divided by WMU area; Fig.1) to each of the three datasets (all individuals, male individuals, female individuals). To ensure no covariates were correlated, we ran a Pearson correlation analysis on all covariates; none of which were correlated at >0.12. Bear density estimates (bears/100km2; Howe et al., 2022) were calculated by the MNRF using a spatially explicit capture recapture approach described in (Howe et al., 2013, 2022). These density estimates were calculated using the same genotype and location data as in our genetic flux analysis. Forest zone (Fig. 1) was split into four separate covariates: Mixed Deciduous, Boreal, Boreal Buffer (coniferous), and Mixed Deciduous Buffer, where buffer regions refer to the area bordering another forest (e.g., mixed deciduous buffer is the area of mixed deciduous forest adjacent to boreal forest). If a node fell within the mixed deciduous forest zone, ‘1’ was assigned to the mixed deciduous variable, while other categories were assigned ‘0’. We dropped the boreal forest covariate so that it represented the reference class. Harvest density was calculated as a function of total harvest and total land area, per wildlife management unit (WMU; Fig. 1), where the total number of bears harvested was divided by the total land area in each WMU. Bear harvest data was obtained from hunter reporting from 1998-2018. We used the annual average number of bears harvested in each WMU over these 20 years. After fitting basic linear regressions, we used the residuals to calculate Moran’s I to determine if spatial autocorrelation (the tendency of locations close to one another to have similar values) existed between nodes. Spatial autocorrelation (p<0.05) was observed for all datasets; Therefore, we refit linear regression models with spatially correlated errors using the package spaMM (Rousset, 2017) in the R statistical software. Following fitting, we recalculated Moran’s I to ensure no residual spatial autocorrelation. We calculated 95% confidence intervals around estimated coefficients and inferred significant effects when the confidence interval did not include zero.
As a post-hoc analysis, we repeated the above-mentioned model fitting procedure using harvest rate instead of harvest density. Harvest rate is a function of population size rather than land area. We used bear density estimates to estimate total population size for each WMU. In the case that there were multiple arrays in a single WMU, we used the bear density at each array to calculate the average bear density for that WMU. We then multiplied bear density and total land area to estimate a WMU’s population size. The number of bears harvested was divided by population size to get the estimated harvest rate.
References
- Caye, K., Jay, F., Michel, O., Francois, O. (2018). Fast Interface of Individual Admixture Coefficients Using Geographic Date. The Annals of Applied Statistics, 12(1), 586-608.
- Chhatre, V. E., & Emerson, K. J. (2017). StrAuto: automation and parallelization of STRUCTURE analysis. BMC Bioinformatics, 18(1), 1–5.
- Chen, C., Durand, E., Forbes, F., Francois, O. (2007). Bayesian clustering algorithms ascertaining spatial population structure: a new computer program and a comparison study. Molecular Ecology Notes, 7(5), 747-756.
- Corander, J., Marttinen, P., Siren, J., Tang, J. (2008). Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations. BMC Bioinformatics, 9(1), 1-14.
- Draheim, H. M., Moore, J. A., Etter, D., Winterstein, S. R., & Scribner, K. T. (2016). Detecting black bear source–sink dynamics using individual-based genetic graphs. Proceedings of the Royal Society B: Biological Sciences, 283(1835). https://doi.org/10.1098/rspb.2016.1002
- Dyer, R. J. (2012). The gstudio Package.
- Dyer, R. J., Nason, J. D., & Garrick, R. C. (2010). Landscape modelling of gene flow: Improved power using conditional genetic distance derived from the topology of population networks. Molecular Ecology, 19(17), 3746–3759. https://doi.org/10.1111/j.1365-294X.2010.04748.x
- Falush, D., Stephens, M., & Pritchard, J. K. (2003). Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics, 164(4), 1567–1587.
- Howe, E. J. & Obbard, M. E. (2014). Ontario Wildlife Food Survey, 2013. Wildlife Research Series, 1.
- Howe, E. J., Obbard, M. E., & Kyle, C. J. (2013). Combining data from 43 standardized surveys to estimate densities of female American black bears by spatially explicit capture–recapture. Population Ecology, 55, 595–607. https://doi.org/10.1007/s10144-013-0389-y
- Howe, E. J., Potter, D., Beauclerc, K. B., Jackson, K. E., & Northrup, J. M. (2022). Estimating animal abundance at multiple scales by spatially explicit capture-recapture. Ecological Applications.
- Humm, J. M. (2017). Spatially Explicit Population Estimates of the Florida Black Bear.
- Janes, J. K., Miller, J. M., Dupuis, J. R., Malenfant, R. M., Gorrell, J. C., Cullingham, C. I., & Andrew, R. L. (2017). The K = 2 conundrum. Molecular Ecology, 26(14), 3594–3602.
- Koen, E. L., Bowman, J., & Wilson, P. J. (2016). Node-based measures of connectivity in genetic networks. Molecular Ecology Resources, 16, 69–79.
- Kopelman, N. M., Mayzel, J., Jakobsson, M., Rosenberg, N. A., & Mayrose, I. (2015). CLUMPAK: a program for identifying clustering modes and packaging population structure inferences across K. Molecular Ecology Resources, 15(5), 1179–1191. https://doi.org/10.1111/1755-0998.12387
- Minor, E. S., & Urban, D. L. (2008). A graph-theory framework for evaluating landscape connectivity and conservation planning. Conservation Biology, 22(2), 297–307. https://doi.org/10.1111/j.1523-1739.2007.00871.x
- Moore, J. A., Draheim, H. M., Etter, D., Winterstein, S., & Scribner, K. T. (2014). Application of large-scale parentage analysis for investigating natal dispersal in highly vagile vertebrates: A case study of American black bears (Ursus americanus). PLoS ONE, 9(3). https://doi.org/10.1371/journal.pone.0091168
- Moyer, M. A., McCown, J. W., & Oli, M. K. (2007). Factors Influencing Home-Range Size of Female Florida Black Bears. Journal of Mammalogy, 88(2), 468–476.
- Noyce, K. V., & Garshelis, D. L. (2011). Seasonal migrations of black bears (Ursus americanus): causes and consequences. Behavioral ecology and sociobiology, 65, 823-835.
- Obbard, M. E., Newton, E. J., Potter, D., Orton, A., Patterson, B. R., & Steinberg, B. D. (2017). Big enough for bears? American black bears at heightened risk of mortality during seasonal forays outside Algonquin Provincial Park, Ontario. Ursus, 28(2), 182–194. https://doi.org/10.2192/ursu-d-16-00021.1
- Pelletier, A., Obbard, M. E., Harnden, M., McConnell, S., Howe, E. J., Burrows, F. G., White, B. N., & Kyle, C. J. (2017). Determining causes of genetic isolation in a large carnivore (Ursus americanus) population to direct contemporary conservation measures. PLoS ONE, 12(2), 1–23. https://doi.org/10.1371/journal.pone.0172319
- Pritchard, J. K., Stephens, M., & Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155(2), 945–959.
- R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.orgt/.
- Rayfield, B., Fortin, M.-J., & Fall, A. (2011). Connectivity for conservation: a framework to classify network measures. Ecology, 92(4), 847–858.
- Rosenberg, N. A. (2004). DISTRUCT: a program for the graphical display of population structure. Molecular Ecology Notes, 4(1), 137–138.
- Rousset, F. (2017). An introduction to the spaMM package for mixed models.
- Urban, D., Keitt, T. (2001). Landscape Connectivity: A Graph-Theoretic Perspective. Ecology, 28(5), 1205-1218.
- Kopsala, Evan; Kyle, Christopher; Howe, Eric et al. (2023). Broad‐scale genetic monitoring suggests density‐dependent dispersal in a large carnivore. Oikos. https://doi.org/10.1111/oik.09442
