Predicted distribution of a rare and understudied forest carnivore: Humboldt martens (Martes caurina humboldtensis)
Data files
Jul 13, 2021 version files 335.39 MB
Abstract
Many mammalian species have experienced range contractions. Following a reduction in distribution that has resulted in apparently small and disjunct populations, the Humboldt marten (Martes caurina humboldtensis) was recently designated as federally Threatened and state Endangered. This subspecies of Pacific marten occurring in coastal Oregon and northern California, also known as coastal martens, appear unlike martens that occur in snow-associated regions in that vegetation associations appear to differ widely between Humboldt marten populations. We expected current distributions represent realized niches, but estimating factors associated with long-term occurrence was challenging for this rare and little-known species. Here, we assessed the predicted contemporary distribution of Humboldt martens and interpret our findings as hypotheses correlated with the subspecies’ niche to inform strategic conservation actions. We modeled Humboldt marten distribution using a maximum entropy (Maxent) approach. We spatially-thinned 10,229 marten locations collected from 1996–2020 by applying a minimum distance of 500-m between locations, resulting in 384 locations used to assess correlations of marten occurrence with biotic and abiotic variables. We independently optimized the spatial scale of each variable and focused development of model variables on biotic associations (e.g., hypothesized relationships with forest conditions), given that abiotic factors such as precipitation are largely static and not alterable within a management context. Humboldt marten locations were positively associated with increased shrub cover (salal (Gautheria shallon)), mast producing trees (e.g., tanoak, Notholithocarpus densiflorus), increased pine (Pinus sp.) proportion of total basal area, annual precipitation at home-range spatial scales, low and high amounts of canopy cover and slope, and cooler August temperatures. Unlike other recent literature, we found little evidence that Humboldt martens were associated with old-growth structural indices. This case study provides an example of how limited information on rare or lesser-known species can lead to differing interpretations, emphasizing the need for study-level replication in ecology. Humboldt marten conservation would benefit from continued survey effort to clarify range extent, population sizes, and fine-scale habitat use.
Methods
Please see manuscript (PeerJ: DOI:10.7717/peerj.11670) or pre-print for details (https://www.biorxiv.org/content/10.1101/2021.02.05.429381v1?rss=1).
We used spatially-referenced Humboldt marten locations collected between 1996 and 2020 in California and Oregon. We excluded locations occurring in areas that were modified by fire or timber harvest after the location date and prior to 2016, the date represented by our vegetation data. If multiple locations occurred within a 500-m x 500-m grid cell, we spatially-thinned locations to randomly include one in each cell, attempting to achieve spatial independence for modeling (Kramer-Schadt et al., 2013). Priority for location retention from highest to lowest was: (1) rest and den locations from telemetry (Linnell et al., 2018; Delheimer et al., In press); (2) locations from scat dog detection surveys (Moriarty et al., 2018; Moriarty et al., 2019); and (3) locations from baited camera and/or track plate surveys (Slauson, Baldwin & Zielinski, 2012; Barry, 2018; Gamblin, 2019; Moriarty et al., 2019). We used presence-only data because surveys that occurred prior to 2014 were often missing detection histories from non-detection (e.g., absence) locations.
Predicted distribution
We used Maxent modeling software v3.4.1 (Phillips, Anderson & Schapire, 2006) to estimate the relative probability of Humboldt marten presence (Merow, Smith & Silander, 2013). Maxent uses a machine learning process to develop algorithms that relate environmental conditions at documented species’ presence locations to that of the surrounding background environment in which they occurred (Phillips & Dudík, 2008; Elith et al., 2011). We excluded variables with highly correlated predictors (|Pearson coefficient| > 0.6), selecting the variable that was most interpretable for managers (Table S2). During this process, we considered the variance inflation (Table S3), which allows for evaluation of correlation and multicollinearity. Variance inflation factors equal to 1 are not correlated and factors greater than 5 are highly correlated as determined by (1/(1-R i2)), where Ri2 is squared multiple correlation of the variable i (Velleman & Welsch, 1981).
Within each model iteration, we selected the bootstrap option with 10 replicates, random seed, and 500 iterations. We trained our models using a random subset of 75% of presence locations and tested these using the remaining 25% with logistic output. We used the default of 10,000 pseudo-absence background samples. We varied the response functions to include linear, product, and quadratic features. We selected the “auto features” option for all runs, which allows Maxent to further limit the subset of response features from those selected by retaining only those with some effect.
Species distribution maps were produced from all models using the maximum training sensitivity plus specificity threshold, which minimizes both false negatives and false positives. We evaluated the AUC statistic to determine model accuracy and fit to the testing data (Fielding & Bell, 1997). The AUC statistic is a measure of the model’s predictive accuracy, producing an index value from 0.5 to 1, with values close to 0.5 indicating poor discrimination and a value of 1 indicating perfect predictions (Elith et al., 2006). We assessed variables using response curves, variable contributions, and jackknife tests. We used percent contribution and permutation importance to determine importance of input variables in the final model (e.g., Halvorsen (2013). Percent contribution can be more informative with uncorrelated variables (Halvorsen 2013), while permutation importance provides better variable assessment when models and variables are correlated (Searcy & Shaffer (2016).
Because over-parameterized models tend to underestimate habitat availability when transferred to a new geography or time period, we used selection methods suggested by Warren & Seifert (2011). Maxent provides the option of reducing overfitting with a regularization multiplier that can be altered by the user to apply a penalty for each term included in the model (β regularization parameter) to prevent overcomplexity or overfitting (Merow, Smith & Silander, 2013; Morales, Fernández & Baca-González, 2017). A higher regularization multiplier will reduce the number of covariates in the model, becoming more lenient with an increased sample size (Merow, Smith & Silander, 2013). We did not include model replicates, an option in the interface, to output the required data (lambda file) and set output to logistic. We altered the Regularization Multiplier from 0.5 to 4 for each 0.5 increment (e.g., Radosavljevic & Anderson (2014).
We ranked candidate models using AIC corrected for small sample sizes (AICc; Burnham & Anderson 2002). We considered the model with the lowest AICc value to be our top model with those with ΔAICc<2 to be competitive models. For our top model, we generated predicted-to-expected (P/E) ratio curves for our model using only the testing data to evaluate its predictive performance, which was based on the shape of the curves, a continuous Boyce index (Boyce et al., 2002), and Spearman rank statistics. We used the predicted-to-expected curve to inform our suitability thresholds following Hirzel et al. (2006), including predicted unsuitable (P/E and confidence intervals 0-1), marginal (P/E > 1 but overlapping confidence intervals), and suitable (P/E and confidence intervals > 1).
Usage notes
Please see manuscript (PeerJ: DOI:10.7717/peerj.11670) or pre-print for details (https://www.biorxiv.org/content/10.1101/2021.02.05.429381v1?rss=1). This data set includes a vector shapefile with suitability thresholds generated from the predicted to expected curve as well as a continuous raster.