# Data from: Assessing the effectiveness of a national protected area network in maintaining carnivore populations

## Cite this dataset

Terraube, Julien; Van doninck, Jasper; Helle, Pekka; Cabeza, Mar (2020). Data from: Assessing the effectiveness of a national protected area network in maintaining carnivore populations [Dataset]. Dryad. https://doi.org/10.5061/dryad.s1rn8pk4w

## Abstract

Protected areas (PAs) are essential to prevent further biodiversity loss yet their effectiveness varies largely with governance and external threats. Although methodological advances have permitted assessments of PA effectiveness in mitigating deforestation, we still lack similar studies for the impact of PAs on wildlife populations. Here we demonstrate the application ofuse an innovative combination of matching methods and hurdle-mixed models with a large-scale and long-term dataset of unprecedented coverage for Finland’s large carnivore species. We show that the national PA network , at the national level, PAs does not support higher densities than non-protected habitat for 3 of the 4 species investigated. For the brown bear, PAs appear to have lower densities than non-protected areas. For some species, PA effects interact with region or time, i.e. wolverine densities decreased inside PAs over the study period and lynx densities increased inside eastern PAs. Although we show that matching approaches could and should be applied to wildlife population data, Wwe support their application of matching methods in combination of additional analytical frameworks for deeper understanding of conservation impacts on wildlife populations. These methodological advances are crucial for improving PA targets and extremely timely for preparing ambitious PA targets a post-2020 global framework for biodiversity.

## Methods

#### Wildlife population time series data

Track count data collected within the Finnish Wildlife Triangle Scheme were used as indices of lynx, wolf, wolverine and bear densities. The scheme is a long term, large-scale survey which provides yearly estimates of the distribution and relative abundance of game species. The methods are described in detail by Linden et al.55. A wildlife unit is a permanent line transect route of 12 km (4 x 3 km). Unit locations represent different forest habitats in proportion to local occurrence. The monitoring operation is carried out twice a year. The minimum number of monitoring experts is three for the late-summer count and one for the winter count.

Each winter, 800–900 units distributed throughout Finland are surveyed by local hunters who count the number of fresh snow tracks crossing the transect. The count is either done 1–2 days after a snow fall or 24 h after a pre-check where all old tracks are marked. For each species, the data are compiled by the Finnish Game and Fisheries Research Institute as the number of crossings per 24 h per 10 km. Thus, the unit used in statistical modelling was the average number of crossings per 24 h per 10 km for lynx wolf and wolverine, per triangle and for each year. As brown bears hibernate during winter counts, density indices for this species were collected during late summer counts and also focused on counting tracks crossing the transect.

From the initial 2171 sampling units, a total of 1805 units were included in subsequent analyses (we discarded transects in mixed agricultural-forest landscapes), 1195 were located in non-protected sites while 610 were located inside protected areas.

#### Explanatory variable selection and extraction

The set of matching covariates extracted for each wildlife unit represented ecological characteristics considered most likely to be important determinants of large carnivore occurrence in European ecosystems based on empirical observation56. These covariates can be regarded as two groups of possible influences: biophysical context (e.g. latitude) and human impacts (e.g. distance to closest settlements). Several variables suggested by the literature to be important in determining PA effectiveness, e.g. PA management, were available for a restricted amount of sites and therefore were not included in the matching analysis. Description of the matching covariates and the rationale for their inclusion are found in the Supplementary Table S8.

Spatial data were analyzed using the R packages ”sp”, “raster”, “rgdal” and “rgeos”57. PA boundaries were calculated using spatial information from the World Database of Protected Areas58. A sampling unit was considered as protected if the 1km circular buffer centered on the unit’s centroid intersected a protected area polygon. Other explanatory variables corresponding to each wildlife unit were extracted for the unit’s centroid –for distance variables or gridded variables with a spatial resolution larger than the unit’s dimensions– or over a circular buffers centered on the unit’s centroid and passing through the three vertices –for gridded variables with a spatial resolution smaller than the unit’s dimensions (Supplementary Table S8).

#### Estimating protected area effectiveness with matching methods

#### Matching methods

We used the Matchit package59 in R environment, which fits a logistic generalized linear model where the treatment assignment (land protection) is the response variable and the matching variables are the predictors.

We chose one-to-one, nearest-neighbor covariate matching with replacement using a generalized version of the Mahalanobis distance metric. We used a caliper of 0.2 standard deviations of the propensity scores as our minimum matching criterion. To assess the quality of the matches, we checked the resulting covariate balance testing the normalized differences in covariate means and their distribution. The normalized difference in means is the difference in the average covariate value divided by its standard deviation60. We tested for differences in the distribution using eQQ plots that graph the covariate values in the same quantile of the treated (protected sites) against those in the control (non-protected sites), allowing us to observe if characteristics are distributed similarly across both treatment and control groups (see Supplementary Figures, S1a and S1b).

We were able to match 100% of the original sample to controls that suited the criteria. Therefore, our unit-to-unit matching yielded a final dataset of 1220 sampling units, i.e. 610 protected wildlife units that were matched to 610 similar unprotected units across Finland. These matched pairs of PA and non-PA sampling units covering, a period of ~30 years, were included in subsequent analyses. These sampling units were not homogeneously distributed among the four Game Management clusters with Central Finland being the most represented cluster and Lapland the least (see Supplementary Table S2).

#### Measuring the absolute effect of PAs

Previous studies have quantified the effect of PAs in reducing threats in different ways (Absolute PA effect, Relative PA effect, Pooled relative effect17). However, due to limitations imposed by the structure of wildlife time series and the natural low densities of large carnivores (zero inflation of all time-series), an effectiveness metric based on the ‘absolute PA effect’ was the most relevant approach for this study.

The absolute PA effect is the difference between densities in a unit located inside a PA and its matched control unit located in a non-protected area. Therefore a positive value means that sampling units located inside PA show higher large carnivore densities than its control unprotected units. We calculated this metric at the national PA network level, for each pair of ‘protected-non protected’ units. We computed the median absolute effect of the PA network and its associated 95% confidence interval for each species of large carnivore, globally and at the regional level. We performed iterative random sampling (1000 iterations) to control for differences in the number of repeated observations (number of years) among pairs of matched units. This was done by selecting only one value of absolute PA effect for each pair of matched units before estimating the median PA effect across units. To calculate the absolute PA effect per region, we pooled the 15 Game Management Areas covering the whole country in four main clusters: South= ’Satakunta’ + ‘Etelä-Häme’ + ‘Kaakkois-Suomi’ + ‘Uusimaa’ + ‘Varsinais-Suomi’ + ‘Pohjois-Häme’; Central= ‘Etelä-Savo’ + ‘Rannikko-Pohjanmaa’ + ‘Keski-Suomi’ + ‘Pohjois-Savo’ + ‘Pohjois-Karjala’ + ‘Pohjanmaa’; North= ‘Oulu’ + ‘Kainuu’; Lapland = ‘Lapland’. We compared the annual absolute PA effect per year with zero using one sample t-tests.

#### Assessing Estimating PA effectiveness with two-part mixed effects model for semi-continuous data.

We extracted the matched dataset obtained through the matching process described previously (1220 unpaired wildlife units, see Supplementary Table S7) to test for the effect of land protection status on large carnivore densities. We implemented two-part zero-inflated mixed effects models in the novel GLMMadaptive package that uses adaptive Gaussian quadrature61. This approach allowed us to account for the data structure (zero-inflated, right-skewed continuous data, GLMMzi) and assess the relationship between density indices of the four large carnivore species and the set of explanatory variables described in Table S8. All models were implemented in R using the packages GLMMadaptive and parallel. To account for potential problems of pseudoreplication, unit identity number was kept consistently as a random effect in all models for each of the 4 large carnivore species. Zero-inflated structures were added on all the fixed effects included in the models.

We reduced the full list of variables based on co-linearity and biological relevance to produce a set of 8 variables (e.g. collinearity between distance to closest roads and distance to closest settlements was too high and we chose to remove distance to closest roads from all the models). These 8 covariates included the 6 matching covariates described earlier (percentage of forest cover, terrain ruggedness, distance to closest settlements, human population density, latitude and longitude), to which we added year and protection status of the wildlife unit (protected or not, coded 0/1). Three interaction terms between fixed effects were also added (between PA and Latitude, PA and Longitude and PA and years) to assess spatial and temporal variations in PA effectiveness.

A set of 12 models was built for each species. All 12 models included the 6 matching covariates and protection status at the unit scale (PA), therefore the difference in model structure resided in the addition of year, the three interactive terms and their different possible combinations (structure of the 12 models is described in the Supplementary Table S9). All first-order model fits were ranked using the Akaike Information Criterion, the best model having the lowest AIC values from the set of 12 models built for each species.

The fixed effects estimates in mixed models with nonlinear link functions have an interpretation conditional on the random effects. To take this into account, we extracted the marginalized coefficients and their standard errors from the two part mixed models following the approach described by Hedeker et al.62 and implemented by Rizopoulos61 in the GLMMadaptive package.

Goodness-of-fit of the models was assessed using residual diagnostics following the procedures described in the DHARMa package. All statistical analyses were performed using the software R 3. 5. 157.

In order to test if PA size could affect our results, we built four additional models following the same procedure highlighted above. Data were restricted to protected wildlife units, density of the four species of carnivores was the response variables and covariates included: percentage of forest cover, human population density, distance to the closest settlements, terrain ruggedness, latitude, longitude, year and PA size. We extracted the marginalized coefficients and checked the residuals of the different models as highlighted above.

## Funding

Maj and Tor Nessling Foundation, Award: 201700201