Data from: The influence of human presence and footprint on animal space use in US national parks
Data files
Jun 16, 2025 version files (533.53 MB)
- data.zip (533.47 MB)
- README.md (16.98 KB)
- scripts.zip (49.72 KB)
Jan 07, 2026 version files (533.53 MB)
- data.zip (533.47 MB)
- README.md (15.29 KB)
- scripts.zip (49.80 KB)
Abstract
Given the importance of protected areas for biodiversity, the growth of visitation to many areas has raised concerns about the effects of humans on wildlife. In 2020, the COVID-19 pandemic led to temporary closure of national parks in the United States, offering a pseudonatural experiment to tease apart the effects of permanent infrastructure and transient human presence on animals. We compiled GPS tracking data from 229 individuals of 10 mammal species in 14 parks, and used third-order hierarchical Resource Selection Functions to evaluate the influence of the human footprint on animal space use in 2019 and 2020. Averaged across all parks and species, animals avoided the human footprint, whether the park was open or closed. However, while animals in remote areas showed consistent avoidance, on average those in more developed areas switched from avoidance to selection when protected areas were closed. Findings varied across species: some responded consistently negatively to the footprint (wolves, mountain goats), some positively (mule deer, red fox), and others had a strong exposure-mediated response (elk, mountain lion). Furthermore, some species responded more strongly to the park closure (black bear, moose). This study advances our understanding of complex interactions between recreation and wildlife in protected areas. While we do not share raw location data due to the sensitivity of animal locations, we provide complete information on the format of data files, intermediate data products, and the scripts necessary to reproduce analyses.
Description of the data and scripts
Data files (data.zip)
Raw data
all-clean-tracks.csv
Movement trajectories of animals, subsampled to a 4-hour fix interval, and filtered according to inclusion criteria described in the supplementary methods of the manuscript. Each row corresponds to a unique GPS location, with associated metadata. Columns include:
- ID_full: unique identifier of individual animal, combining information from columns Park, Species, ID, and Period
- ID: unique identifier of individual animal
- Park: unique identifier of protected area (4-letter NPS code)
- Species: unique identifier of species (4-letter code corresponding to the first two letters of the genus and first two letters of species)
- Period: the time period (year) of sampling, either "DuringClosure_2019" or "DuringClosure_2020"
- Longitude
- Latitude
- DateTimeUTC: date and time of the GPS fix, in Coordinated Universal Time
- DateTimeLocal: date and time of GPS fix, in local (standard) time
Used as input in multiple scripts:
- spatial-data-preparation/create-bounding-boxes.R
- rsf-analysis/01-calculate-home-ranges.R
- rsf-analysis/02-generate-available-points.R (to get list of unique ID_full)
- rsf-analysis/03-prepare-data-for-rsf.R
- second-stage-models/calculate-footprint-summary.R (to get list of unique ID_full)
Note that this data file is not shared, due to the sensitivity of location data.
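Because the tracks file itself is not shared, a small sketch of how the identifier columns relate to each other may help when reading the derived files. The species code rule (first two letters of the genus plus first two letters of the species) comes from the column description above. The upper-case convention, the underscore separator, the field order, and the example values ("YELL", "A01") are assumptions made for illustration and may not match the actual files.

```python
def species_code(binomial: str) -> str:
    """Build a 4-letter code from the first two letters of the genus
    and the first two letters of the species epithet."""
    genus, species = binomial.split()[:2]
    # Upper case is an assumption; the README does not state the case used.
    return (genus[:2] + species[:2]).upper()


def make_id_full(park: str, species: str, animal_id: str, period: str,
                 sep: str = "_") -> str:
    """Combine Park, Species, ID, and Period into one identifier.
    The separator and field order are hypothetical."""
    return sep.join([park, species, animal_id, period])


code = species_code("Canis lupus")
example_id = make_id_full("YELL", code, "A01", "DuringClosure_2020")
print(code, example_id)
```
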
Derived data
kde folder
Folder containing shape files corresponding to the 95% Kernel Density Estimates of home ranges for all individual-years. Not provided due to data sensitivity.
Generated by rsf-analysis/01-calculate-home-ranges.R script.
Used in rsf-analysis/02-generate-available-points.R and spatial-data-preparation/create-bounding-boxes.R.
bounding-boxes.csv
File with coordinates of bounding boxes for the study area associated with each population. Bounding boxes were determined by taking the minimum and maximum latitude and longitude of the calculated 95% Kernel Density Estimate home ranges. Columns include:
- Park_Species: unique identifier of protected area (4-letter NPS code) combined with unique identifier of species (4-letter code corresponding to the first two letters of the genus and first two letters of species)
- xmin: Minimum longitude in the study area
- xmax: Maximum longitude in the study area
- ymin: Minimum latitude in the study area
- ymax: Maximum latitude in the study area
Generated by the spatial-data-preparation/create-bounding-boxes.R script.
Used in all spatial-data-preparation scripts.
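The bounding-box rule described above (minimum and maximum longitude and latitude across the home ranges of a population) can be sketched as follows. This is not the code in create-bounding-boxes.R, which reads shape files in R. Here each home range is assumed to be a plain list of (longitude, latitude) vertices, and the coordinates are made up.

```python
def bounding_box(polygons):
    """Return xmin/xmax/ymin/ymax over all vertices of all polygons.

    polygons: iterable of lists of (lon, lat) tuples, one list per home range.
    """
    lons = [lon for poly in polygons for lon, _ in poly]
    lats = [lat for poly in polygons for _, lat in poly]
    return {"xmin": min(lons), "xmax": max(lons),
            "ymin": min(lats), "ymax": max(lats)}


# Two hypothetical home ranges belonging to one park-species population.
home_ranges = [
    [(-110.5, 44.1), (-110.2, 44.3), (-110.4, 44.6)],
    [(-110.9, 44.0), (-110.6, 44.2), (-110.7, 44.4)],
]
bbox = bounding_box(home_ranges)
print(bbox)
```
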
all-footprint-summary.csv
Each row corresponds to an individual animal-year. Generated by second-stage-models/calculate-footprint-summary.R. Columns include:
- ID_full: unique identifier of individual animal, combining information from columns Park, Species, ID, and Period
- footprint_mean: mean value of human footprint in home range
- footprint_min: minimum value of human footprint in home range
- footprint_med: median value of human footprint in home range
- footprint_max: maximum value of human footprint in home range
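As a rough illustration of how one row of this file could be derived, the sketch below summarizes a list of footprint values for a single home range. The function name and input values are hypothetical. The actual script works in R on the available-point files.

```python
import statistics


def footprint_summary(values):
    """Summarize footprint values of all available points in one home range."""
    return {
        "footprint_mean": statistics.mean(values),
        "footprint_min": min(values),
        "footprint_med": statistics.median(values),
        "footprint_max": max(values),
    }


# Made-up footprint values (index ranges from 0 to 1).
summary = footprint_summary([0.0, 0.1, 0.2, 0.7])
print(summary)
```
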
data-for-rsf.csv
Derived data used to fit Resource Selection Functions. Each row represents either a GPS location used by an animal (Used = 1) or a randomly generated point in the animal's 95% KDE home range (Used = 0). The ratio of used:available points is 1:10. In the script workflow, this file is split into separate files for each park-species, but here it is provided as a single combined file. Columns include:
- ID_full: unique identifier of individual animal, combining information from columns Park, Species, ID, and Period
- Used: binary variable representing whether the point corresponds to a location that the animal used (1) or a random location available in its home range (0)
- Park: unique identifier of protected area (4-letter NPS code)
- Species: unique identifier of species (4-letter code corresponding to the first two letters of the genus and first two letters of species)
- Park_Species: unique identifier of protected area (4-letter NPS code) combined with unique identifier of species (4-letter code corresponding to the first two letters of the genus and first two letters of species)
- Period: the time period (year) of sampling, either "DuringClosure_2019" or "DuringClosure_2020"
- ID: unique identifier of individual animal
- footprint: value of human footprint index, ranging from 0 to 1
- building: distance to nearest building
- camp: distance to nearest campground
- parking: distance to nearest parking lot
- trail: distance to nearest trail
- nlcd_barren_dst: distance to NLCD barren habitat class
- nlcd_forest_dst: distance to NLCD forest habitat class
- nlcd_herb_dst: distance to NLCD herbaceous habitat class
- nlcd_scrub_dst: distance to NLCD scrub habitat class
- water: distance to water
- elevation
- slope
- footprint_sum: alternative version of human footprint (algebraic sum rather than fuzzy algebraic sum)
- footprint_built: alternative version of human footprint (excluding trails)
- footprint_equal: alternative version of human footprint (equal weighting of all features)
- footprint_decay500: alternative version of human footprint (longer decay function)
Generated by rsf-analysis/03-prepare-data-for-rsf.R
Used as input for rsf-analysis/04-run-RSF.R
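The footprint columns distinguish a fuzzy algebraic sum from a plain algebraic sum. The sketch below shows the standard definition of each for a set of per-feature intensities already scaled to the range 0 to 1. The decay functions and feature weights that turn distances into intensities are defined in the scripts and are not reproduced here; the input values are made up, and any rescaling applied to footprint_sum is not shown.

```python
from math import prod


def fuzzy_algebraic_sum(intensities):
    """Fuzzy algebraic sum: 1 - product(1 - x_i). Stays within [0, 1]
    whenever every x_i is within [0, 1]."""
    return 1.0 - prod(1.0 - x for x in intensities)


def algebraic_sum(intensities):
    """Plain sum of intensities; can exceed 1 when several features overlap."""
    return sum(intensities)


# Hypothetical intensities for two overlapping features at one pixel.
pixel = [0.5, 0.5]
print(fuzzy_algebraic_sum(pixel), algebraic_sum(pixel))
```

The fuzzy version saturates as features accumulate, so a pixel close to many features approaches but never exceeds 1.
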
Scripts (scripts.zip)
rsf-analysis
This folder contains scripts for preparing data and fitting Resource Selection Functions.
01-calculate-home-ranges.R
Script to calculate 95% Kernel Density Estimates of individual home ranges and export shape files, to be used for generating available points for the Resource Selection Functions.
Input file is all-clean-tracks.csv.
Output files are multiple shapefiles (one per individual-year) in data/kde folder.
02-generate-available-points.R
Script to generate all available points within the home range of each individual animal.
Input files are the shape files in the data/kde folder, and the spatial covariate raster files.
Output is one CSV file per individual-year, containing the raster values at each pixel (30 x 30 m) within that individual-year's home range.
03-prepare-data-for-rsf.R
Script to generate input data files for Resource Selection Functions. Extracts covariate values for all GPS locations, and combines with a random subset of available points at a fixed ratio of used:available points.
Input files are all-clean-tracks.csv, the output files from rsf-analysis/02-generate-available-points.R, and the spatial covariate raster files. Sources the script spatial-data-preparation/calculate-footprints.R.
Output is a single file for each population, combined here into one file data-for-rsf.csv.
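A minimal sketch of the used:available step, assuming each point is a dictionary of covariate values and that available points are drawn without replacement (the README does not say which sampling scheme the script uses). The function names, the seed, and the toy covariate are hypothetical.

```python
import random


def build_rsf_rows(used, available, ratio=10, seed=1):
    """Label used points with Used=1 and a random subset of available
    points with Used=0, at a fixed 1:ratio used:available ratio."""
    n_available = ratio * len(used)
    if n_available > len(available):
        raise ValueError("not enough available points for the requested ratio")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sampled = rng.sample(available, n_available)  # without replacement (assumption)
    rows = [dict(point, Used=1) for point in used]
    rows += [dict(point, Used=0) for point in sampled]
    return rows


# Toy data: 3 GPS fixes and 50 home-range pixels with one covariate each.
used_points = [{"footprint": 0.1 * i} for i in range(3)]
available_points = [{"footprint": 0.02 * i} for i in range(50)]
rsf_rows = build_rsf_rows(used_points, available_points)
print(len(rsf_rows))
```
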
04-run-RSF.R
Fits Bayesian Resource Selection Functions, one population at a time. The script is run separately for each population.
The RSFs are generalized linear mixed models that included population-level hierarchical effects with parameters estimated through a Bayesian framework. We ran a separate model for each population (park-species) with an identical modeling framework for all species. Model covariates included human footprint, elevation, slope, and distances to five land cover types: forest, herbaceous, scrub, barren, and water. We allowed the intercepts and all selection coefficients to vary across individual-years, and individual coefficients were drawn from population distributions. We also estimated posterior distributions for the population mean footprint selection coefficient across individuals for 2019 and for 2020, and for the difference in population means between years.
Input file is data-for-rsf.csv (in the workflow, a separate file is used for each population). Sources rsf-model.txt.
Output files are JAGS model output in .Rds format, and .csv files that summarize individual beta coefficients, individual changes in footprint selection, and population mean beta coefficients.
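One way to write down the structure described above is given here as a sketch. The authoritative specification is the JAGS code in rsf-model.txt. The logit link, the normal population distributions, the notation, and the sign convention of the difference are assumptions. Here $i$ indexes points, $j$ indexes individual-years, $k$ indexes the eight covariates (footprint, elevation, slope, and five distances), and $t(j)$ is the year of individual-year $j$.

```latex
\begin{align}
  y_{ij} &\sim \mathrm{Bernoulli}(p_{ij}), \\
  \operatorname{logit}(p_{ij}) &= \beta_{0,j} + \sum_{k=1}^{8} \beta_{k,j}\, x_{k,ij}, \\
  \beta_{k,j} &\sim \mathrm{Normal}\!\left(\mu_{k}, \sigma_{k}^{2}\right)
      \quad \text{for the intercept and non-footprint covariates}, \\
  \beta_{\mathrm{fp},j} &\sim \mathrm{Normal}\!\left(\mu_{\mathrm{fp},t(j)}, \sigma_{\mathrm{fp}}^{2}\right), \\
  \Delta_{\mathrm{fp}} &= \mu_{\mathrm{fp},2020} - \mu_{\mathrm{fp},2019}.
\end{align}
```

Here $y_{ij} = 1$ for used points and $0$ for available points, and $\Delta_{\mathrm{fp}}$ corresponds to the difference in population mean footprint selection between years.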
rsf-model.txt
JAGS model code for Resource Selection Functions. Sourced in 04-run-RSF.R.
second-stage-models
This folder contains scripts for the functional response models, which explore how individual selection for the human footprint varies as a function of the mean footprint in the individual's home range.
We assessed how exposure to the human footprint mediated individual selection or avoidance of the footprint by considering a functional response in resource selection (i.e., a change in selection as a function of variation in availability). We also compared how this functional response varied between 2019 (normal visitation) and 2020 (park closures). To do so, we ran Bayesian linear models where the response variable was the individual-year beta selection coefficient for footprint as estimated from the RSF, and the explanatory variable was the mean footprint value of all available locations in that individual-year’s home range (95% KDE). We estimated a separate slope for 2019 and 2020, given expected differences in the functional response to footprint depending on whether the parks were open or closed to visitation.
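To make the idea of a year-specific slope concrete, the sketch below fits an ordinary least-squares line per year. This is a deliberately simplified, non-Bayesian stand-in for the JAGS models: it omits the random effects and priors, and it also fits a separate intercept per year. The column names ("year", "footprint_mean", "beta_footprint") and the numbers are invented for illustration.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_by_year(rows):
    """Fit one line per year: selection coefficient vs. mean footprint."""
    fits = {}
    for year in sorted({r["year"] for r in rows}):
        subset = [r for r in rows if r["year"] == year]
        fits[year] = ols_fit([r["footprint_mean"] for r in subset],
                             [r["beta_footprint"] for r in subset])
    return fits


# Toy individual-year records (not real estimates).
records = [
    {"year": 2019, "footprint_mean": 0.0, "beta_footprint": -1.0},
    {"year": 2019, "footprint_mean": 0.5, "beta_footprint": -1.0},
    {"year": 2019, "footprint_mean": 1.0, "beta_footprint": -1.0},
    {"year": 2020, "footprint_mean": 0.0, "beta_footprint": -1.0},
    {"year": 2020, "footprint_mean": 0.5, "beta_footprint": 0.0},
    {"year": 2020, "footprint_mean": 1.0, "beta_footprint": 1.0},
]
fits = fit_by_year(records)
print(fits)
```

A positive slope in one year means that individuals with more footprint in their home range showed stronger selection for (or weaker avoidance of) the footprint in that year.
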
calculate-footprint-summary.R
Script to calculate mean, median, minimum, and maximum values of the human footprint in each individual animal's home range (95% Kernel Density Estimate).
Input files are all-clean-tracks.csv and the available points generated in rsf-analysis/02-generate-available-points.R.
Output file is all-footprint-summary.csv (used in functional response models).
functional-response-global-run.R
Script for running the global functional response model. In this global model, the slopes for each year were the same across all parks and species, with a random intercept for population. Input data are the beta coefficients exported in rsf-analysis/04-run-RSF.R and all-footprint-summary.csv. Sources functional-response-global.txt.
functional-response-global.txt
JAGS model code, sourced in functional-response-global-run.R.
functional-response-by-guild-run.R
Script for running the guild functional response model. In this model, the slopes varied not only by year but by guild (large carnivore and ungulate; we excluded red fox, as they were the only small carnivore). Input data are the beta coefficients exported in rsf-analysis/04-run-RSF.R and all-footprint-summary.csv. Sources functional-response-by-guild.txt.
functional-response-by-guild.txt
JAGS model code, sourced in functional-response-by-guild-run.R.
functional-response-by-species-run.R
Script for running the species functional response model. In this model, we estimated a random slope and intercept for each species. Input data are the beta coefficients exported in rsf-analysis/04-run-RSF.R and all-footprint-summary.csv. Sources functional-response-by-species.txt.
functional-response-by-species.txt
JAGS model code, sourced in functional-response-by-species-run.R.
functional-response-individual-change-run.R
Script for running the individual change functional response model. We were interested in how individual animals changed their selection of the footprint from 2019 to 2020, and how this change varied across individuals as a function of their exposure to human disturbance. For the subset of individuals for which we had tracking data from both years, we ran a functional response model in which the response variable was the individual change in the footprint selection coefficient from 2019 to 2020, and the predictor was the mean footprint in the individual’s 2019 home range.
Input data are the delta coefficients exported in rsf-analysis/04-run-RSF.R and all-footprint-summary.csv. Sources functional-response-individual-change.txt.
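The response variable of this model can be illustrated with a short sketch that pairs coefficients by individual and keeps only animals tracked in both years. In the actual workflow the delta coefficients are exported by rsf-analysis/04-run-RSF.R. The data structure and values below are hypothetical, and the 2020 minus 2019 sign convention is an assumption.

```python
def individual_changes(betas):
    """Compute the change in footprint selection from 2019 to 2020.

    betas: dict mapping (animal_id, year) -> footprint selection coefficient.
    Only animals with estimates in both years are returned.
    """
    tracked_2019 = {animal for animal, year in betas if year == 2019}
    tracked_2020 = {animal for animal, year in betas if year == 2020}
    both_years = sorted(tracked_2019 & tracked_2020)
    return {a: betas[(a, 2020)] - betas[(a, 2019)] for a in both_years}


# Toy coefficients: animal "B" has no 2020 data and is dropped.
toy_betas = {("A", 2019): -0.5, ("A", 2020): 0.25, ("B", 2019): 0.1}
deltas = individual_changes(toy_betas)
print(deltas)
```
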
functional-response-individual-change.txt
JAGS model code, sourced in functional-response-individual-change-run.R.
spatial-data-preparation
This folder contains scripts for preparing spatial covariates, used in the above analyses. The raw input data are sourced directly via R packages (osmdata, FedData) or downloaded from US government websites, including National Park Service Official Service-wide Datasets (https://www.arcgis.com/home/group.html?id=00f2977287f74c79aad558708e3b6649#overview).
create-bounding-boxes.R
Script to generate bounding boxes of each study area (park-species) based on the calculated 95% Kernel Density Estimate home ranges. Input files are all-clean-tracks.csv, and the shape files in the kde folder (not provided). Output file is bounding-boxes.csv.
get_distance.R
Defines function for rasterizing spatial objects and creating 'distance to feature' raster. Sourced in scripts below.
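For readers unfamiliar with "distance to feature" rasters, the brute-force sketch below computes, for every cell of a small grid, the distance to the nearest cell that contains a feature. It is not the function defined in get_distance.R. It assumes a projected grid with square cells, uses a 30 m cell size to match the pixel size mentioned earlier, and measures distance between cell centers.

```python
from math import hypot


def distance_raster(feature_mask, cell_size=30.0):
    """Return a grid of distances (in map units) to the nearest feature cell.

    feature_mask: list of rows, where a truthy value marks a rasterized feature.
    """
    feature_cells = [(r, c)
                     for r, row in enumerate(feature_mask)
                     for c, value in enumerate(row) if value]
    if not feature_cells:
        raise ValueError("mask contains no feature cells")
    return [[cell_size * min(hypot(r - fr, c - fc) for fr, fc in feature_cells)
             for c in range(len(row))]
            for r, row in enumerate(feature_mask)]


# A single feature (e.g., one building) in the top-left cell of a 2 x 3 grid.
mask = [[1, 0, 0],
        [0, 0, 0]]
distances = distance_raster(mask)
for row in distances:
    print([round(d, 1) for d in row])
```

This approach scales poorly to large rasters; dedicated raster libraries use much faster distance transforms.
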
make_elev_raster.R
Script to generate elevation and slope rasters for each study area, using raw data downloaded via the FedData package.
make_nlcd_raster.R
Script to crop the National Land Cover Data raster (downloaded from https://www.mrlc.gov/data) to study areas and export resulting rasters.
make-nlcd-distance-rasters.R
Script to bin NLCD land covers into broad categories, and generate rasters that represent the distance to the nearest pixel of a given land cover type. Uses the rasters generated by make_nlcd_raster.R.
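The binning step can be pictured as a lookup from NLCD class codes to broad categories. The class codes below follow the standard NLCD legend, but which codes the script assigns to each category is an assumption made here for illustration. For example, pasture, crops, and wetlands are left unassigned in this sketch, and the script may treat them differently.

```python
# Hypothetical grouping of standard NLCD class codes into broad categories.
NLCD_BINS = {
    "barren": {31},              # Barren Land
    "forest": {41, 42, 43},      # Deciduous, Evergreen, Mixed Forest
    "scrub": {51, 52},           # Dwarf Scrub, Shrub/Scrub
    "herb": {71, 72, 73, 74},    # Grassland/Herbaceous, Sedge, Lichens, Moss
}


def bin_nlcd(code):
    """Return the broad category for an NLCD code, or None if unassigned."""
    for category, codes in NLCD_BINS.items():
        if code in codes:
            return category
    return None


print([bin_nlcd(code) for code in (42, 52, 71, 31, 21)])
```

Each category would then be rasterized as a binary mask and passed to the distance calculation to obtain the nlcd_*_dst covariates.
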
make_building_raster.R
Script to generate distance-to-building raster for each study area, using public data from National Park Service Official Service-wide Datasets, and data downloaded from OpenStreetMap.
make_camping_raster.R
Script to generate distance-to-campground raster for each study area, using data downloaded from OpenStreetMap. The resulting rasters have three bands, corresponding to the distances to campgrounds of three different size bins.
make_parking_raster.R
Script to generate distance-to-parking lot raster for each study area, using public data from National Park Service Official Service-wide Datasets. The resulting rasters have three bands, corresponding to the distances to parking lots of three different size bins.
make_road_raster.R
Script to generate distance-to-road raster for each study area, using data downloaded from OpenStreetMap. The resulting rasters have two bands, corresponding to the distances to major and minor roads.
make_trail_raster.R
Script to generate distance-to-trail raster for each study area, using public data from National Park Service Official Service-wide Datasets, and data downloaded from OpenStreetMap.
calculate-footprints.R
Functions for calculating various versions of the human footprint (including for sensitivity analyses) based on dataframes with distance-to-feature values. Sourced in rsf-analysis/03-prepare-data-for-rsf.R.
make_footprint_raster.R
Generate raster of human footprint for each study area, based on distance-to-feature rasters generated by other scripts. Sources calculate-footprints.R.
Sharing/access Information
We do not provide the unformatted raw data and associated data-cleaning scripts, nor any GPS locations or home range files, due to government data sharing agreements and concerns about sharing sensitive location data of wildlife.
Changes after Jun 16, 2025: Changed filtering of spatial features in the scripts make_parking_raster.R, make_road_raster.R, and make_footprint_raster.R to align with the methods used in the manuscript.
