Habitat availability is insufficient to explain regional variations in white stork breeding habitat preference
Abstract
Understanding species-habitat associations is key for making predictions of species distributions of relevance to ecology and conservation. Regional differences in species habitat preferences can hinder the transferability of habitat models in space and time, but our ability to account for these differences will depend on the mechanisms underlying them (differences in habitat availability, genetics, culture). Here, we modelled the large-scale breeding distribution of an expanding species, the white stork Ciconia ciconia in France, applying machine-learning algorithms to an extensive dataset of the distribution of nests spanning the whole country. Specifically, we assessed the transferability of the models across different geographic zones and contrasted the modelled nesting habitat preferences of the species across these zones. Finally, we assessed whether local differences in model transferability were related to habitat availability in each zone. Our models generally had good calibration performances, but were not equally transferable to all zones. Additionally, environmental variables did not have the same effects in the different zones, with particularly striking differences between Alsace and the rest of France. This included a certain preference for urban areas in Alsace (absent from other zones) that was consistent with the storks' tendency to nest on buildings in that zone. Differences in habitat availability between Alsace and the rest of France, as well as connectivity within the French white stork metapopulation, appeared to be insufficient to explain the lack of transferability of models to this zone, suggesting some possible local historical and cultural effects on habitat selection.
Dataset DOI: 10.5061/dryad.fxpnvx148
Description of the data and file structure
The aim of the study was to model the nesting distribution of white storks in France. Nest location data were collected nationwide and combined with land use data, using species distribution models to infer nesting habitat preferences across different regions of France.
Files and variables
File: data.zip
Description: This folder contains a table and two GeoPackage files:
- final_dataset_nest_locs.csv: locations of white stork nests;
- locs_buffer_with_vars.gpkg: nest locations with associated extracted environmental variables;
- grid_buffer_with_vars.gpkg: extracted environmental variables across a 1 km x 1 km grid covering France.
Column names:
- annee: year of census;
- departement: department of census;
- support_clean: support categories (following Table S1 in the manuscript) - "arbre" = tree, "plateforme ou poteau" = platform or pole, "pylone" = pylon, "batiment" = building, "autre" = other;
- longitude and latitude: coordinates in WGS84 - coordinates are rounded to the 3rd decimal in the public dataset for privacy reasons (nests potentially falling on private property);
- id_ini: nest location ID, to be used to merge datasets;
- problems: potential problems in the original coordinates (solved issues are indicated);
- dist_nearest_pylon: distance to the nearest RTE pylon;
Note that some column names and column contents are in French because the census was carried out by French ornithological associations. They were left untranslated to ensure the reproducibility of the analyses. Column names and nest support categories (in the support_clean column) are translated above.
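As an illustration of the translations listed above, the following minimal Python sketch maps the French support_clean categories to English while reading the nest table. It is not part of the published scripts (which are written in R). The function names, the support_en column, the comma separator and the sample rows are assumptions made for this example only.

```python
import csv
import io

# French support categories and their English translations, as listed above.
SUPPORT_TRANSLATIONS = {
    "arbre": "tree",
    "plateforme ou poteau": "platform or pole",
    "pylone": "pylon",
    "batiment": "building",
    "autre": "other",
}


def translate_support(value):
    """Return the English label for a support_clean value.

    Values that are not in the mapping are returned unchanged.
    """
    return SUPPORT_TRANSLATIONS.get(value.strip().lower(), value)


def read_nests(handle):
    """Read nest rows and add a hypothetical 'support_en' column."""
    rows = []
    for row in csv.DictReader(handle):
        row["support_en"] = translate_support(row["support_clean"])
        rows.append(row)
    return rows


# Made-up rows for illustration only; they are not taken from the dataset.
# The real table is final_dataset_nest_locs.csv, whose separator is assumed
# here to be a comma.
SAMPLE = """annee,departement,support_clean,longitude,latitude,id_ini
2021,X,arbre,1.234,47.123,1
2021,Y,batiment,7.456,48.321,2
"""

if __name__ == "__main__":
    for nest in read_nests(io.StringIO(SAMPLE)):
        print(nest["id_ini"], nest["support_clean"], "->", nest["support_en"])
```

To apply the same idea to the real file, the sample string could be replaced with an open file handle, adjusting the delimiter if the file does not use commas.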
Code/software
File: scripts.zip
Description: This folder contains scripts to run the analyses as described in the methods of the published manuscript.
- 00_main_nest_distribution_models.R is the main script that calls all others.
- 01_final_dataset.R is the script that cleans the dataset used in subsequent analyses. This script is provided for illustration only; its output (final_dataset_nest_locs.csv) is included and can be used to run the subsequent R scripts.
- 01b_support_cleaning_function.R contains a function to clean and harmonise the text describing the type of supports.
- 02_matching_pylons_with_rte.R is a script that matches the nest location data with the database of electric pylons, to refine the categorisation of nesting supports. This requires a database of RTE pylons that can be downloaded from the OpenData Réseau-Énergies website (odre.opendatasoft.com).
- 03_zones_separation.R is a script that runs the DBSCAN algorithm and manually refines the definition of the 13 zones used in the analyses.
- 04_pseudo_absences_by_region.R is a script that generates pseudo-absences.
- 05_variable_extraction_function.R contains the function called in 06_variable_extractions.R.
- 06_variable_extractions.R is a script that extracts environmental variables around each nest location and pseudo-absence, and around the centroid of each 1 km x 1 km grid cell. This requires environmental layers that can be downloaded from www.data.gouv.fr/fr/datasets/corine-land-cover-edition-2018-france-metropolitaine/ and https://geoservices.ign.fr/bdtopo
- 07_habitat_models_functions.R contains ad-hoc functions for use within the biomod2 package and is called in 08_habitat_models.R.
- 08_habitat_models.R is a script that runs all species distribution models.
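
For readers unfamiliar with the DBSCAN step mentioned for 03_zones_separation.R, the following Python sketch shows the general idea of density-based clustering applied to point coordinates. It is not the authors' R implementation and does not reproduce the 13 zones or their manual refinement. The neighbourhood radius (eps_km), the minimum number of points (min_pts), the use of great-circle distances and the sample coordinates are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (longitude, latitude) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps_km of point i (including i itself).
        return [j for j, q in enumerate(points)
                if haversine_km(points[i], q) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Made-up (longitude, latitude) points: two tight groups and one isolated point.
POINTS = [
    (7.70, 48.50), (7.71, 48.51), (7.72, 48.50),
    (-1.00, 46.00), (-1.01, 46.01), (-1.02, 46.00),
    (3.00, 44.00),
]

if __name__ == "__main__":
    print(dbscan(POINTS, eps_km=5, min_pts=3))
```

With these hypothetical settings, the two groups of nearby points receive separate cluster labels and the isolated point is labelled as noise (-1). Suitable values for the radius and the minimum number of points depend on the data and are not specified in this README.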
