Data from: Strong variation in land-use change impacts on tropical avian phylogenetic diversity between ecoregions highlights the need to sample large spatial scales
Data files
Jan 20, 2026 version files (489.71 MB total)
- Files_to_run_scripts.zip (489.69 MB)
- README.md (4.65 KB)
- Scripts.zip (19.86 KB)
Abstract
Forest conversion for agriculture is a major cause of tropical biodiversity loss. Quantifying the biodiversity impacts of forest loss is challenging because the severity of outcomes is influenced by spatial scale, with the higher rate of species turnover in forests than in farmland increasing the severity of losses at larger relative to smaller scales. Conservation efforts increasingly prioritise phylogenetic diversity to preserve unique evolutionary history under global change, but how deforestation-driven changes in phylogenetic diversity vary across large spatial scales remains a key question. We compiled a large field database from across 13 biogeographically diverse regions affected by deforestation for cattle farming, covering most of Colombia, a megadiverse tropical country. We used occupancy models to estimate bird communities for 1,614 species across 13 ecoregions and nationally in both forest and pasture habitats. This dataset includes base information and scripts to quantify six different phylogenetic diversity metrics in forest and pasture habitats, to compare changes between habitats, and to assess how those changes differ from regional to national scales. It also includes data to estimate and plot changes in phylogenetic trees for bird communities across ecoregions and to replicate the study area map. We found that although the loss of phylogenetic diversity at the scale of a single region was, on average, comparable to that at broader scales, there was high variability between regional units. Such underestimation of national-scale impacts highlights the importance of sampling across multiple regions.
Dataset DOI: 10.5061/dryad.6hdr7srcx
Description of the data and file structure
All data and scripts are provided in a zip folder.
The dataset includes field detections of birds, model outputs, and spatial data from Socolar and Mills et al. (2025), adapted for the aims of this research (i.e., not identical to their datasets).
The scripts take the occupancy probability of each bird species in forest and pasture habitats and derive avian communities for each ecoregion and at the national scale in Colombia. They then use these communities, together with phylogenetic trees, to compute six metrics of phylogenetic diversity in each ecoregion and, by combining ecoregions, at the national scale. Finally, they assess how phylogenetic diversity change is misestimated at the ecoregion level compared with the national level.
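As a rough illustration of the workflow described above, the sketch below computes one common metric, Faith's phylogenetic diversity (the summed branch length of the tree connecting a community), for a forest and a pasture community. It uses a simulated tree and made-up occupancy probabilities rather than the files in this dataset. The 0.5 threshold, the object names, the helper `faith_pd`, and the choice of Faith's PD are assumptions for illustration and are not necessarily what the scripts do.

```r
library(ape)  # assumes the 'ape' package is installed

set.seed(1)

# Toy phylogeny standing in for one of the phylogenetic trees (hypothetical data).
tree <- rcoal(20)  # random ultrametric tree with tips "t1" ... "t20"

# Made-up occupancy probabilities per species in each habitat.
occ <- data.frame(
  species = tree$tip.label,
  forest  = runif(20, 0.2, 1.0),
  pasture = runif(20, 0.0, 0.8)
)

# Faith's PD: total branch length of the subtree connecting the community.
# This simple version does not include the path back to the root of the full tree.
faith_pd <- function(tree, species) {
  if (length(species) < 2) return(0)
  sum(keep.tip(tree, species)$edge.length)
}

# Hypothetical rule: a species belongs to a community if its probability is >= 0.5.
forest_comm  <- occ$species[occ$forest  >= 0.5]
pasture_comm <- occ$species[occ$pasture >= 0.5]

pd_forest  <- faith_pd(tree, forest_comm)
pd_pasture <- faith_pd(tree, pasture_comm)

# Proportional change in PD from forest to pasture (negative values mean loss).
pd_change <- (pd_pasture - pd_forest) / pd_forest
```

With the real data, the same calculation would presumably be repeated over many trees and posterior draws so that uncertainty carries through to the diversity estimates.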
Files and variables
File: Files_to_run_scripts.zip
Description: Zip file containing all datasets
- Birds_rds_link.txt: link to the Zenodo repository where the raw field data are stored (https://doi.org/10.5281/zenodo.15318727)
- Birdtree_COL_taxonomy.csv: curated list of species and taxonomic categories. Columns: TipLabel: Bird species names following birdtree taxonomy, Clade: species clade classification, Family: species taxonomic family classification, Order: species taxonomic order classification, OscSubOsc: species subclade classification.
- cell_WWF_lookup.rds: R software object, lookup table to match id cells to ecoregion names. Columns: label_id: consecutive numeric identifier for ecoregions, id_cell: numeric identifier for individual cells within ecoregions, label: ecoregion names.
- CO_birds_Hacket_ALL.rds: R software object, 10,000 phylogenetic trees of all Colombian birds included in the research.
- ED_values.rds: R software object containing average values of evolutionary distinctiveness and rarity computed across 10,000 phylogenetic trees for each species. Columns: species: Bird species names following birdtree taxonomy, ED_mean: average evolutionary distinctiveness, EDR_mean: average evolutionary distinctiveness rarity values, EDGE_median: median evolutionarily distinct and globally endangered (EDGE) values, RL.cat: IUCN Red List category.
- lpo_and_coefs.rds: R software object, linear predictors and coefficients (400 draws) from the model in Socolar and Mills et al. (2025), modified for the aims of this research. Variables: species: Bird species names following birdtree taxonomy, lpo_forest: linear predictor for forest habitat, lpo_pasture: linear predictor for pasture habitat.
- regions_sequences.rds: R software object, contains the random sequences used to aggregate regions (for reproducibility).
- species_names_matching.rds: R software object, lookup table to match species names across different taxonomic treatments and old/new names. Columns: model: Bird species names following ebird taxonomy, phylogeny: Bird species names following birdtree taxonomy
- WWF_lookup.rds: R software object, lookup table to match ecoregion names (spatial). Columns: label: ecoregions full names, label_id: consecutive numeric identifier for ecoregions.
- WWF_terrestrial_ecoregions.rds: R software object, shapefiles for the sampled ecoregions.
- xy_lookup.rds: R software object, lookup table to match id cells to model id. Columns: id_cell: numeric identifier for individual cells within ecoregions, id_x: model cell location in x, id_y: model cell location in y.
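To show how the lookup tables and linear predictors might fit together, here is a small sketch with made-up values that mimic the column names listed above. It assumes the linear predictors are on the logit scale (the usual link for occupancy models, but not confirmed here), so `plogis()` maps them to probabilities. All values, species names, and the `psi_*` column names are hypothetical, and the real .rds objects may be structured differently (for example, as arrays over posterior draws).

```r
# Toy stand-ins for cell_WWF_lookup.rds and xy_lookup.rds (made-up values).
cell_wwf <- data.frame(
  label_id = c(1, 1, 2),
  id_cell  = c(101, 102, 201),
  label    = c("Ecoregion A", "Ecoregion A", "Ecoregion B")
)
xy <- data.frame(
  id_cell = c(101, 102, 201),
  id_x    = c(10, 11, 40),
  id_y    = c(5, 5, 22)
)

# Match each cell to its ecoregion and its model grid position.
cells <- merge(cell_wwf, xy, by = "id_cell")

# Toy stand-in for a single draw from lpo_and_coefs.rds.
lpo <- data.frame(
  species     = c("Species_one", "Species_two"),
  lpo_forest  = c(1.2, -0.4),
  lpo_pasture = c(-2.0, 0.3)
)

# Assumption: predictors are on the logit scale, so the inverse logit
# (plogis) converts them to occupancy probabilities between 0 and 1.
lpo$psi_forest  <- plogis(lpo$lpo_forest)
lpo$psi_pasture <- plogis(lpo$lpo_pasture)
```

In practice the objects would be loaded with `readRDS()` from the unzipped folder before any joins of this kind.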
Code/software
All analyses were conducted in RStudio using R version 4.4.1.
File: Scripts.zip
Description: Zip file containing all R software scripts
1_compute_predicted_occupancy.R: predict spatially explicit occupancy probability in forest and pasture for 2 x 2 km cells across the study area
2_Phylogeny_plots.R: Plot phylogenies across ecoregions
3_PD_metrics.R: compute phylogenetic diversity metrics in each habitat across ecoregions
4_PD_Accumulation.R: compute cumulative phylogenetic diversity metrics in each habitat at increasing spatial scale
5_study_area.R: replicate study area figures and sampling points
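The accumulation step (script 4) can be pictured as adding ecoregions one at a time in a random order and recomputing diversity on the pooled community. The sketch below does this for Faith's PD on simulated data. The tree, the region communities, the helper `faith_pd`, and the single random sequence are all made up for illustration, and the actual script may use different metrics and many sequences.

```r
library(ape)  # assumes the 'ape' package is installed

set.seed(42)

# Toy phylogeny with 30 species (hypothetical data).
tree <- rcoal(30)

# Hypothetical communities: each of 4 regions holds a random 10 species.
regions <- lapply(1:4, function(i) sample(tree$tip.label, 10))
names(regions) <- paste0("region_", 1:4)

# Faith's PD as the summed branch length of the subtree connecting the species.
faith_pd <- function(tree, species) {
  if (length(species) < 2) return(0)
  sum(keep.tip(tree, species)$edge.length)
}

# One random order in which regions are aggregated.
region_order <- sample(names(regions))

# Pool species cumulatively and compute PD at each step.
pooled   <- character(0)
pd_accum <- numeric(length(region_order))
for (i in seq_along(region_order)) {
  pooled      <- union(pooled, regions[[region_order[i]]])
  pd_accum[i] <- faith_pd(tree, pooled)
}
names(pd_accum) <- region_order
```

Because each step pools a superset of the previous species, the accumulated PD in this sketch can only stay the same or increase as regions are added.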
Access information
Other publicly accessible locations of the data:
- https://github.com/gapz01/PhD_year_two (only scripts)
Data was derived from the following sources:
- https://zenodo.org/records/15318728 (Creative Commons Attribution 4.0 International)
