Climatic niche variation in genetically distinct populations throughout the annual cycle for a migratory parulid bird, Cardellina pusilla
Data files
Jan 04, 2026 version files (425.67 KB)
- README.md (7.16 KB)
- wiwa-96snp-genotypes.breeding.rubias_input.txt (174.85 KB)
- WIWA.env_vals.breed_nonbreeding.combined.csv (183.66 KB)
- WIWA.Genaro_samples107.Master_meta.txt (13.19 KB)
- WIWA107.FinalPanel.GP1_3.fixed.rubias_input.txt (46.81 KB)
Mar 03, 2026 version files (604.70 KB)
- 01-capu_breeding_pop_range.Rmd (11.26 KB)
- 02-capu_eBird_range_map.Rmd (16.33 KB)
- 03-capu_winter_pop_range.Rmd (9.94 KB)
- 04-climatic_niches.R (116.50 KB)
- 05-Capu_manuscript_figures.R (16.70 KB)
- README.md (15.46 KB)
- wiwa-96snp-genotypes.breeding.rubias_input.txt (174.85 KB)
- WIWA.env_vals.breed_nonbreeding.combined.csv (183.66 KB)
- WIWA.Genaro_samples107.Master_meta.txt (13.19 KB)
- WIWA107.FinalPanel.GP1_3.fixed.rubias_input.txt (46.81 KB)
Abstract
In the current study, we investigate in greater detail the potential for among-population differentiation in climate use and preferences through the annual cycle. Our aims were to: (i) analyze variation in the realized climatic niche among the six genetically distinct populations of Cardellina pusilla, and (ii) determine whether the populations within the species have a specific niche-following or niche-switching migratory behavior, in order to understand how this could shape the migratory connectivity and population trends of each population.
Dataset DOI: 10.5061/dryad.4j0zpc8qx
Description of the data and file structure
The winter (nonbreeding) samples were genotyped using SNPtype Assays (Fluidigm Inc.) on a Fluidigm™ 96.96 IFC controller with a panel of 96 loci that Ruegg et al. (2014) determined to be strongly diagnostic of distinct genetic units. We then used the EP1 Array Reader and Fluidigm’s automated Genotyping Analysis Software (Fluidigm Inc.) to call alleles with a confidence threshold of 90 %. Each genotype was also visually inspected for potential irregularities and manually called, and uncertain genotype calls were removed from the analysis. Samples with missing genotypes at > 10 % of SNP assays were removed from the analyses. Each nonbreeding individual sampled was assigned to one of the genetically distinct C. pusilla populations recovered by Ruegg et al. (2014) using the R package rubias (Moran & Anderson, 2018). This is a Bayesian hierarchical genetic identification approach that accounts for population structure and differences in the number of populations grouped as genetically differentiated. We considered an assignment robust when the posterior probability of assignment to the inferred collection was > 0.8. The six genetically distinct populations were Coastal California (CC), California Sierra (CS), Pacific Northwest (PNW), Western Boreal (WB), Basin Rockies (BR) and Eastern Boreal (EB), following Ruegg et al. (2020).
The assigned nonbreeding genetic samples were then combined with additional assigned nonbreeding samples from Ruegg et al. (2014). We then explored the relationships between the environmental conditions on the breeding and nonbreeding grounds of the six genetically distinct breeding populations by modeling their realized climatic niches.
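The missing-data rule described above (samples with missing genotypes at > 10 % of SNP assays are removed) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the flat two-alleles-per-assay list, and the use of None for a missing call are assumptions made for illustration only.

```python
def missing_fraction(alleles):
    """Return the fraction of assays with a missing genotype.

    `alleles` is a flat list of allele calls, two per assay (assumed layout).
    An assay counts as missing if either of its two alleles is None.
    """
    assays = list(zip(alleles[0::2], alleles[1::2]))
    missing = sum(1 for first, second in assays if first is None or second is None)
    return missing / len(assays)


def keep_sample(alleles, max_missing=0.10):
    """Keep a sample unless more than `max_missing` of its assays are missing."""
    return missing_fraction(alleles) <= max_missing


# Hypothetical samples with 10 assays each (20 allele calls).
sample_ok = [1, 1, 2, 3, None, None] + [4, 4] * 7    # 1 of 10 assays missing
sample_bad = [None, None, None, None] + [1, 2] * 8   # 2 of 10 assays missing

print(keep_sample(sample_ok), keep_sample(sample_bad))
```

A sample sitting exactly at 10 % missing is kept here, because the text removes only samples above that value.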
Files and variables
File: WIWA.Genaro_samples107.Master_meta.txt
Description: This file contains the metadata for the newly SNP genotyped individuals
Variables
- BGP_ID: This is the sample name
- Collection_Number: The museum ID corresponding to sample name
- Collection_Date: The date a genetic sample was collected in the field
- Country: The country the sample was collected from
- Sample_Type: The genetic material DNA was extracted from (feather, toe-pad, blood, etc)
- Species: Four-letter code abbreviation for Wilson's Warbler (WIWA)
- State: The State/Province the nonbreeding bird was sampled in
- NearTown: The name of a town near where the nonbreeding bird was sampled
- Latitude
- Longitude
- Sex: If known, the sex of the bird is indicated (M = male, F = female, U = undetermined)
- DAY: The day sample was collected (duplicate information in Collection_Date)
- MONTH: The month sample was collected (duplicate information in Collection_Date)
- YEAR: The year the sample was collected (duplicate information in Collection_Date)
- Stage: All samples were collected during the nonbreeding months (W=winter)
- Popassignment: Genetic assignment of each nonbreeding bird to one of the six genetically distinct breeding groups of Wilson's warbler. If the posterior probability of assignment was > 0.8, the breeding group is noted; if it was < 0.8, the assignment was denoted as uncertain and the individual was not used in subsequent analyses.
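The Popassignment rule can be sketched as a small Python function. This is an illustration only: the function name, the literal label "uncertain", and the treatment of a posterior of exactly 0.8 (which the description leaves unspecified) are assumptions, not details taken from the data files.

```python
def pop_assignment(repunit, posterior, threshold=0.8):
    """Return the breeding group if the assignment is robust, else 'uncertain'.

    Assumption: a posterior exactly equal to the threshold is treated as
    uncertain, since only values > 0.8 are described as robust.
    """
    return repunit if posterior > threshold else "uncertain"


# Hypothetical (group, posterior probability) pairs.
examples = [("PNW", 0.95), ("EB", 0.62), ("WB", 0.80)]
labels = [pop_assignment(group, prob) for group, prob in examples]
print(labels)
```

Individuals labeled uncertain would then be dropped before any downstream niche analysis, as described for this column.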
File: WIWA107.FinalPanel.GP1_3.fixed.rubias_input.txt
Description: The file includes the SNP genotypes for 107 nonbreeding birds for 96 SNP-type Fluidigm assays designed by Ruegg et al. (2014) and genotyped in this study. The format is specific for the R software package, rubias (Moran & Anderson, 2018).
Variables
- sample_type: In rubias formatting, all unknown samples, or in this case, nonbreeding birds, are mixture sample type (not reference sample type).
- repunit: Designated as NA because the breeding origin (i.e., the reporting unit or repunit) of nonbreeding birds is unknown and is what we want to assign
- collection: Although nonbreeding birds were sampled from many locations, the collection can simply be designated as mixture, because we want to determine the mixture proportions from distinct genetic clusters without any prior
- indiv: Sample name associated with genotypes
- Columns AB_AK_02.1-SW_PRBO_4.2: The remaining columns are the genotypes, 2 columns per named assay ("$assay_name".1 = first allele, and "$assay_name".2 = second allele). The numbers in the columns refer to nucleotides (1 = A, 2 = C, 3 = G, 4 = T, and NA is missing data).
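As a rough illustration of the column layout described above, the following Python sketch decodes the numeric allele codes into nucleotides and pairs the two columns of each assay. The tab delimiter, the helper name, and the demo row (including the sample name bird_001) are assumptions for illustration; only the column naming pattern and the 1 = A, 2 = C, 3 = G, 4 = T coding come from this README.

```python
import csv
import io

NUCLEOTIDES = {"1": "A", "2": "C", "3": "G", "4": "T"}
META_COLUMNS = ("sample_type", "repunit", "collection", "indiv")


def decode_genotypes(row):
    """Map each assay name to a pair of nucleotide letters (None if missing)."""
    loci = {}
    for column, value in row.items():
        if column in META_COLUMNS:
            continue
        # "AB_AK_02.1" -> assay "AB_AK_02", allele copy 1
        assay, _, copy = column.rpartition(".")
        pair = loci.setdefault(assay, [None, None])
        pair[int(copy) - 1] = NUCLEOTIDES.get(value)  # "NA" maps to None
    return {assay: tuple(pair) for assay, pair in loci.items()}


# Hypothetical two-assay excerpt; the real files hold 96 assays per bird.
demo = (
    "sample_type\trepunit\tcollection\tindiv\t"
    "AB_AK_02.1\tAB_AK_02.2\tSW_PRBO_4.1\tSW_PRBO_4.2\n"
    "mixture\tNA\tmixture\tbird_001\t1\t3\tNA\tNA\n"
)
rows = list(csv.DictReader(io.StringIO(demo), delimiter="\t"))
genotypes = decode_genotypes(rows[0])
print(genotypes)
```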
File: WIWA.env_vals.breed_nonbreeding.combined.csv
Description: For niche calculations, we extracted WorldClim historic climate data from the monthly time series of precipitation and temperature that spans from 1960 to 2018 with a spatial resolution of 2.5 arc minutes using R software.
Variables
- species: Scientific name of Wilson's warbler
- lon: Longitude
- lat: Latitude
- GeneticCluster: This is identical to the Popassignment column in the metadata; however, this file includes both breeding and nonbreeding genetic samples
- Stage: Breeding and nonbreeding
- precip: Mean precipitation extracted in R at the Latitude/Longitude during breeding season months (June, July) or nonbreeding season months (November, December, January and February).
- tmin: Mean minimum temperature extracted in R at the Latitude/Longitude during breeding season months (June, July) or nonbreeding season months (November, December, January and February).
- tmax: Mean maximum temperature extracted in R at the Latitude/Longitude during breeding season months (June, July) or nonbreeding season months (November, December, January and February).
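The seasonal values in these columns are means over the listed months. A minimal Python sketch of that averaging step follows; the monthly values are invented, the function name is hypothetical, and how the monthly time series was first summarized across years is not stated here, so that step is left out.

```python
BREEDING_MONTHS = (6, 7)             # June, July
NONBREEDING_MONTHS = (11, 12, 1, 2)  # November through February


def seasonal_mean(monthly_values, months):
    """Average a month -> value mapping over the given month numbers."""
    return sum(monthly_values[m] for m in months) / len(months)


# Hypothetical monthly tmax values (deg C) at a single location.
tmax_by_month = {1: 6.0, 2: 8.0, 6: 20.0, 7: 24.0, 11: 10.0, 12: 8.0}

breeding_tmax = seasonal_mean(tmax_by_month, BREEDING_MONTHS)
nonbreeding_tmax = seasonal_mean(tmax_by_month, NONBREEDING_MONTHS)
print(breeding_tmax, nonbreeding_tmax)
```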
File: wiwa-96snp-genotypes.breeding.rubias_input.txt
Description: The file includes the SNP genotypes for 407 breeding birds for 96 SNP-type Fluidigm assays designed by Ruegg et al. (2014) and genotyped in the 2014 study. The format is specific for the R software package, rubias (Moran & Anderson, 2018). These data are also available from the Dryad submission and GitHub page associated with Ruegg et al. (2014), but are included here as well.
Variables
- sample_type: Breeding birds are defined as reference birds in sample_type
- repunit: Refers to one of the six distinct breeding units delineated in Ruegg et al. (2020) (Coastal California: CoastalCA, Basin Rockies: RockyMtn, Eastern Boreal: Eastern, California Sierras: Sierra, Pacific Northwest: PacNorthwest, and Western Boreal).
- collection: The NearTown the sample was collected at
- indiv: Sample name associated with genotypes
- Columns AB_AK_02.1-SW_PRBO_4.2: The remaining columns are the genotypes, 2 columns per named assay ("$assay_name".1 = first allele, and "$assay_name".2 = second allele). The numbers in the columns refer to nucleotides (1 = A, 2 = C, 3 = G, 4 = T, and NA is missing data).
Files: WIWA_occurrences_and_environmental_data.zip (Supplementary data submitted to Zenodo)
Description: The folder includes occurrence data collected from eBird (Imani et al. 2025) records that we downloaded from the Global Biodiversity Information Facility, GBIF, (GBIF.org, 2022) in a folder named Occ, and data for three environmental variables (precipitation, temperature max and temperature min) in a folder named Env_vals for each distinct breeding group on breeding range and nonbreeding range, and the species as a whole (abbreviated WIWA or capu). Wilson's warbler's distinct breeding groups are abbreviated in the filenames as follows:
akal = Western Boreal, coastal = Coastal California, eastern = Eastern Boreal, pnw = Pacific Northwest, rocky = Basin Rockies, sierra = California Sierras
Wilson's warbler is abbreviated in filenames as WIWA or capu.
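For readers who script against these files, the filename abbreviations can be captured in a small lookup. This Python sketch assumes the abbreviation is the first underscore-separated token of a filename (as in akal_winter.csv); the helper name is hypothetical.

```python
GROUP_BY_PREFIX = {
    "akal": "Western Boreal",
    "coastal": "Coastal California",
    "eastern": "Eastern Boreal",
    "pnw": "Pacific Northwest",
    "rocky": "Basin Rockies",
    "sierra": "California Sierras",
}


def population_from_filename(filename):
    """Return the breeding group for a filename, or None if the prefix is unknown."""
    prefix = filename.split("_")[0].lower()
    return GROUP_BY_PREFIX.get(prefix)


print(population_from_filename("akal_winter.csv"))
print(population_from_filename("Eastern_complete_breeding_range.shp"))
```

Species-level files (prefixed WIWA or capu) return None with this lookup, since they do not belong to a single breeding group.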
GBIF.org (23 February 2022) GBIF Occurrence Download https://doi.org/10.15468/dl.my9axt
Imani J, Audette C, Auer T, Barker S, Barry J, Charnoky M, Crowley C, Curtis J, Davies I, Davis C, Diaz R, Feinberg A, Fink D, Ganger J, Garrett J, Gerbracht J, Hanks C, Hayes M, Hochachka W, Iliff M, Jordan A, Ligocki S, Long T, Morris W, Morrow S, Oldham L, Padilla Obregon F, Robinson O, Rodewald A, Ruiz-Gutierrez V, Schloss M, Smith A, Smith J, Stillman A, Stokowski M, Strimas-Mackey M, Sullivan B, Tedeschi A, Weber D, Wolf H, Wood C (2025). EOD – eBird Observation Dataset. Cornell Lab of Ornithology. Occurrence dataset https://doi.org/10.15468/aomfnb accessed via GBIF.org on 2026-02-20.
Variables in Occurrence data
- species: Cardellina pusilla
- lon: Longitude of occurrence
- lat: Latitude of occurrence
Variables in Env_vals data: Each file is labeled with the environmental variable extracted from WorldClim. The files include the row number in the first column and, in the second column, the mean value of the environmental variable for the breeding (BR) or nonbreeding (NB) months.
- prec: Precipitation (mm)
- tmax: Temperature max (°C)
- tmin: Temperature min (°C)
Files: 01-capu_breeding_pop_range.Rmd
Description: This RMarkdown script processes and analyzes Wilson's Warbler (Cardellina pusilla) presence data during the breeding season. The code reads in and cleans manually downloaded GBIF occurrences (GBIF.org, 2022), filters data for the breeding season (June-July), performs spatial coordinate cleaning, assigns occurrences to genetically distinct populations, and generates distribution maps for each population.
REQUIRED INPUT FILES:
1. ./Gbif_data/wiwa_gbif_data.csv (raw GBIF data)
2. WIWA_breeding_BGP_MZFC.csv (BGP data)
3. Shapefiles/WIWA_eBird/wiwa_eBird_range/wiwa.breeding_season.sf.WGS84.Ebird.shp
4. Shapefiles/WIWA_eBird/coastal_breeding_range.shp
5. Shapefiles/WIWA_eBird/WIWA_breednodes_eBird/WIWA.genoscape_brick_2.shp
6. Shapefiles/WIWA_eBird/rocky_breeding_range.shp
7. Shapefiles/WIWA_eBird/akal_breeding_range.shp
8. Shapefiles/WIWA_eBird/sierra_breeding_range.shp
9. Shapefiles/WIWA_eBird/Eastern_complete_breeding_range.shp
Files: 02-capu_eBird_range_map.Rmd
Description: This script generates seasonal range maps for Wilson's Warbler (Cardellina pusilla) using eBird Status and Trends data. The code downloads relative abundance data from eBird (eBird 2022), processes data by seasons (breeding, nonbreeding, migration), generates seasonal range polygons, and exports shapefiles for subsequent spatial analysis.
eBird. 2022. eBird: An online database of bird distribution and abundance [web application]. eBird, Cornell Lab of Ornithology, Ithaca, New York. Available: http://www.ebird.org. (Accessed: March 18, 2022).
EBIRD ACCESS KEY:
- Requires personal eBird Status and Trends access key
- Configure with: ebirdst::set_ebirdst_access_key("YOUR_KEY_HERE", overwrite = FALSE)
- The key in the code (bd4n6foc5985) is an example and needs to be replaced
INPUT FILES:
- Downloaded eBird Status and Trends data for "wlswar" (Wilson's Warbler)
- rnaturalearth shapefiles for map backgrounds
Files: 03-capu_winter_pop_range.Rmd
Description: This script processes and analyzes Wilson's Warbler (Cardellina pusilla) presence data during the winter season. The code cleans GBIF occurrences for winter months (Nov-Feb) (GBIF.org 2022), combines GBIF data with BGP (Bird Genoscape Project) data, filters occurrences within the eBird winter range, generates distribution maps for each genetically distinct population, and prepares data for ecological niche modeling.
REQUIRED INPUT FILES:
1. WIWA_winter_BGP_MZFC.csv (BGP winter data)
2. ./Gbif_data/wiwa_gbif_data.csv (raw GBIF data)
3. ./Shapefiles/WIWA_eBird/wiwa_eBird_range/wiwa.nonbreed_season.sf.WGS84.Ebird.shp
4. Population-specific CSV files (previously generated):
* akal_winter.csv
* coastal_winter.csv
* pnw_winter.csv
* sierra_winter.csv
* rocky_winter.csv
* eastern_winter.csv
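The month-based filtering that the range scripts describe (June-July for breeding, Nov-Feb for winter) can be sketched in Python as follows. This is not taken from the .Rmd files: the function name, the record layout, and the example records are hypothetical.

```python
from datetime import date


def season_for(observed_on):
    """Classify a date as 'breeding', 'nonbreeding', or None (other months)."""
    if observed_on.month in (6, 7):
        return "breeding"
    if observed_on.month in (11, 12, 1, 2):
        return "nonbreeding"
    return None


# Hypothetical occurrence records: (longitude, latitude, observation date).
records = [
    (-122.4, 45.5, date(2019, 6, 15)),
    (-99.1, 19.4, date(2020, 1, 3)),
    (-105.0, 39.7, date(2018, 9, 20)),  # migration period, dropped from both sets
]

winter_records = [r for r in records if season_for(r[2]) == "nonbreeding"]
breeding_records = [r for r in records if season_for(r[2]) == "breeding"]
print(len(winter_records), len(breeding_records))
```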
Files: 04-climatic_niches.R
Description: This R script performs comprehensive climatic niche analysis for Wilson's Warbler (Cardellina pusilla) populations across different genetic groups during breeding and wintering seasons. The analysis extracts climate variables from WorldClim data, calculates niche metrics, and assesses seasonal niche overlap between populations as part of a broader research project on the species' ecology and evolution.
REQUIRED INPUT FILES:
1. Occurrence Data:
- ./Occ_data/capu_breeding_GBIF_BGP_7kocc.csv (WIWA breeding occurrences)
- ./Occ_data/capu_wintering_GBIF_BGP_5kocc.csv (WIWA wintering occurrences)
- Population-specific occurrence files for 6 genetic populations
2. Climate Data (WorldClim):
- Breeding season (June-July): tmax, tmin, prec
- Wintering season (November-February): tmax, tmin, prec
- Located in ./Amb_data/WorldClim/
3. Shapefiles:
- Population breeding ranges
- WIWA eBird ranges
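This README does not state which overlap metric 04-climatic_niches.R uses. Purely as an illustration of what a niche overlap calculation can look like, the Python sketch below computes Schoener's D, a widely used overlap index, for two occupancy densities defined on the same grid of climate bins. The choice of metric, the function name, and the example densities are all assumptions.

```python
def schoeners_d(p, q):
    """Schoener's D between two non-negative densities on the same grid.

    Each density is normalized to sum to 1; D ranges from 0 (no overlap)
    to 1 (identical distributions).
    """
    total_p, total_q = sum(p), sum(q)
    return 1 - 0.5 * sum(abs(a / total_p - b / total_q) for a, b in zip(p, q))


# Hypothetical densities over four climate bins for two seasons.
breeding_density = [1, 1, 0, 0]
winter_density = [0, 1, 1, 0]

overlap = schoeners_d(breeding_density, winter_density)
print(overlap)
```

Under this index, a high breeding versus winter overlap for a population would be consistent with niche following, and a low overlap with niche switching.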
Files: 05-Capu_manuscript_figures.R
Description: This R script generates publication-quality figures for a manuscript on Wilson's Warbler (Cardellina pusilla) genetic populations. The code creates a main genoscape map with breeding and wintering distributions, generates kernel density plots for realized climatic niches, incorporates genetic population assignments with distinct colors, addresses reviewer comments by differentiating sampling cohorts, and produces final figures suitable for publication.
REQUIRED INPUT FILES:
1. Genetic population raster files (TIFF format):
* Multiple .tif files in: D:/Documentos/UNAM/Maestria/Proyecto/Metadata/ENM_Capu/Scape2Shape/rewiwashapefiles/
2. Occurrence data for all populations (CSV format):
* akal_breeding_final.csv, akal_wintering.csv
* coastal_breeding_final.csv, coastal_wintering.csv
* eastern_breeding_final.csv, eastern_wintering.csv
* pnw_breeding_final.csv, pnw_wintering.csv
* rocky_breeding_final.csv, rocky_wintering.csv
* sierra_breeding_final.csv, sierra_wintering.csv
3. Shapefiles:
* Eastern_complete_breeding_range.shp
* wiwa.nonbreed_season.sf.WGS84.Ebird.shp (wintering range)
4. RDS objects (kernel density estimations):
* 16 RDS files in ./R_objects_needed/ for niche density plots
Code/software
rubias (Moran & Anderson, 2018) is an R package that probabilistically assigns individuals to specific reporting units (i.e., genetic clusters diagnosed in the breeding region). It was originally used for fish stock identification, and we adapted it for bird genoscapes. We used this software specifically for genetic assignment of wintering birds to breeding origin, in order to subsequently define the wintering regions of Wilson's warbler genetic groups.
Access information
Other publicly accessible locations of the data:
Changes after Jan 4, 2026: The data added to Zenodo as a supplemental zipped file include eBird occurrence data accessed via GBIF and the environmental data used to estimate the breeding and nonbreeding climatic niches of each of Wilson's Warbler's distinct breeding groups. Here on Dryad, we also included 5 scripts to estimate climatic niches and create the manuscript figures. Previously published data in this repository included the genetic data for assignment of additional wintering birds to breeding population of origin, used to define the nonbreeding range necessary to calculate the nonbreeding climatic niche.
