Data from: Temporal dynamics of selection on early-life phenotypic plasticity in seasonal migration versus residence
Data files
Dec 04, 2025 version files (7.85 GB total)
- crh_age_v2.txt (292.87 KB)
- crh_v1.txt (258.94 KB)
- MS_CMR_Age_output.RData (7.61 GB)
- MS_CMR_output.RData (79.36 MB)
- MS_SimNull1_output.RData (78.98 MB)
- MS_SimNull2_output.RData (78.98 MB)
- README.md (4.93 KB)
- SimulatedData_seed1.RData (1.31 MB)
- SimulatedData_seed2.RData (1.31 MB)
Abstract
In this study, we quantified selection on early-life plasticity in the ecologically critical trait of seasonal migration versus residence, by fitting a novel multi-state model to spatio-seasonal resighting data from 13 newly-fledged cohorts of partially migratory European shags (Gulosus aristotelis).
To obtain the required data, during the 2010-2022 breeding seasons (April-August; 13 cohorts), all breeding attempts on the Isle of May, Scotland (hereafter IoM) were monitored, and >95% of fledglings were marked with uniquely coded metal and colour rings. During the 2010-2024 non-breeding seasons, we undertook regular (approximately biweekly) resighting surveys on IoM (and adjacent day roosts) to detect current residents, and at core roost sites spanning the north-east UK coast (predominantly ca. 100-500 km from IoM) to detect current migrants.
We formulated individual encounter histories as occasion-specific summaries of resightings of 10,788 colour-ringed shags fledged during 2010-2022.
We defined five primary temporal ‘occasions’ spanning the natal breeding season (June-July, when chicks are typically ringed before fledging) to the following March (~8 months post-fledging; Figure 2), plus an ‘ever after’ occasion.
The model outputs, i.e. posterior samples of the model parameters and derived parameters, are the primary results of our analyses.
Dataset DOI: 10.5061/dryad.sj3tx96h8
Description of the data and file structure
In this study, capture-recapture data were analysed using a Bayesian multi-state capture-mark-recapture (CMR) statistical model coded in the Stan language and run in R using the rstan package.
Files and variables
File: crh_v1.txt
Description: Data file with encounter histories for all individuals (used for main model analyses)
Variables
- BirdID: identity of the individual
- 1: resighting history of the individual in occasion 1, with cell values 1 to 4 representing the four observation events
- 2: resighting history of the individual in occasion 2, with cell values 1 to 4 representing the four observation events
- 3: resighting history of the individual in occasion 3, with cell values 1 to 4 representing the four observation events
- 4: resighting history of the individual in occasion 4, with cell values 1 to 4 representing the four observation events
- 5: resighting history of the individual in occasion 5, with cell values 1 to 4 representing the four observation events
- 6: resighting history of the individual in occasion 6, with cell values 1 to 4 representing the four observation events
- HatchYear: natal year i.e. cohort of the individual
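As a minimal sketch of how this file might be inspected in R, the function below tabulates how often each observation event (1 to 4) occurs in each occasion column. The delimiter and header layout of crh_v1.txt are not stated here, so the `read.table()` call is an assumption and is left commented out. `count_events()` and the toy data frame are hypothetical and only illustrate the column layout described above.

```r
# Tabulate observation events (1-4) per occasion column ("1".."6").
# `count_events` is a hypothetical helper, not part of the archived code.
count_events <- function(crh, occasions = as.character(1:6), events = 1:4) {
  counts <- sapply(occasions, function(occ) {
    table(factor(crh[[occ]], levels = events))
  })
  rownames(counts) <- paste0("event_", events)
  counts
}

# Assumed call for the real file (a whitespace-delimited file with a header
# row is a guess; adjust `sep` if the file is formatted differently):
# crh <- read.table("crh_v1.txt", header = TRUE, check.names = FALSE)

# Toy data with the documented columns; all values are invented.
toy <- data.frame(
  BirdID = c("A", "B", "C"),
  `1` = c(1, 1, 1), `2` = c(2, 4, 1), `3` = c(4, 4, 2),
  `4` = c(4, 3, 2), `5` = c(4, 4, 4), `6` = c(1, 4, 4),
  HatchYear = c(2010, 2010, 2011),
  check.names = FALSE
)
count_events(toy)
```

Using `check.names = FALSE` keeps the numeric column names "1" to "6" as they are, rather than letting R rename them to "X1" to "X6".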
File: crh_age_v2.txt
Description: Data file with encounter histories for all individuals and additional column with age information (used for supplemental analyses)
Variables
- BirdID: identity of the individual
- 1: resighting history of the individual in occasion 1, with cell values 1 to 4 representing the four observation events
- 2: resighting history of the individual in occasion 2, with cell values 1 to 4 representing the four observation events
- 3: resighting history of the individual in occasion 3, with cell values 1 to 4 representing the four observation events
- 4: resighting history of the individual in occasion 4, with cell values 1 to 4 representing the four observation events
- 5: resighting history of the individual in occasion 5, with cell values 1 to 4 representing the four observation events
- 6: resighting history of the individual in occasion 6, with cell values 1 to 4 representing the four observation events
- HatchYear: natal year i.e. cohort of the individual
- HatchDate_distance: age (in days) of each individual on August 15th of its natal year (i.e. the number of days between the hatching date of the individual's brood and August 15th of the same year)
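The definition of HatchDate_distance can be illustrated with a short R sketch. The helper name `age_on_aug15()` and the example dates are hypothetical; the archived file already contains the computed values, so this only shows the date arithmetic implied by the definition.

```r
# Age (in days) on 15 August of the natal year, given a brood hatching date.
# `age_on_aug15` is a hypothetical helper name used for illustration only.
age_on_aug15 <- function(hatch_date) {
  hatch_date <- as.Date(hatch_date)
  aug15 <- as.Date(paste0(format(hatch_date, "%Y"), "-08-15"))
  as.integer(aug15 - hatch_date)
}

# Invented example: 11 days left in May + 30 (June) + 31 (July) + 15 (August)
age_on_aug15("2015-05-20")  # 87
```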
File: SimulatedData_seed1.RData
Description: simulated encounter history dataset 1 used to run confirmatory models
File: SimulatedData_seed2.RData
Description: simulated encounter history dataset 2 used to run confirmatory models
File: MS_CMR_output.RData
Description: Output from the main multi-state CMR memory model using 'crh_v1' as data
File: MS_CMR_Age_output.RData
Description: Output from the multi-state CMR memory model accounting for age using 'crh_age_v2' as data (Warning: this file is 7.61 GB)
File: MS_SimNull1_output.RData
Description: Output from the multi-state CMR memory model using 'SimulatedData_seed1' as data
File: MS_SimNull2_output.RData
Description: Output from the multi-state CMR memory model using 'SimulatedData_seed2' as data
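The object names stored inside the .RData files are not listed here, so one cautious way to explore an output file is to load it into a separate environment and list what it contains. The sketch below assumes nothing about the archived contents: `peek_rdata()` is a hypothetical helper, and the demonstration uses a temporary file with invented objects.

```r
# List the objects stored in an .RData file without touching the global workspace.
# `peek_rdata` is a hypothetical helper name.
peek_rdata <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)  # load() returns the names of the restored objects
  data.frame(
    object = obj_names,
    class = vapply(obj_names, function(nm) class(get(nm, envir = env))[1], character(1)),
    row.names = NULL
  )
}

# Self-contained demonstration on a temporary file with invented contents.
tmp <- tempfile(fileext = ".RData")
toy_draws <- matrix(0, nrow = 2, ncol = 2)
toy_note <- "example"
save(toy_draws, toy_note, file = tmp)
peek_rdata(tmp)

# Assumed usage with an archived file:
# peek_rdata("MS_CMR_output.RData")
```

Note that `load()` still reads the whole file into memory, so inspecting MS_CMR_Age_output.RData this way is likely to need a machine with ample RAM.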
Code/software
All code is archived in the Zenodo repository linked to this Dryad entry.
The Bayesian MS-CMR model was implemented in Stan v. 2.26.1, using the rstan package v. 2.26.13 in R 4.2.2. We ran four chains, each comprising 1,000 warm-up and 2,000 monitored iterations, yielding 8,000 posterior samples for inference.
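The sampler settings above map onto rstan arguments roughly as sketched below. In rstan, `iter` is the total number of iterations per chain including warm-up, so 1,000 warm-up plus 2,000 monitored iterations corresponds to `iter = 3000`. The `stan()` call is left commented out because it needs rstan, the archived Stan file and a data list; `stan_data` is a hypothetical name, and the authors' actual call is in Run_Stan_models.R and may differ.

```r
# Sampler settings as described in the text.
n_chains    <- 4
n_warmup    <- 1000
n_monitored <- 2000

# rstan's `iter` counts warm-up and monitored iterations together.
n_iter  <- n_warmup + n_monitored
n_draws <- n_chains * n_monitored  # total posterior samples kept for inference

c(iter = n_iter, posterior_draws = n_draws)

# Assumed call (requires rstan; `stan_data` is a hypothetical data list):
# fit <- rstan::stan(
#   file   = "MS_CMR_Memory.stan",
#   data   = stan_data,
#   chains = n_chains,
#   warmup = n_warmup,
#   iter   = n_iter
# )
```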
NB: Full understanding of these files requires careful reading of the paper and supplementary materials.
The following files include the Stan models used for the analysis:
- Main multi-state CMR memory model (used for main inference in the paper): MS_CMR_Memory.stan
- Multi-state CMR memory model accounting for age effects (used as post-hoc model in supporting information): MS_CMR_Memory_AgeEffect.stan
- Stan script to simulate encounter histories given simulated state-transition and observation probability matrices: simulate_data_stan_v2.stan
The following files include the R scripts used to run the Stan models:
- R script to run all Stan models (i.e. main model using real data, main model using simulated datasets, and model accounting for age effects): Run_Stan_models.R
- R script to reproduce main and supporting figures, and all result parameters and tables: Results_Figures.R
- R script to define probabilities and simulate encounter histories using the Stan script: Simulate_EH.R
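To illustrate the general idea of simulating encounter histories from state-transition and observation probability matrices, here is a minimal R sketch. It is not the archived Stan simulation: it uses a simple first-order process without the memory structure of the main model, and the function name `simulate_eh()`, the three states, the three events and all probabilities are hypothetical toy values.

```r
# Simulate multi-state encounter histories from a state-transition matrix
# (rows = current state) and an observation matrix (rows = state, cols = event).
# All states, events and probabilities below are toy values.
simulate_eh <- function(n_ind, n_occ, init_state, trans, obs) {
  stopifnot(all(abs(rowSums(trans) - 1) < 1e-8),
            all(abs(rowSums(obs) - 1) < 1e-8))
  n_states <- nrow(trans)
  n_events <- ncol(obs)
  eh <- matrix(NA_integer_, nrow = n_ind, ncol = n_occ)
  for (i in seq_len(n_ind)) {
    state <- init_state
    for (t in seq_len(n_occ)) {
      if (t > 1) {
        # Move to the next state according to the current state's row.
        state <- sample.int(n_states, size = 1, prob = trans[state, ])
      }
      # Draw the observation event given the current state.
      eh[i, t] <- sample.int(n_events, size = 1, prob = obs[state, ])
    }
  }
  eh
}

# Hypothetical states: 1 = resident, 2 = migrant, 3 = dead.
trans <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.8, 0.1,
                  0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)
# Hypothetical events: 1 = seen as resident, 2 = seen as migrant, 3 = not seen.
obs <- matrix(c(0.6, 0.0, 0.4,
                0.0, 0.5, 0.5,
                0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)

set.seed(1)
eh <- simulate_eh(n_ind = 5, n_occ = 6, init_state = 1, trans = trans, obs = obs)
eh
```

Each row of the result is one individual's encounter history and each column one occasion, mirroring the layout of the occasion columns in the data files.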
Data collection
A partially migratory shag population breeding on the Isle of May National Nature Reserve (hereafter ‘IoM’, Scotland, 56°11′N, 2°33′W) provides a highly relevant and tractable system to quantify the temporal dynamics of survival selection on early-life plasticity in migration versus residence.
To obtain the required data, during the 2010-2022 breeding seasons (April-August; 13 cohorts), all breeding attempts on IoM were monitored, and >95% of fledglings were marked with uniquely coded metal and colour rings, field-readable from ≤150 m with a telescope (533–1064 ringed individuals/year, mean = 818).
Since shags return to shore daily, marked individuals can be observed at coastal roost sites throughout the year, allowing direct observation of individuals’ current locations, and hence current resident or migrant status.
Accordingly, during the 2010-2024 non-breeding seasons, we undertook regular (approximately biweekly) resighting surveys on IoM (and adjacent day roosts) to detect current residents (defined as individuals roosting on IoM at night), and at core roost sites spanning the north-east UK coast (predominantly ca. 100-500 km from IoM) to detect current migrants (defined as individuals not returning to IoM at night; ESM A1). These migrant sites are reachable within 1-2 days by juvenile shags, and encompass their main winter range. Ad hoc resightings at other sites (spanning ca. 800 km) were also collected, including citizen science contributions (ESM A1).
Overall, this generated a dataset of 32,376 first-year resightings spanning 10,788 colour-ringed shags fledged during 2010-2022.
