Genetic diversity and the implications of captive rearing for a small population of Black‐tailed Godwits
Data files
Mar 12, 2025 version files 201.21 KB
- GenoptypeTableUK.xlsx (15.16 KB)
- Genotypes_Godwits_UK.csv (93.68 KB)
- GenotypeTableNetherlands_Trimbos.xlsx (16.97 KB)
- OneSibperNest.csv (402 B)
- README.md (12.63 KB)
- Samples_metadata_Godwits_UK.xlsx (50.84 KB)
- Table2.xlsx (11.53 KB)
Abstract
Headstarting is a captive rearing intervention where eggs are taken from the wild, artificially incubated and chicks are reared in captivity to fledging before being released into the wild. From an imperiled Black-tailed Godwit population in the UK, clutches have been collected for headstarting. While this conservation measure has reduced local extinction risk, it may have impacts on genetic diversity and population viability, especially when wild-sourced eggs must be collected from a small population. Here, we compare genetic diversity and relatedness of the UK population of 42 pairs with the much larger breeding population in the Netherlands (~30,000 pairs). We found that levels of heterozygosity and inbreeding are not currently compromised, but allelic richness in the UK population was 8.5% lower than in the Dutch population, and relatedness estimates suggest that 6.1% of the individuals in the UK population are closely related, at the level of half sibling and up, compared to 1.9% in the Dutch population. Increasing levels of relatedness could in the future deplete genetic variation further in the absence of immigration or wild-sourced eggs from other populations.
https://doi.org/10.5061/dryad.nvx0k6f37
Description of the data and file structure
Description of dataset
This dataset was generated to compare genetic diversity and relatedness of the UK breeding population of black-tailed godwits (42 pairs) with the much larger breeding population in the Netherlands (~30,000 pairs).
Individuals were genotyped at 9 microsatellite loci. The UK dataset consisted of 59 individuals, the Netherlands dataset of 38 individuals.
METHODOLOGICAL INFORMATION
Sampling: individual eggshells and their membranes were air-dried. From 9 chicks that died of natural causes, a leg was removed and stored in ethanol
DNA extraction: salt extraction protocol modified from Richardson et al. (2001)
PCR: microsatellite loci were amplified using a 2µl PCR technique (Kenta et al., 2008), using primers from Verkuil et al. (2009), tagged with FAM or HEX dyes and arranged into multiplex reactions
Genotyping: on an ABI 3730 sequencer
Allele sizes calling: Gene Mapper Version 3.7 (Applied Biosystems)
Hardy-Weinberg equilibrium (HWE) test: chi-squared tests (UK dataset) and GENEPOP (Raymond & Rousset, 1995; Netherlands dataset, see Trimbos et al. 2011)
Genetic diversity summary statistics for the UK and the Netherlands: allelic richness (Na), rarefied allelic richness (Ra), observed and expected heterozygosity (Ho and He), and the inbreeding coefficient (Fis); calculated in Pegas 1.2 (Paradis, 2010)
Ritland estimator (Ritland, 1996): pairwise relatedness between all individuals, using the related package (version 1.0) in R (Pew et al., 2015)
Wilcoxon tests: Statistical differences in genetic diversity and relatedness values between countries
Kruskal-Wallis test: Statistical differences in relatedness between pairs with zero, one or two individuals that recruited into the UK breeding population between 2019 and 2023
The R scripts have been uploaded and can be accessed on Zenodo: Godwit microsatellite analysis.r, aux_functions.R
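To make the summary statistics listed above concrete, the following is a minimal Python sketch of how they can be computed for a single locus. It is not the authors' code (their analyses use Pegas in R, see the scripts on Zenodo). The function name `locus_summary`, the example genotypes, and the choice of Nei's unbiased estimator for He and hypergeometric rarefaction for allelic richness are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def locus_summary(genotypes, rarefy_to):
    """Summarise one locus.

    genotypes: list of (allele1, allele2) tuples without missing values.
    rarefy_to: number of gene copies to rarefy to (at most 2 * len(genotypes)).
    """
    n_ind = len(genotypes)
    n_copies = 2 * n_ind
    counts = Counter(allele for pair in genotypes for allele in pair)

    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n_ind
    # Expected heterozygosity with a small-sample correction (assumed: Nei's estimator).
    he = (n_copies / (n_copies - 1)) * (1 - sum((c / n_copies) ** 2 for c in counts.values()))
    # Inbreeding coefficient: relative deficit of heterozygotes.
    fis = 1 - ho / he if he > 0 else float("nan")
    # Rarefied richness: expected number of alleles in a random subsample of gene copies.
    ar = sum(1 - comb(n_copies - c, rarefy_to) / comb(n_copies, rarefy_to) for c in counts.values())
    return {"Na": len(counts), "Ar": ar, "Ho": ho, "He": he, "Fis": fis}


# Invented genotypes (allele lengths in base pairs) for four individuals.
example = [(100, 100), (100, 104), (104, 108), (100, 104)]
summary = locus_summary(example, rarefy_to=8)
```

When `rarefy_to` equals the total number of gene copies, the rarefied richness equals the plain allele count. Rarefying both populations to the same smaller number of copies makes their richness values comparable despite different sample sizes.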
FILE LIST
Genotypes_Godwits_UK.csv
Samples_metadata_Godwits_UK.xlsx
Genotypes_Godwits_Netherlands_Trimbos.xlsx
GenoptypeTableUK.xlsx
OneSibperNest.csv
Table2.xlsx
aux_functions.R
Godwit microsatellite analysis.r
Files and variables
FILE 1
DATA-SPECIFIC INFORMATION FOR: Genotypes_Godwits_UK.csv
1. Number of variables: 22
2. Number of cases/rows: 789
Row 1: variable names
Row 2-789: 788 cases
3. Variable List:
- Sample File (fasta with allele call; value = [ID].fsa)
- Sample Name (value = sample DNA extraction ID)
- Run Name (plate name for sequencing run; value = Plate [#])
- Panel (Panel is group of markers; value = panel ID)
- Marker (value = name of PCR primer set)
- Dye (value = color label of marker for sequencing)
- Allele 1 (value = allele of microsatellite locus)
- Allele 2 (value = allele of microsatellite locus)
- Size 1 (value = exact length of allele 1)
- Size 2 (value = exact length of allele 2)
- Height 1 (value = height of the peak indicating allele 1)
- Height 2 (value = height of the peak indicating allele 2)
- Peak Area 1 (value = width of the peak indicating allele 1)
- Peak Area 2 (value = width of the peak indicating allele 2)
- Data Point 1 (value = size standard allele 1)
- Data Point 2 (value = size standard allele 2)
- AE (value = Process Quality Values (PQV))
- OS (value = Process Quality Values (PQV))
- BIN (value = Process Quality Values (PQV))
- PHR (value = Process Quality Values (PQV))
- XTLK (value = Process Quality Values (PQV))
- GQ (value = overall Genotype Quality)
4. Missing data codes: ""
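Because this file holds one row per sample and marker, a common first step is to pivot it into one record per sample. The sketch below does this in Python with the column names listed above. The embedded excerpt is invented, the function name `long_to_wide` is hypothetical, and the handling of repeated sample and marker combinations (later rows overwrite earlier ones) is an assumption, not the authors' procedure.

```python
import csv
import io

# Invented excerpt with four of the 22 columns; real values will differ.
SAMPLE = """Sample Name,Marker,Allele 1,Allele 2
E001,LIM10,152,156
E001,LIM11,201,
E002,LIM10,152,152
"""


def long_to_wide(handle):
    """Return {sample: {marker: (allele1, allele2)}} from a long-format export.

    Empty cells (the file's missing data code) become None. The README does not
    say whether an empty Allele 2 is a missing call or a homozygote, so no
    interpretation is applied here.
    """
    table = {}
    for row in csv.DictReader(handle):
        alleles = tuple(row[key] or None for key in ("Allele 1", "Allele 2"))
        # Assumption: if a sample/marker pair occurs more than once, keep the last row.
        table.setdefault(row["Sample Name"], {})[row["Marker"]] = alleles
    return table


wide = long_to_wide(io.StringIO(SAMPLE))
```

For the real file, replace `io.StringIO(SAMPLE)` with `open("Genotypes_Godwits_UK.csv", newline="")`.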
FILE 2 (three sheets):
DATA-SPECIFIC INFORMATION FOR: Samples_metadata_Godwits_UK.xlsx; sheet: Master
1. Number of variables: 18
2. Number of cases/rows: 261
Row 1: variable names
Row 2-261: 260 cases
3. Variable List:
- Year (value = collection year of sample)
- Sample type (value = description of sample type)
- Sample # (value = sample ID)
- Sample (value = sample process status)
- SUCCESSFUL Extractions-SampleID (value = sample ID)
- UNSUCCESSFUL Extractions-SampleID (value = sample ID)
- Location (value = storage location sample: UEA = University of East Anglia)
- Colour combo (value = colour ring ID for individual bird)
- Source location (nest location; value = area name)
- Source location FIELD (nest location; value = field ID)
- Nest # (value = nest ID)
- Egg # (value = egg ID)
- year:nest (value = variable of combined collection year and nest ID)
- year:nest:egg (value = variable of combined collection year and nest ID and egg ID)
- Wild or H-S (nest type; value = nest in wild (Wild) or headstarted (H-S))
- Parent Male (value = colour ring ID father)
- Parent Female (value = colour ring ID mother)
- RECRUIT:STATUS (recruited in population; value = RECRUITED/NOT-RECRUITED/NA)
4. Missing data codes: ""
DATA-SPECIFIC INFORMATION FOR: Samples_metadata_Godwits_UK.xlsx; sheet: Extractions
1. Number of variables: 3
2. Number of cases/rows: 213
Row 1: variable names
Row 2-213: 212 reactions
3. Variable List:
- Extraction (value = ID DNA extraction)
- SampleID (value = sample ID)
- Worked? (value = sample process status)
4. Missing data codes: ""
DATA-SPECIFIC INFORMATION FOR: Samples_metadata_Godwits_UK.xlsx; sheet: Extraction details
1. Number of variables: 9
2. Number of cases/rows: 37
Row 1: variable names
Row 2-37: 36 extractions
3. Variable List:
- Extraction Number (value = ID DNA extraction)
- Sample ID (value = sample ID)
- Ring# (value = metal ring ID for individual bird)
- Colours (value = colour ring ID for individual bird)
- Source location (nest location, value = field ID)
- Nest (value = nest ID)
- Egg (value = egg ID)
- Wild or HS (nest type; value = nest in wild (Wild), or headstarted nest (HS))
- Parent Female (value = colour ring ID mother)
4. Missing data codes: ""
FILE 3 (two sheets):
DATA-SPECIFIC INFORMATION FOR: Genotypes_Godwits_Netherlands_Trimbos.xlsx; sheet: Sample info Godwit Ned
1. Number of variables: 12
2. Number of cases/rows: 39
Row 1: variable names
Row 2-39: 38 samples
3. Variable List:
- Sample Number (value = field sample ID)
- Ringnumber (value = metal ring ID for individual bird)
- Location ID (value = code assigned to location)
- Loc (nest location; value = area name)
- Nest ID (value = code assigned to nest)
- Age (age of chicks; value = days)
- Year of 1st Bloodsample (value = year bird was sampled)
- Remarks
- Min dis between samples km (minimum distance between sampling locations; value = kilometer (km))
- Max dis between samples km (maximum distance between sampling locations; value = kilometer (km))
- Longitude (value = longitude of nest location)
- Latitude (value = latitude of nest location)
4. Missing data codes: 99
DATA-SPECIFIC INFORMATION FOR: Genotypes_Godwits_Netherlands_Trimbos.xlsx; sheet: Gescoorde Msats Friesland
1. Number of variables: 20
2. Number of cases/rows: 39
Row 1: variable names
Row 2-39: 38 samples
3. Variable List:
- Sample Name (value = field sample ID)
- Popinfo (population location information; Friesland only; value = FRS)
- Lim10 (marker ID, allele 1; value = length in base pairs)
- Lim10-2 (marker ID, allele 2; value = length in base pairs)
- Lim11 (marker ID, allele 1; value = length in base pairs)
- Lim11-2 (marker ID, allele 2; value = length in base pairs)
- Lim12a (marker ID, allele 1; value = length in base pairs)
- Lim12a-2 (marker ID, allele 2; value = length in base pairs)
- Lim25 (marker ID, allele 1; value = length in base pairs)
- Lim25-2 (marker ID, allele 2; value = length in base pairs)
- Lim3 (marker ID, allele 1; value = length in base pairs)
- Lim3-2 (marker ID, allele 2; value = length in base pairs)
- Lim30 (marker ID, allele 1; value = length in base pairs)
- Lim30-2 (marker ID, allele 2; value = length in base pairs)
- Lim33 (marker ID, allele 1; value = length in base pairs)
- Lim33-2 (marker ID, allele 2; value = length in base pairs)
- Lim5 (marker ID, allele 1; value = length in base pairs)
- Lim5-2 (marker ID, allele 2; value = length in base pairs)
- Lim8 (marker ID, allele 1; value = length in base pairs)
- Lim8-2 (marker ID, allele 2; value = length in base pairs)
4. Missing data codes: 999
FILE 4:
DATA-SPECIFIC INFORMATION FOR: GenoptypeTableUK.xlsx
1. Number of variables: 22
2. Number of cases/rows: 60
Row 1: variable names
Row 2-60: 59 samples
3. Variable List:
- Sample Name (sample ID; value = extraction ID)
- PopInfo (population location information; UK only; value = 1)
- LIM10_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM10_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM11_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM11_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM12a_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM12a_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM22_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM22_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM25_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM25_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM3_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM3_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM30_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM30_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM33_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM33_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM5_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM5_Allele 2 (marker ID, allele 2; value = length in base pairs)
- LIM8_Allele 1 (marker ID, allele 1; value = length in base pairs)
- LIM8_Allele 2 (marker ID, allele 2; value = length in base pairs)
4. Missing data codes: NA
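The UK and Netherlands genotype tables name their allele columns differently (`LIM10_Allele 1` versus `Lim10` and `Lim10-2`) and use different missing data codes (NA versus 999), and the variable lists above show LIM22 only in the UK table. A minimal Python sketch of how the two layouts could be harmonised before a comparison follows. The helper names `parse_column` and `clean_allele` are hypothetical, and the header lists are shortened excerpts.

```python
import re

UK_PATTERN = re.compile(r"^LIM(\d+[a-z]?)_Allele ([12])$")
NL_PATTERN = re.compile(r"^Lim(\d+[a-z]?)(-2)?$")


def parse_column(name):
    """Map an allele column header to (locus, allele_index); None for other columns."""
    uk = UK_PATTERN.match(name)
    if uk:
        return "LIM" + uk.group(1), int(uk.group(2))
    nl = NL_PATTERN.match(name)
    if nl:
        return "LIM" + nl.group(1), 2 if nl.group(2) else 1
    return None


def clean_allele(value, missing_codes=("NA", "999", "")):
    """Convert a cell to an integer allele length, or None for a missing data code."""
    text = str(value).strip()
    return None if text in missing_codes else int(float(text))


# Shortened header excerpts taken from the two variable lists.
uk_headers = ["Sample Name", "PopInfo", "LIM10_Allele 1", "LIM10_Allele 2",
              "LIM12a_Allele 1", "LIM12a_Allele 2", "LIM22_Allele 1", "LIM22_Allele 2"]
nl_headers = ["Sample Name", "Popinfo", "Lim10", "Lim10-2", "Lim12a", "Lim12a-2"]

uk_loci = {parsed[0] for parsed in map(parse_column, uk_headers) if parsed}
nl_loci = {parsed[0] for parsed in map(parse_column, nl_headers) if parsed}
shared = sorted(uk_loci & nl_loci)  # loci available in both tables
```

Restricting a between-country comparison to `shared` keeps a locus that is present in only one table from entering it.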
FILE 5:
DATA-SPECIFIC INFORMATION FOR: OneSibperNest.csv
1. Number of variables: 2
2. Number of cases/rows: 60
Row 1: variable names
Row 2-60: 59 samples
3. Variable List:
- Sample Name (sample ID; value = extraction ID)
- InAnalyses (selected for analyses, no/yes; value = 0 or 1)
4. Missing data codes: no missing data
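The `InAnalyses` flag can be used to subset a genotype table to the selected samples. The Python sketch below assumes the two column names listed above. The sample IDs, the genotype values and the function name `selected_samples` are invented for illustration.

```python
import csv
import io

# Invented excerpt in the layout described above.
FLAGS = """Sample Name,InAnalyses
E001,1
E002,0
E003,1
"""


def selected_samples(handle):
    """Return the set of sample names flagged with InAnalyses == 1."""
    return {
        row["Sample Name"]
        for row in csv.DictReader(handle)
        if row["InAnalyses"].strip() == "1"
    }


keep = selected_samples(io.StringIO(FLAGS))

# Hypothetical genotype records keyed by sample name (one locus shown).
genotypes = {"E001": (152, 156), "E002": (152, 152), "E003": (148, 156)}
subset = {name: alleles for name, alleles in genotypes.items() if name in keep}
```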
FILE 6:
DATA-SPECIFIC INFORMATION FOR: Table2.xlsx
1. Number of variables: 11
2. Number of cases/rows: 10
Row 1: variable names
Row 2-10: summary stats for 9 loci
3. Variable List:
- Locus (value = marker name)
- NaUK (value = number of alleles (Na) in UK sample)
- ArUK (value = number of alleles after rarefaction (Ar) in UK sample)
- HeUK (value = expected heterozygosity (He) in UK sample)
- HoUK (value = observed heterozygosity (Ho) in UK sample)
- FisUK (value = inbreeding coefficient (Fis) in UK sample)
- NaFRS (value = number of alleles (Na) in Friesland sample)
- ArFRS (value = number of alleles after rarefaction (Ar) in Friesland sample)
- HeFRS (value = expected heterozygosity (He) in Friesland sample)
- HoFRS (value = observed heterozygosity (Ho) in Friesland sample)
- FisFRS (value = inbreeding coefficient (Fis) in Friesland sample)
4. Missing data codes: no missing data
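The per-locus columns in this table allow a relative difference between the two samples to be computed, such as the percentage by which rarefied allelic richness is lower in the UK. The Python sketch below shows the arithmetic on invented values. Whether the authors averaged per-locus values in exactly this way is an assumption, and the function name `percent_lower` is hypothetical.

```python
from statistics import mean


def percent_lower(values_small, values_large):
    """Percentage by which the mean of values_small falls below the mean of values_large."""
    return (mean(values_large) - mean(values_small)) / mean(values_large) * 100


# Invented per-locus rarefied richness values standing in for the ArUK and ArFRS columns.
ar_uk = [4.0, 5.0]
ar_frs = [5.0, 5.0]
difference = percent_lower(ar_uk, ar_frs)  # means 4.5 and 5.0 give 10 percent
```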
FILE 7:
DATA-SPECIFIC INFORMATION FOR: aux_functions.R
R script
FILE 8:
DATA-SPECIFIC INFORMATION FOR: Godwit microsatellite analysis.r
R script
Code/software
SOFTWARE
- Gene Mapper Version 3.7 (Applied Biosystems)
- 'related' package (version 1.0) (Pew et al., 2015) in R version 4.0.2
- 'Pegas' package (version 1.2) (Paradis, 2010)
- 'adegenet' package (in R version 4.0.2)
- 'ape' package (in R version 4.0.2)
- GENEPOP (Raymond & Rousset, 1995; Netherlands dataset, see Trimbos et al. 2011).
CODE
SEE FILE 7: aux_functions.R
SEE FILE 8: Godwit microsatellite analysis.r
Access information
Other publicly accessible locations of the data:
- NA
Data was derived from the following sources:
- NA
Standard genotyping at multiple microsatellite loci.
