Independent avian epigenetic clocks for aging and development
Data files
Feb 07, 2025 version, 49.89 MB in total:
- barcode_files_epiGBS2_2018_B1-B8.xlsx (28.24 KB)
- barcode_files_epiGBS2_2018_B9-B10.xlsx (13.08 KB)
- barcode_files_epiGBS2_2019_B13-B18.xlsx (22.80 KB)
- barcode_files_epiGBS2_2020_B19-B23.xlsx (20.40 KB)
- Extern_MethData_Brood_Test_DevelopClock.txt (193.06 KB)
- Extern_MethData_Train_DevelopClock.txt (8.29 MB)
- Extern_Samples_AgeingClock.txt (11.42 KB)
- Extern_Samples_DevelopClock.txt (92.18 KB)
- MethylationData_AgeingClock.txt (41.21 MB)
- README.md (4.43 KB)
Abstract
Information on individual age is a fundamental aspect in many ecological and evolutionary studies. However, accurate and non-lethal methods that can be applied to estimate the age of wild animals are often absent. Furthermore, since the process of aging is accompanied by a physical decline and the deterioration of biological functions, the biological age often deviates from the chronological age. Epigenetic marks are widely suggested to be associated with this age-related physical decline, and especially changes in DNA methylation are suggested to be reliable age-predictive biomarkers. Here, we developed separate epigenetic clocks for aging and development in a small passerine bird, the great tit (Parus major). The aging clock was constructed and evaluated using erythrocyte DNA methylation data of 122 post-fledging individuals, and the developmental clock using 67 pre-fledging individuals from a wild population. Using a leave-one-out cross validation approach, we were able to accurately predict the ages of individuals with median absolute deviations of 0.40 years for the aging and 1.06 days for the development clock. Moreover, using existing data from a brood-size manipulation we show that nestlings from reduced broods are estimated to be biologically older compared to control nestlings, while they are expected to have higher fitness. These epigenetic clocks provide further evidence that, as observed in mammals, changes in DNA methylation of certain CpG sites are highly correlated with chronological age in birds and this opens up new avenues for broad applications in behavioural and evolutionary ecology.
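The leave-one-out evaluation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the actual training and testing code is in "Extern_Clock_Analysis.pdf", and a single-predictor least-squares fit is used here only as a stand-in. The function names and the toy values are hypothetical. The sketch holds out each sample once, refits on the rest, and reports the median absolute difference between predicted and chronological age.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_median_abs_error(xs, ys):
    """Hold out each sample once, refit on the others, predict the held-out age."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predicted = intercept + slope * xs[i]
        errors.append(abs(predicted - ys[i]))
    return median(errors)


# Made-up toy values (NOT from the dataset): methylation ratio at one
# hypothetical CpG site and the chronological age in days of each sample.
methylation = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
age_days = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
print(loocv_median_abs_error(methylation, age_days))
```

A real clock would use many CpG sites and a multivariate model; only the hold-out loop and the error summary carry over from this sketch.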
Description of data and file structure
- "barcode_files_epiGBS2_2018_B1-B8.xlsx", "barcode_files_epiGBS2_2018_B9-B10.xlsx", "barcode_files_epiGBS2_2019_B13-B18.xlsx" and "barcode_files_epiGBS2_2020_B19-B23.xlsx" contain the unique barcode combinations needed to demultiplex the raw DNA methylation (epiGBS2) data on NCBI (see below). Column explanation: Flowcell = sequencing flowcell; Lane = sequencing lane; Barcode_R1 = barcode ligated to the 5 prime to 3 prime DNA strands (i.e. R1 or BA); Barcode_R2 = barcode ligated to the 3 prime to 5 prime DNA strands (i.e. R2 or CO); Sample = individual; ENZ_R1 = restriction enzyme; ENZ_R2 = restriction enzyme; Wobble_R1 = length (in nucleotides) of the random nucleotide sequence (i.e. UMI) in the adapter of the 5 prime to 3 prime DNA strands; Wobble_R2 = length (in nucleotides) of the random nucleotide sequence (i.e. UMI) in the adapter of the 3 prime to 5 prime DNA strands; Species = study species. The year in the file name is the year of sample collection (i.e. of the experimental study).
- "Extern_Samples_DevelopClock.txt" contains the information on the pre-fledging samples. Column explanation: Sample_ID = unique ID for each sample; RNR = unique ID for each individual; Age_in_Days = chronological age in days; Mol_sex = sex of the pre-fledging individual; Experiment = experiment in which the sample was used; Treatment = treatment within that experiment; Filename = name of the methylation call file; nr_cpg = whether the file contained more or fewer than 3m unique rows; Genetic_ID = brood of origin for individuals from the brood-size manipulation experiment.
- "Extern_MethData_Train_DevelopClock.txt" contains the Sample_IDs of the pre-fledging samples and the corresponding methylation ratios of the 11 thousand CpGs that were used to train the Developmental Clock. Column explanation: Sample_ID = unique ID for each sample; each remaining column represents one CpG site.
- "Extern_MethData_Brood_Test_DevelopClock.txt" contains the methylation ratios of the 115 CpGs used in the final Developmental Clock for the 102 samples that were part of the brood-size manipulation experiment. Column explanation: each column represents one sample, named by its unique ID; each row represents one of the 115 CpGs.
- "Extern_Samples_AgeingClock.txt" contains the information on the post-fledging samples. Column explanation: RNR = unique ID for each individual; Age_in_Days = chronological age in days; Mol_sex = sex of the post-fledging individual; Filename = name of the methylation call file.
- "MethylationData_AgeingClock.txt" contains the methylation ratios of 35 thousand CpG sites for the 122 post-fledging individuals that were used to train the Ageing Clock. Column explanation: CpG = name of each unique CpG; each remaining column is named by the unique ID of one individual and holds that individual's methylation ratios.
Cells that contain "NA" either represent missing values or indicate that the column does not apply to that individual.
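The layout described above can be read with a short Python sketch that uses only the standard library. Several things are assumptions rather than facts from this README: that the .txt files are tab-delimited with a header row, and that the individual columns of "MethylationData_AgeingClock.txt" are named by the RNR values in "Extern_Samples_AgeingClock.txt". The helper names and the inline toy data are hypothetical. The sketch maps "NA" to a missing value, turns the CpG-per-row matrix into one record per individual, and looks up each individual's chronological age.

```python
import csv
import io


def read_table(handle, delimiter="\t"):
    """Read a delimited table into (header, rows), mapping "NA" to None.

    The tab delimiter is an assumption; change it if the files differ.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    rows = [[None if cell == "NA" else cell for cell in row] for row in reader]
    return header, rows


def cpg_rows_to_samples(header, rows):
    """Turn a CpG-per-row matrix into {individual_id: {cpg_name: ratio or None}}."""
    sample_ids = header[1:]  # first column holds the CpG name
    by_sample = {sid: {} for sid in sample_ids}
    for row in rows:
        cpg = row[0]
        for sid, value in zip(sample_ids, row[1:]):
            by_sample[sid][cpg] = None if value is None else float(value)
    return by_sample


# Made-up stand-ins for the real files (IDs, sites and values are invented).
meth_text = "CpG\tIND_A\tIND_B\nsite_1\t0.25\tNA\nsite_2\t0.80\t0.75\n"
samples_text = (
    "RNR\tAge_in_Days\tMol_sex\tFilename\n"
    "IND_A\t400\tNA\ta.txt\n"
    "IND_B\t765\tNA\tb.txt\n"
)

header, rows = read_table(io.StringIO(meth_text))
by_sample = cpg_rows_to_samples(header, rows)

s_header, s_rows = read_table(io.StringIO(samples_text))
rnr_col = s_header.index("RNR")
age_col = s_header.index("Age_in_Days")
age_by_rnr = {
    row[rnr_col]: float(row[age_col])
    for row in s_rows
    if row[age_col] is not None
}

for rnr, ratios in by_sample.items():
    print(rnr, age_by_rnr.get(rnr), ratios)
```

For the real data, replace `io.StringIO(...)` with `open("MethylationData_AgeingClock.txt", newline="")` and `open("Extern_Samples_AgeingClock.txt", newline="")`. "Extern_MethData_Train_DevelopClock.txt" already has one sample per row, so it needs no transposition.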
- "MethylationCalling_DevelopClock.R" contains the script that filters the DNA methylation (epiGBS2) data for the pre-fledging individuals. It produces the "Extern_MethData_Train_DevelopClock.txt" data sheet that was used to train the Developmental Clock model.
- "MethylationCalling_AgeingClock.R" contains the script that filters the DNA methylation (epiGBS2) data for the post-fledging individuals. It produces the "MethylationData_AgeingClock.txt" data sheet that was used to train the Ageing Clock model.
- "Extern_Clock_Analysis.pdf" contains the code used to train and test both epigenetic clock models.
- The raw genomic epiGBS2 reads are available on NCBI under BioProject PRJNA208335, with the SRA accessions SRX18523280, SRX18523281, SRX18523282, SRX18523283, SRX18523284, SRX18523285, SRX18523286, SRX18523287, SRX21758799, SRX21758800, SRX22027777, SRX22027778, SRX22027779, SRX22027780, SRX22027781, SRX26112229, SRX26112230, SRX26112231, SRX26112232, SRX26112233 and SRX26112234.
For more information on data collection and analysis, please refer to the published manuscript and the Supplementary Material. Also, please feel free to contact Kees van Oers (k.vanoers@nioo.knaw.nl) if you have any further questions or comments.
The epiGBS2 bioinformatics pipeline is available on GitHub (https://github.com/nioo-knaw/epiGBS2).
