Analysis of lifespan across Diversity Outbred mouse studies identifies multiple longevity-associated loci
Data files
Apr 21, 2025 version files 17.96 GB
- README.md (12.02 KB)
- SData_1_phenotypes.csv (146.03 KB)
- SData_2_harrison_apr.RData (3.10 GB)
- SData_3_harrison_phenos.txt (104.54 KB)
- SData_4_shock_apr.RData (1.58 GB)
- SData_5_shock_phenos.txt (47.01 KB)
- SData_6_dr_apr.RData (5.54 GB)
- SData_7_dr_phenos.txt (117.18 KB)
- SData_8_meta_apr.RData (7.74 GB)
- SData_9_meta_phenos.txt (96.01 KB)
- STable_1_logrank_contrasts_ad_lib_F.csv (316 B)
- STable_2_h2_by_group.txt (502 B)
- STable_3_QTL_summary.txt (960 B)
- STable_4_study_genes.txt (1.91 MB)
- STable_5_shock_Lifespan_chr16_pos6.671583xSex_test.txt (90 B)
Abstract
Lifespan is an integrative phenotype whose genetic architecture is likely to highlight multiple processes with high impact on health and aging. Here, we conduct a genetic meta-analysis of longevity in Diversity Outbred (DO) mice that includes 2,444 animals from three independently conducted lifespan studies. We identify six loci that contribute significantly to lifespan independently of diet and drug treatment, one of which also influences lifespan in a sex-dependent manner, as well as an additional locus with a diet-specific effect on lifespan. Collectively, these loci explain over half of the estimated heritable variation in lifespan across these studies and provide insight into the genetic architecture of lifespan in DO mice.
https://doi.org/10.5061/dryad.pnvx0k6z8
Description of the data and file structure
Corresponding author: Gary Churchill, Jackson Laboratory - gary.churchill@jax.org
Lead author: Martin N. Mullis, Calico Life Sciences LLC - martinmullis91@gmail.com
Dataset Overview
This dataset contains the information required to replicate the analyses performed in the manuscript "Analysis of lifespan across Diversity Outbred mouse studies identifies multiple longevity-associated loci". The dataset includes lifespan (phenotypic) and allele probability (genetic) data for DO mice enrolled in one of three studies conducted by the Jackson Laboratory and one study conducted jointly by the Jackson Laboratory and Calico Life Sciences, LLC.
Studies, methods, and analyses are described in detail in an accompanying manuscript, soon to be uploaded to bioRxiv and submitted for publication.
Files and variables
File: SData_1_phenotypes.csv
Description: Lifespan and covariate data for the mice in each of the four studies. Each row contains data for a single animal.
Variables
- Study: The study the animal was enrolled in: ‘Shock’, ‘Harrison’, ‘Svenson’, or ‘DRiDO’ (Dietary Restriction).
- Mouse.ID: the unique identifier assigned to each animal.
- Sex: The sex of each animal; "F" for female and "M" for male. This is one of the primary covariates used in the analysis.
- Diet: The diet of each animal; diets are described in detail in the manuscript. This is one of the primary covariates used in the analysis.
- Generation: The DO mouse generation wave from which each mouse was enrolled. This is one of the primary covariates used in the analysis.
- Lifespan: The number of days each animal was enrolled in the respective study.
- Status: The status of each animal at the time of study exit. "0" means the animal was censored, and "1" means the animal died.
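As a quick orientation to this layout, the following minimal sketch counts animals and observed deaths per study. The sample rows, mouse IDs, diet codes, and generation numbers are invented for illustration and are not taken from the dataset; only the column names come from the variable list above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows that follow the column layout of SData_1_phenotypes.csv.
# All values below are invented for illustration.
SAMPLE = """Study,Mouse.ID,Sex,Diet,Generation,Lifespan,Status
Shock,M001,F,AL,8,812,1
Shock,M002,M,AL,8,640,1
Harrison,M003,F,DR,11,1003,0
Harrison,M004,F,AL,11,775,1
"""


def summarize(handle):
    """Count animals and observed deaths per study."""
    counts = defaultdict(lambda: {"n": 0, "deaths": 0})
    for row in csv.DictReader(handle):
        entry = counts[row["Study"]]
        entry["n"] += 1
        entry["deaths"] += int(row["Status"])  # 1 = died, 0 = censored
    return dict(counts)


summary = summarize(io.StringIO(SAMPLE))
print(summary)
```

To run the same summary on the real file, `io.StringIO(SAMPLE)` could be replaced with `open("SData_1_phenotypes.csv", newline="")`, assuming the file is in the working directory.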
File: SData_3_harrison_phenos.txt
Description: Lifespan and covariate data for the Harrison study, reflecting the filtered set of phenotypes used for genetic analysis of this study.
Variables
- Study: The study each animal was enrolled in.
- Mouse.ID: The unique identifier assigned to each animal.
- Sex: The sex of each animal; "F" for female and "M" for male. This is one of the primary covariates used in the analysis.
- Diet: The diet of each animal; diets are described in detail in the manuscript. This is one of the primary covariates used in the analysis.
- Generation: The DO mouse generation wave from which each mouse was enrolled. This is one of the primary covariates used in the analysis.
- Birth.Date: The date of birth of each animal.
- Death.Date: The date of death of each animal.
- Death.Type: Type of death or cause for exiting the study ('E'). 'fd' - found dead. 'tailed'/'no tail' - whether the animal had a tail or not when found.
- Lifespan: The number of days each animal was enrolled in the respective study.
- Status: The status of each animal at the time of study exit. "0" means the animal was censored, and "1" means the animal died.
- Lifespan.Zsc: Z-score normalized lifespan data.
- Lifespan.Surv: The number of days each animal was enrolled in the respective study.
- coxphMR: mortality ratio from Cox proportional hazards model
- rankMR: mortality ratio from rank-based Cox fit
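For readers unfamiliar with Z-score normalization, a minimal sketch is shown here. It assumes standardization across all values supplied and uses the sample standard deviation; whether `Lifespan.Zsc` was computed across the whole study or within groups is not stated in this README, so treat both choices as assumptions. The lifespans are hypothetical.

```python
from statistics import mean, stdev


def zscore(values):
    """Center values on their mean and scale by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


lifespans = [600, 750, 900, 1050]  # hypothetical lifespans in days
z = zscore(lifespans)
print(z)
```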
File: SData_5_shock_phenos.txt
Description: Lifespan and covariate data for the Shock study, reflecting the filtered set of phenotypes used for genetic analysis of this study.
Variables
- Study: The study each animal was enrolled in.
- Mouse.ID: The unique identifier assigned to each animal.
- Sex: The sex of each animal; "F" for female and "M" for male. This is one of the primary covariates used in the analysis.
- Diet: The diet of each animal; diets are described in detail in the manuscript. This is one of the primary covariates used in the analysis.
- Generation: The DO mouse generation wave from which each mouse was enrolled. This is one of the primary covariates used in the analysis.
- Birth.Date: The date of birth of each animal.
- Death.Date: The date of death of each animal.
- Death.Type: Type of death or cause for exiting the study ('E'). 'fd' - found dead. 'tailed'/'no tail' - whether the animal had a tail or not when found.
- Lifespan: The number of days each animal was enrolled in the respective study.
- Status: The status of each animal at the time of study exit. "0" means the animal was censored, and "1" means the animal died.
- Lifespan.Zsc: Z-score normalized lifespan data.
- Lifespan.Surv: The number of days each animal was enrolled in the respective study.
- coxphMR: mortality ratio from Cox proportional hazards model
- rankMR: mortality ratio from rank-based Cox fit
File: SData_7_dr_phenos.txt
Description: Lifespan and covariate data for the Dietary Restriction (‘DRiDO’) study, reflecting the filtered set of phenotypes used for genetic analysis of this study.
Variables
- Mouse.ID: The unique identifier assigned to each animal.
- Generation: The DO mouse generation wave from which each mouse was enrolled. This is one of the primary covariates used in the analysis.
- Cohort: The animal cohort; a combination of generation ('G') and week ('W') data.
- JobGroup: The job group an animal was in; a combination of generation ('G'), week ('W'), and day ('D') data.
- BWDay: Which day the animal was weighed on.
- HID: Additional identification number for the animal.
- LHID: Additional identification number for the animal.
- Diet: The diet of each animal; diets are described in detail in the manuscript. This is one of the primary covariates used in the analysis.
- EN: Ear notching of the mouse (2L1R = 2 left 1 right).
- Coat: The coat color of the mouse.
- Status: The status of each animal at the time of study exit. "0" means the animal was censored, and "1" means the animal died.
- DOB: Date of birth of the animal.
- DOE: Date of exit from the study due to death or censorship.
- COE: Cause of exit from the study.
- Lifespan: The number of days each animal was enrolled in the respective study.
- Died: TRUE/FALSE. Did the animal die or not? If FALSE, the animal was censored.
File: SData_9_meta_phenos.txt
Description: Lifespan and covariate data for the meta-analysis of the DRiDO, Harrison, and Shock studies, reflecting the filtered set of phenotypes used for the three relevant studies.
Variables
- Study: The study each animal was enrolled in.
- Mouse.ID: The unique identifier assigned to each animal.
- Generation: The DO mouse generation wave from which each mouse was enrolled. This is one of the primary covariates used in the analysis.
- Diet: The diet of each animal; diets are described in detail in the manuscript. This is one of the primary covariates used in the analysis.
- Status: The status of each animal at the time of study exit. "0" means the animal was censored, and "1" means the animal died.
- Lifespan: The number of days each animal was enrolled in the respective study.
File: STable_1_logrank_contrasts_ad_lib_F.csv
Description: Test statistics for log rank tests comparing lifespan among female mice fed an ad libitum diet by study.
Variables
- test: The pair of studies being tested.
- chi_sq: The χ² statistic associated with each test.
- pval: The p-value associated with each test.
File: STable_2_h2_by_group.txt
Description: Heritability estimates and standard errors for individual treatment groups in the Harrison, Shock, and DRiDO studies.
Variables
- group: The particular group of animals. Consists of study, sex, and diet/intervention information in the format STUDY_SEX_INTERVENTION. Study can be one of either 'Shock', 'DR' (DRiDO), or 'Harrison'. Sex can be either 'F' (female) or 'M' (male). Diet and intervention information are not abbreviated.
- h2: Heritability of lifespan within the group.
- se_h2: The standard error of the heritability estimate.
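The STUDY_SEX_INTERVENTION format can be split back into its parts as sketched below. The helper name `parse_group` and the example labels are hypothetical. The split is limited to the first two underscores on the assumption that an unabbreviated intervention name may itself contain underscores.

```python
def parse_group(label):
    """Split a STUDY_SEX_INTERVENTION label into its three components."""
    study, sex, intervention = label.split("_", 2)
    if sex not in {"F", "M"}:
        raise ValueError(f"unexpected sex code in group label: {label!r}")
    return {"study": study, "sex": sex, "intervention": intervention}


example = parse_group("Harrison_F_rapamycin")  # hypothetical label
print(example)
```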
File: STable_3_QTL_summary.txt
Description: List of QTL detected in the Harrison, Shock, and DRiDO studies, the meta-analysis, and the GxEMM scans.
Variables
- Study: The particular dataset in which a QTL was detected.
- chr: The chromosome on which each QTL was detected.
- pos: The position of the lead variant at each QTL.
- lod: The LOD score of each QTL.
- ci_lo: The location of the marker denoting the 2-LOD drop prior to the peak marker.
- ci_hi: The location of the marker denoting the 2-LOD drop after the peak marker.
- threshold: The significance threshold used to call the peak marker of the QTL, which can be ‘Nominal’, ‘Permutation’, or ‘GxEMM’. ‘Nominal’ corresponds to a conservative threshold of p ≤ 1 × 10⁻⁶, ‘Permutation’ corresponds to an alpha critical threshold of ≤ 0.05 based on 1,000 permutations of the data, and ‘GxEMM’ corresponds to the previously reported threshold of p ≤ 1 × 10⁻⁴.
File: STable_4_study_genes.txt
Description: Table of genes underlying each QTL.
Variables
- study: the study in which a particular QTL was mapped.
- qtl_id: the chromosome and position at which the QTL was mapped, separated by a “_” character.
- chr: the chromosome on which the QTL was mapped.
- source: the database from which gene annotation data was collected.
- type: the annotation associated with a marker (can be ‘gene’ or ‘pseudogene’).
- start: the first physical coordinates associated with the annotation in megabases (Mb).
- stop: the last physical coordinates associated with the annotation in megabases (Mb).
- strand: The strand of DNA the annotation is on
- ID: the database ID assigned to the annotation.
- Name: the gene name associated with the annotation.
- Dbxref: lists annotation IDs in other databases.
- gene_id: the gene ID in the corresponding database.
- mgi_type: the gene type associated with the annotation in the corresponding database.
- description: functional descriptions of the protein or RNA associated with the annotation.
File: STable_5_shock_Lifespan_chr16_pos6.671583xSex_test.txt
Description: Statistical test for interaction between chromosome 16 and sex in the Shock study.
Variables
- chr: The chromosome of the locus being tested.
- pos: The position of the locus being tested.
- int_lod: The LOD score of the single-QTL interaction test.
- int_thresh: The permutation threshold (1,000 permutations; alpha <= 0.05) of the single-QTL test.
- significant: Whether the LOD of the test surpasses the significance threshold (LOD > threshold).
File: SData_4_shock_apr.RData
Description: Eight-state allele probabilities from the Shock study. For use with R/qtl2.
File: SData_2_harrison_apr.RData
Description: Eight-state allele probabilities from the Harrison study. For use with R/qtl2.
File: SData_6_dr_apr.RData
Description: Eight-state allele probabilities from the DRiDO study. For use with R/qtl2.
File: SData_8_meta_apr.RData
Description: Eight-state allele probabilities from the Harrison, Shock, and DRiDO studies. Here, the probabilities are computed at a set of 69,005 pseudomarkers rather than real markers from a MUGA array. For use with R/qtl2.
Code/software
See methods of the associated manuscript for the software and versions used in this analysis.
Access information
Other publicly accessible locations of the data:
- Data from the DRiDO study (https://doi.org/10.1038/s41586-024-08026-3) can be found at the Jackson Laboratory QTL Viewer website: https://qtlviewer.jax.org/
Study Designs
Dietary Restriction (DRiDO):
This study is extensively described elsewhere (Di Francesco et al. 2023). Briefly, female DO mice were received at ~4 weeks of age in 12 waves from March 2016 through November 2017. Mice were housed in groups of 8 in single large-format ventilated pens with nestlets, biotubes, and gnawing blocks. Mice were randomized to one of five dietary interventions which were initiated for the surviving mice at 6 months of age: ad libitum (AL; n = 188), 1 day per week fasting (1D; n = 188), 2 days per week fasting (2D; n = 190), 20% caloric restriction (20; 2.75 g/mouse/day; n = 189), and 40% caloric restriction (40; 2.06 g/mouse/day; n = 182). Mice were extensively phenotyped as described (Di Francesco et al. 2023) and maintained until they died naturally. The mouse room was on a 12/12 hour light/dark schedule from 6:00 am to 6:00 pm and kept at 73 ± 2 °F.
Harrison:
Founder DO mice (167 retired breeder pairs) were obtained from the Jackson Laboratory and female offspring were accumulated for the lifespan study over 5 months. All mice were microchipped at 4 weeks of age. Mice were housed 22 per large-format double pens connected by a tunnel on an open-air rack. All 22 mice per pen were from different breeder pairs. Pens had pine shaving bedding with acidified water. Every week, one pen of the connected pair would be changed. Mice would be herded into one pen and the tunnel blocked off. The dirty pen would be removed, and a new clean pen attached with fresh water and grain. The ad libitum control mice (n = 349) were on a non-irradiated diet (5LG6, or “5S84”, TestDiet, Purina) from weaning. The diet restricted (n = 335) mice received 2.2 g/day/mouse of ground non-irradiated diet via modified fish feeders that were programmed to dump the ground diet onto the floor of the cage between 6-7 pm after lights were off. Modified feeders were restocked every 7 days. Any grain left in the feeders after 7 days was dumped on the cage floor. Proper feeder performance was indicated by a weighted string that was wound around a screw when the feeders dumped food. Diet restriction began at 4 weeks of age after being microchipped. An additional 339 mice received non-irradiated diet until they were 16 months of age, whereupon they started on 5LG6 diet with 142 ppm encapsulated rapamycin (Rapamycin Holdings, actual concentration of rapamycin in diet is 14 ppm, TestDiet, Purina). Mice were maintained until they died naturally. The mouse room was on a 12/12 hour light/dark schedule from 6:00 am to 6:00 pm and kept at 73 ± 2 °F.
Svenson:
We obtained female DO mice from the Jackson Laboratory breeding colony at ~4 weeks of age. Mice were obtained in eight waves over the course of 1 year and enrolled by randomization to dietary intervention protocols. Mice were housed 8 per group in single large-format pens. Interventions were implemented as described for the Harrison study with a few differences indicated here. The ad libitum fed control mice (n = 319) were on a 4% irradiated diet (5K52, aka “5KOG”, TestDiet, Purina). The diet restricted mice (n = 316) received 2.2 g/day/mouse of ground 4% irradiated diet. DR mice were fed at ~7 am daily and food was placed directly onto the bottom of the pen by a technician. On Friday the DR mice received a triple feeding (6.6 g/mouse) and were fed again on Monday morning. A third group of mice (n = 317) were maintained on the ad libitum protocol until 16 months of age and were then switched to rapamycin diet as described above. Mice experienced minimal handling (monthly body weights and weekly pen changes) and were maintained until they died naturally. The mouse room was on a 12/12 hour light/dark schedule from 6:00 am to 6:00 pm and kept at 70 ± 2 °F.
For this study, DNA samples were collected for genotyping on the MUGA array, as described for other studies below. However, irregularities with sample labeling/handling made us question the integrity of our ID matches between mice and samples. We performed quality-control assessment by comparing genotype-predicted versus recorded coat colors across the mice (Silvers 2012) and confirmed extensive sample mismatches (data not shown). We therefore excluded these data from our genetic analyses.
Shock:
We obtained female (n = 244) and male (n = 240) DO mice from the Jackson Laboratory breeding colony at ~4 weeks of age. Mice were obtained in five waves from June 2011 through August 2012. Mice were housed in single-sex groups of 5 in standard ventilated duplex pens. All mice were fed ad libitum on 6% sterilized grain (5K52, aka “5KOG”, TestDiet, Purina). Mice experienced minimal handling (body weights and other non-invasive procedures). At 6, 12, and 18 months we obtained 3 × 100 µL retro-orbital blood draws, with 2 weeks recovery time between each. Mice were maintained until they died naturally. The mouse room was on a 12/12 hour light/dark schedule from 6:00 am to 6:00 pm and kept at 70 ± 2 °F.
All procedures used in these studies were reviewed and approved by the Jackson Laboratory Animal Care and Use Committee.
Genotyping
Genotypes for all studies reported here were obtained using the mouse universal genotyping array (MUGA) (Morgan et al. 2016). DNA was isolated from tail tips using standard methods and shipped to Neogen Genomics (Lincoln, NE, USA) for analysis. Samples were genotyped using the MUGA (Harrison), MegaMUGA (Shock), or GigaMUGA (DRiDO) genotyping arrays. Founder haplotypes were reconstructed using the R/qtl2 software and samples with call rates at or above 90% were retained for analysis. Genome coordinates were from mouse genome GRCm39 and gene locations were taken from the Mouse Genome Informatics databases (Blake et al. 2021).
Data analysis
Survival analysis:
We compared survival among cohorts of animals, as well as between experimental groups within studies. To do so, we plotted Kaplan-Meier curves and tested the equivalence of survival distributions with log-rank tests. These included overall tests (across cohorts and within each cohort) as well as pairwise comparisons between each experimental group and its respective within-study control group (e.g., comparing rapamycin treatment to ad libitum within the Harrison study). p-values are reported with no correction for multiple comparisons, and are considered significant at p < 0.05. Median lifespan was estimated in each cohort as well as within each experimental group. The effects of dietary interventions and/or sex were estimated via Cox proportional hazards regression analysis, and are reported as hazard ratios with 95% confidence intervals. p-values for these models are likewise reported without correction for multiple comparisons and are considered significant at p < 0.05. Survival analysis was conducted using the “survival” (Therneau and Grambsch 2000; Therneau 2024) package in R and plotted via the “ggsurvfit” package (Sjoberg et al. 2024). Mortality doubling times and baseline hazards were estimated, beginning at the time of intervention, from a Gompertz log-linear hazard model via the “flexsurv” package in R (Jackson 2016), and are reported with 95% confidence intervals and as percentage change relative to female mice on an ad libitum diet.
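To make the Kaplan-Meier step concrete, here is a minimal standard-library sketch of the product-limit estimator and the median survival time read from it. It is an illustration only, not the code used for the manuscript (which relied on the R “survival” package). The function names and the toy lifespans are hypothetical, and the status coding (1 = died, 0 = censored) follows the Status variable described above.

```python
from collections import Counter


def kaplan_meier(times, status):
    """Return [(time, survival)] at each distinct death time.

    status uses the coding from the phenotype files: 1 = died, 0 = censored.
    """
    deaths = Counter(t for t, s in zip(times, status) if s == 1)
    exits = Counter(times)  # all exits (deaths and censorings) per time point
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk  # product-limit update at a death time
            curve.append((t, surv))
        at_risk -= exits[t]  # animals leaving at t are no longer at risk
    return curve


def median_survival(curve):
    """First death time at which estimated survival is at or below 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached (heavy censoring)


# Hypothetical lifespans in days with two censored animals.
times = [500, 600, 600, 700, 800, 900]
status = [1, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, status)
median = median_survival(curve)
print(curve, median)
```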
Additive whole-genome scans:
All genetic analysis was conducted using the “rqtl2” package in R (R Core Team 2024; Broman et al. 2019). Whole-genome scans for lifespan QTL were carried out via the scan1() function using a mixed effects model in which lifespan was regressed on 8-state allele probabilities for each individual in a dataset. Within the Dietary Restriction and Harrison studies, dietary intervention and DO generation were included as additive covariates. In the Shock study, Sex and DO generation were included as additive covariates. In the meta-analysis, Study, Diet, Sex, and DO generation were included as additive covariates. For each genome-wide scan, 1000 permutations of the data were performed in which phenotypes were randomized and a whole-genome scan was run. The maximum LOD score observed in each permuted scan was recorded, and the 95th percentile of the distribution of 1000 maximum LOD scores was used as the significance threshold (ɑ = 0.05). QTL with LOD scores greater than this threshold were considered significant at our permutation-based threshold. In addition to this significance level, a significance level of LOD ≥ 6 was also used to identify loci contributing to variance in lifespan. While less conservative, this threshold is more stringent than a previously reported method (Wright et al. 2022). We report 2 LOD support intervals (‘2LOD SI’), corresponding to a 2 LOD drop around each peak position, about each peak marker identified in whole-genome scans.
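The permutation procedure described above can be sketched in a few lines. This toy version is an assumption-laden simplification: it scores each marker with an ordinary single-predictor regression on a numeric genotype code rather than the mixed model on 8-state allele probabilities used in the analysis, and all data are simulated. The function names are hypothetical. It uses 200 permutations to keep the run short, whereas the analysis used 1000.

```python
import math
import random


def lod_score(x, y):
    """LOD for a single-predictor linear regression: -(n/2) * log10(1 - r^2)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = min(sxy * sxy / (sxx * syy), 1 - 1e-12)  # guard against log10(0)
    return -(n / 2) * math.log10(1 - r2)


def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Quantile (1 - alpha) of the maximum LOD across permuted genome scans."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        max_lods.append(max(lod_score(g, shuffled) for g in genotypes))
    max_lods.sort()
    return max_lods[math.ceil((1 - alpha) * n_perm) - 1]


# Simulated toy data: 20 markers scored 0/1/2 in 50 animals.
data_rng = random.Random(42)
genotypes = [[data_rng.choice([0, 1, 2]) for _ in range(50)] for _ in range(20)]
phenotype = [data_rng.gauss(800, 150) for _ in range(50)]

threshold = permutation_threshold(genotypes, phenotype, n_perm=200, seed=1)
print(threshold)
```

Taking the maximum LOD per permuted scan, rather than pooling all marker-level LODs, is what makes the resulting threshold genome-wide.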
Forward regression analysis:
In the meta-analysis, forward regression analysis was performed to account for the effects of genome-wide significant QTL when searching for additional loci influencing lifespan. This was done via the scan1() function using a mixed effects model including dietary intervention, sex, and DO generation as additive effects and kinship as a random effect. In addition to these covariates, previous QTL identified at a genome-wide significance level were included in the model as additive effects. QTL were encoded as numeric variables representing the genotype state at the marker with the highest LOD score as reported by association mapping. Only QTL reaching permutation-based significance thresholds were included in the model as additive covariates.
Effect size estimation and percent variance explained:
Best linear unbiased predictors (BLUPs) and corresponding 95% confidence intervals were computed for all QTL using the scan1blups() function in “rqtl2” using the additive covariates listed above. Phenotypic variance explained by each QTL was calculated using the following formula:
$$1 - 10^{-(2/n) \cdot \mathrm{LOD}}$$
Where n is the number of samples in a particular dataset, and LOD corresponds to the LOD score of the peak marker at each QTL.
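As a worked illustration of this formula, the short sketch below evaluates it for a hypothetical LOD score of 8. The sample size of 2,444 is the total quoted in the abstract and is used here only as an assumption; the filtered sample size of a given scan may differ. The function name is hypothetical.

```python
def variance_explained(lod, n):
    """Proportion of phenotypic variance explained: 1 - 10^(-(2/n) * LOD)."""
    return 1.0 - 10.0 ** (-(2.0 / n) * lod)


# Hypothetical LOD score; n taken from the abstract as an assumption.
pve = variance_explained(lod=8.0, n=2444)
print(f"{100 * pve:.2f}% of phenotypic variance")
```

Because n appears in the exponent, the same LOD score corresponds to a smaller share of variance in a larger dataset.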
Variant association / fine mapping:
Fine mapping was conducted within a 2LOD drop of the peak position associated with each QTL. Fine mapping was performed using the scan1snps() function in “rqtl2” using the same additive covariates listed in the Additive whole-genome scans section above. Variant and gene SQLite datasets used in this analysis are available at the “rqtl2” user guide website: https://kbroman.org/qtl2/assets/vignettes/user_guide.html.
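The 2-LOD drop that bounds each fine-mapping region can be illustrated as follows. This is a simplified sketch, assuming the interval extends outward from the peak to the first marker whose LOD falls at or below the peak LOD minus 2 (or to the end of the chromosome); the interval function in R/qtl2 may handle boundaries and multiple peaks differently. The function name, positions, and LOD values are hypothetical.

```python
def lod_support_interval(positions, lods, drop=2.0):
    """Return (lower, peak, upper) positions for a LOD-drop support interval."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    lo = peak
    while lo > 0 and lods[lo] > cutoff:
        lo -= 1  # walk left until the LOD has dropped to the cutoff
    hi = peak
    while hi < len(lods) - 1 and lods[hi] > cutoff:
        hi += 1  # walk right until the LOD has dropped to the cutoff
    return positions[lo], positions[peak], positions[hi]


# Hypothetical LOD profile along one chromosome (positions in Mb).
positions = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
lods = [1.0, 3.5, 6.0, 8.2, 7.0, 5.9, 2.0]
interval = lod_support_interval(positions, lods)
print(interval)
```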
Single-QTL models for diet- and sex-specific loci:
Within individual studies, QTL were tested for interaction with experimental factors unique to those studies. In the Shock cohort, QTL were tested for interaction with sex, while in other cohorts QTL were tested for interactions with one or more of the dietary interventions. Interaction tests were not conducted genome-wide using “rqtl2”, since this package does not fit interactions as random effects. For each QTL, tests were run using the 8-state allele probabilities at peak positions identified in whole-genome scans using the fit1() function in “rqtl2”. To assess significance, two models were run: an additive model in which experimental factors (sex, dietary intervention) and DO generation were included as additive covariates and an interaction model that included the same additive covariates with an additional interaction term corresponding to the experimental factor being tested. The reported LOD is the LOD of the interaction model minus the LOD of the additive model. To establish significance, 1000 permutations were run at each QTL in which phenotypes were randomized before running the additive and interaction models. After ordering the 1000 resulting LOD scores, the 95th percentile was chosen as the significance threshold (ɑ = 0.05). Interactions between QTL and experimental conditions were considered significant if their LOD was greater than the ɑ = 0.05 threshold. Interaction effects were plotted as residual lifespan values as a function of sex after correcting for additive covariates.
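The interaction test above reduces to a difference of two LOD scores compared against a permutation-derived threshold. The sketch below shows that arithmetic for ordinary linear models, where LOD = (n/2) · log10(RSS_null / RSS_model); this is an assumption made for illustration and does not reproduce the fit1() computation. The function names, residual sums of squares, and permuted LOD values are hypothetical.

```python
import math


def lod_from_rss(rss_null, rss_model, n):
    """LOD of a linear model relative to the null: (n/2) * log10(RSS0 / RSS1)."""
    return (n / 2) * math.log10(rss_null / rss_model)


def interaction_lod(rss_null, rss_additive, rss_interaction, n):
    """LOD of the interaction model minus LOD of the additive model."""
    return (lod_from_rss(rss_null, rss_interaction, n)
            - lod_from_rss(rss_null, rss_additive, n))


def is_significant(observed, permuted_lods, alpha=0.05):
    """Compare an observed LOD with the (1 - alpha) quantile of permuted LODs."""
    ordered = sorted(permuted_lods)
    threshold = ordered[math.ceil((1 - alpha) * len(ordered)) - 1]
    return observed > threshold, threshold


# Hypothetical residual sums of squares for n = 50 animals.
observed = interaction_lod(rss_null=100.0, rss_additive=90.0,
                           rss_interaction=80.0, n=50)
# Hypothetical interaction LODs from 100 permutations.
permuted = [i / 100 for i in range(100)]
significant, threshold = is_significant(observed, permuted)
print(observed, threshold, significant)
```

Note that the null-model term cancels in the difference, so the interaction LOD depends only on how much the interaction term reduces the residual sum of squares relative to the additive model.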
Genome-wide scans for diet- and sex-specific loci:
Gene by environment mixed effects models (GxEMM) were run using the ‘do-qtl’ software package in Python version 3.8.16 (Wright et al. 2022). Study, sex, and diet were included in the model as fixed effects, while diet and kinship were supplied as random effects. P values were computed based on 1,000 permutations of the data. A significance threshold of p ≤ 1 × 10⁻⁴ was used to define loci as statistically significant (Wright et al. 2022).
