Data from: Coevolution promotes the coexistence of Tasmanian devils and a fatal, transmissible cancer
Data files
Oct 25, 2024 version files (56.25 MB total)
- devil.FOI.phenotype.Calculations.txt (6.57 KB)
- Output_Files.zip (56.22 MB)
- README.md (15.63 KB)
Abstract
Emerging infectious diseases threaten natural populations, and data-driven modeling is critical for predicting population dynamics. Despite the importance of integrating ecology and evolution in models of host-pathogen dynamics, there are few wild populations for which long-term ecological datasets have been coupled with genome-scale data. Tasmanian devil (Sarcophilus harrisii) populations have declined range-wide due to devil facial tumor disease (DFTD), a fatal transmissible cancer. Although early ecological models predicted imminent devil extinction, diseased devil populations persist at low densities, and recent ecological models predict long-term devil persistence. Substantial evidence supports evolution of both devils and DFTD, suggesting coevolution may also influence continued devil persistence. Thus, we developed an individual-based, eco-evolutionary model of devil-DFTD coevolution parameterized with nearly two decades of devil demography, DFTD epidemiology, and genome-wide association studies. We characterized potential devil-DFTD coevolutionary outcomes and predicted the effects of coevolution on devil persistence and devil-DFTD coexistence. We found a high probability of devil persistence over 50 devil generations (100 years) and a higher likelihood of devil-DFTD coexistence, with greater devil recovery, than predicted by previous ecological models. These novel results add to growing evidence for long-term devil persistence and highlight the importance of eco-evolutionary modeling for emerging infectious diseases.
Input Data File / script
devil.FOI.phenotype.txt - Identical to devil_GEMMA_pheno.txt in the supplementary information of Gallinson et al. 2024. Intergenomic signatures of coevolution between Tasmanian devils and an infectious cancer. PNAS. Accessible at https://github.com/D-gallinson/Devil-DFTD-FOI-Coevolution.
Devil_FOI_Calculation.R
Description: Calculates upper and lower force of infection (FOI) bounds used in the third step of Parameter_Filtering_Submission.R.
Dependencies: devil.FOI.phenotype.txt
Output Files: None
Simulation scripts
DevilCoevolutionSims_1.0.tar.gz - C++ simulation code, converted into an R package. This package must be installed in R prior to running any of the other scripts.
Devil_Evolution_Functions_1.4.R - R code that contains all of the primary simulation functions and that loads the (previously installed) DevilCoevolutionSims package. This script is called in all subsequent scripts. Package Dependencies: Rcpp, DevilCoevolutionSims. Output Files: None
Parameter_Filtering_Submission_Fixed.R - Script for carrying out the parameter selection process. Dependencies: methods, matrixStats, MASS, lhs, parallel, randcorr. Output Files: parameters_round_2e-10.csv, Demo_Accept-10.csv, R2_Accept-10.csv, parameters_round_3e-10.csv.
Debugging Files: ErrorRecord-10.csv, ErrorReport-10.csv (Used only for debugging purposes and not included in data repository)
Evolutionary_Outcomes_Sims_Driver_Submission.R
Description: Runs simulations for coevolution in each single trait-pair along a 21 x 21 grid of initial devil genetic variance and DFTD mutation variance. Output files are denoted with the suffix X_grid-1, where X = 1 indicates coevolution in DFTD transmissibility and devil resistance to infection, X = 2 indicates coevolution in tumor growth rate and devil resistance to tumor growth, and X = 3 indicates coevolution in DFTD virulence and devil tolerance.
Dependencies: matrixStats, MASS, lhs, parallel, Devil_Evolution_Functions_Submission.R, parameters_round_2e-10.csv, Demo_Accept-10.csv, R2_Accept-10.csv
Output Files: PopExtinct1_grid-1.csv, PopExtinct2_grid-1.csv, PopExtinct3_grid-1.csv, TumorExtinct1_grid-1.csv, TumorExtinct2_grid-1.csv, TumorExtinct3_grid-1.csv, Coexist1_grid-1.csv, Coexist2_grid-1.csv, Coexist3_grid-1.csv, StopTimes1_grid-1.csv, StopTimes2_grid-1.csv, StopTimes3_grid-1.csv.
Evolutionary_Outcomes_Selected_Driver_Submission.R
Description: Runs 1000 simulations for each selected parameter combination given evolution in each single trait-pair. Output files are denoted with the suffix X_grid-s-1, where X = 1 indicates coevolution in DFTD transmissibility and devil resistance to infection, X = 2 indicates coevolution in tumor growth rate and devil resistance to tumor growth, and X = 3 indicates coevolution in DFTD virulence and devil tolerance.
Dependencies: matrixStats, MASS, lhs, parallel, Devil_Evolution_Functions_Submission.R, parameters_round_2e-10.csv, Demo_Accept-10.csv, R2_Accept-10.csv
Output Files: PopExtinct1_grid-s-1.csv, PopExtinct2_grid-s-1.csv, PopExtinct3_grid-s-1.csv, TumorExtinct1_grid-s-1.csv, TumorExtinct2_grid-s-1.csv, TumorExtinct3_grid-s-1.csv, Coexist1_grid-s-1.csv, Coexist2_grid-s-1.csv, Coexist3_grid-s-1.csv, StopTimes1_grid-s-1.csv, StopTimes2_grid-s-1.csv, StopTimes3_grid-s-1.csv
Evolutionary_Outcomes_Selected_Driver_MV_Submission.R
Description: Runs 1000 simulations for each selected parameter combination given no evolution (outputs with the suffix -nev), coevolution in all trait-pairs (outputs with the suffix _Full-s), or coevolution in each combination of two trait-pairs (outputs with the suffix X-loo, in the sense of ‘leave one out’, where X = 1, 2, or 3 denotes which trait-pair was not permitted to coevolve).
Dependencies: matrixStats, MASS, lhs, parallel, Devil_Evolution_Functions_Submission.R, parameters_round_2e-10.csv, Demo_Accept-10.csv, R2_Accept-10.csv
Outputs: PopExtinct_Full-s.csv, PopExtinct1-loo.csv, PopExtinct2-loo.csv, PopExtinct3-loo.csv, PopExtinct-nev.csv, TumorExtinct_Full-s.csv, TumorExtinct1-loo.csv, TumorExtinct2-loo.csv, TumorExtinct3-loo.csv, TumorExtinct-nev.csv, Coexist_Full-s.csv, Coexist1-loo.csv, Coexist2-loo.csv, Coexist3-loo.csv, Coexist-nev.csv, StopTimes_Full-s.csv, StopTimes1-loo.csv, StopTimes2-loo.csv, StopTimes3-loo.csv, StopTimes-nev.csv, FinalSize_Full-s.csv, FinalSize1-loo.csv, FinalSize2-loo.csv, FinalSize3-loo.csv, FinalSize-nev.csv, InitialSize_Full-s.csv, InitialSize1-loo.csv, InitialSize2-loo.csv, InitialSize3-loo.csv, InitialSize-nev.csv.
Evolutionary_Dynamic_Sims_Driver_Submission.R
Description: Runs 1000 simulations at the average selected parameters (i.e., each parameter is set to the average of its selected values) for coevolution in each single trait-pair (outputs with the suffix X_mid where X = 1, 2, or 3) and coevolution in all trait-pairs (outputs with the suffix _Full_mid). The simulation functions used in this script return additional output variables described in the Output Data Files section of this README document.
Dependencies: matrixStats, MASS, lhs, parallel, Devil_Evolution_Functions_Submission.R, parameters_round_2e-10.csv, Demo_Accept-10.csv, R2_Accept-10.csv
Outputs: PopExtinct_Full_mid.csv, TumorExtinct_Full_mid.csv, Coexist_Full_mid.csv, StopTimes_Full_mid.csv, PopMat_Full_mid.csv, PrevMat_Full_mid.csv, MuGeno_Full_mid.csv, SigmaGeno_Full_mid.csv, MuGenoT_Full_mid.csv, SigmaGenoT_Full_mid.csv, MuGenoN_Full_mid.csv, SigmaGenoN_Full_mid.csv, PopExtinct1_mid.csv, TumorExtinct1_mid.csv, Coexist1_mid.csv, StopTimes1_mid.csv, PopMat1_mid.csv, PrevMat1_mid.csv, MuGeno1_mid.csv, SigmaGeno1_mid.csv, MuGenoT1_mid.csv, SigmaGenoT1_mid.csv, MuGenoN1_mid.csv, SigmaGenoN1_mid.csv, PopExtinct2_mid.csv, TumorExtinct2_mid.csv, Coexist2_mid.csv, StopTimes2_mid.csv, PopMat2_mid.csv, PrevMat2_mid.csv, MuGeno2_mid.csv, SigmaGeno2_mid.csv, MuGenoT2_mid.csv, SigmaGenoT2_mid.csv, MuGenoN2_mid.csv, SigmaGenoN2_mid.csv, PopExtinct3_mid.csv, TumorExtinct3_mid.csv, Coexist3_mid.csv, StopTimes3_mid.csv, PopMat3_mid.csv, PrevMat3_mid.csv, MuGeno3_mid.csv, SigmaGeno3_mid.csv, MuGenoT3_mid.csv, SigmaGenoT3_mid.csv, MuGenoN3_mid.csv, SigmaGenoN3_mid.csv.
Results scripts
Method_Figures_Submission_Revisions.R - Generates the plots in panel C of figure 1 (all .tiff outputs) and generates figure 2 of the main manuscript (Methods-Interpretation-mod.eps).
Dependencies: matrixStats, MASS, lhs, parallel, ggplot2, reshape2, directlabels, Devil_Evolution_Functions_Submission.R, parameters_round_2e-10.csv, Demo_Accept-10.csv, R2_Accept-10.csv
Outputs: Methods1-Tumor_Growth.tiff, Methods1-Survival.tiff, Methods1-Infection.tiff (insets for Fig 1), Methods-Interpretation-mod.eps (Fig 2)
Combined-Results-Code-Final.R - Generates all results figures and supplementary figures.
Dependencies: matrixStats, MASS, lhs, parallel, RColorBrewer, lattice, gridExtra, latticeExtra, ggplot2, reshape2, patchwork, cowplot, ggdist, ppcor, Devil_Evolution_Functions_Submission.R, all output files.
Outputs: Results-Contour_Probs_Combined.eps (Fig 3), Results-Contour_Times_Combined.eps (Fig S5), Results-Contour_Selected.eps (Fig 4), Results-Contour_Selected_S8.eps (Fig S8), Results-Violin_loo.eps (Fig 6), Results-Violin_single.eps (Fig S9), Results-Violin_recovery.eps (Fig 7), Results-Scatter_recovery.eps (Fig S10), Results-Dynamics.tiff (Fig 5), Results-Dynamics_S1.tiff (Fig S6), Results-Dynamics_S2.tiff (Fig S7), Supplement-Hist-1.eps (Fig S1), Supplement-Hist-2.eps (Fig S2), Supplement-Hist-3.eps (Fig S3), Supplement-Hist-4.eps (Fig S4).
Data Files
parameters_round_2e-10.csv - Selected parameter values after “round 2” of parameter selection. These parameter combinations meet all selection criteria except those that require full population-level simulations. Each column holds the values of one parameter, and each row gives a single complete set of parameter values.
The parameters are, in order:
1. Subadult excess death rate (per week)
2. Offspring survival probability
3. Female probability of mating
4. Density-dependent death rate (per week per individual)
5. Density-independent death rate (per week)
6. Initial tumor growth rate (unitless)
7. Tumor growth heterogeneity (unitless)
8. Maximum tumor load (cm^3)
9. Tumor latent period (weeks)
10. Baseline critical tumor size for mortality (relative to maximum tumor load, unitless)
11. Tumor mortality shape parameter (unitless)
12. Minimum survival probability (unitless)
13. Maximum transmission rate (per individual)
14. Critical tumor size for transmission (relative to maximum tumor load, unitless)
15. Transmission shape parameter (unitless)
16. Initial devil genetic variance in resistance to infection (this and all subsequent variances and covariances are unitless)
17. Initial devil genetic variance in resistance to tumor growth
18. Initial devil genetic variance in tolerance
19. Environmental variance in resistance to infection
20. Environmental variance in resistance to tumor growth
21. Environmental variance in tolerance
22. Tumor mutation variance in infectivity
23. Tumor mutation variance in growth rate
24. Tumor mutation variance in virulence
25. Initial devil genetic correlation between resistance to infection and to tumor growth
26. Initial devil genetic correlation between resistance to infection and tolerance
27. Initial devil genetic correlation between resistance to tumor growth and tolerance
28. Tumor mutation correlation between infectivity and growth rate
29. Tumor mutation correlation between infectivity and virulence
30. Tumor mutation correlation between growth rate and virulence
31. Subadult resistance factor (unitless)
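For readers who load the parameter files programmatically, the column order above can be attached to each row as labels. The following Python sketch is not part of the repository. The short names in PARAMETER_NAMES and the helper label_parameter_row are hypothetical, chosen here only for illustration, and the sketch assumes that each row has already been read as a list of 31 numeric strings.

```python
# Hypothetical short labels for the 31 parameter columns, in the order
# listed in this README. The names themselves are illustrative only.
PARAMETER_NAMES = [
    "subadult_excess_death_rate",       # 1, per week
    "offspring_survival_prob",          # 2
    "female_mating_prob",               # 3
    "density_dependent_death_rate",     # 4, per week per individual
    "density_independent_death_rate",   # 5, per week
    "initial_tumor_growth_rate",        # 6
    "tumor_growth_heterogeneity",       # 7
    "max_tumor_load",                   # 8, cm^3
    "tumor_latent_period",              # 9, weeks
    "critical_tumor_size_mortality",    # 10
    "tumor_mortality_shape",            # 11
    "min_survival_prob",                # 12
    "max_transmission_rate",            # 13, per individual
    "critical_tumor_size_transmission", # 14
    "transmission_shape",               # 15
    "devil_var_resist_infection",       # 16
    "devil_var_resist_growth",          # 17
    "devil_var_tolerance",              # 18
    "env_var_resist_infection",         # 19
    "env_var_resist_growth",            # 20
    "env_var_tolerance",                # 21
    "tumor_mut_var_infectivity",        # 22
    "tumor_mut_var_growth",             # 23
    "tumor_mut_var_virulence",          # 24
    "devil_cor_infection_growth",       # 25
    "devil_cor_infection_tolerance",    # 26
    "devil_cor_growth_tolerance",       # 27
    "tumor_cor_infectivity_growth",     # 28
    "tumor_cor_infectivity_virulence",  # 29
    "tumor_cor_growth_virulence",       # 30
    "subadult_resistance_factor",       # 31
]


def label_parameter_row(values):
    """Pair one row of parameter values with the column labels above."""
    if len(values) != len(PARAMETER_NAMES):
        raise ValueError(
            f"expected {len(PARAMETER_NAMES)} values, got {len(values)}"
        )
    return dict(zip(PARAMETER_NAMES, (float(v) for v in values)))
```

Raising an error on a length mismatch is a deliberate choice: a file written with an extra row-name column would otherwise shift every label by one position without any warning.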
Demo_Accept-10.csv - Each row corresponds to a single parameter combination. Column 1 gives the average proportional decrease in devil population size from years 5-15 over 1000 simulations. Column 2 gives the average DFTD prevalence from years 5-15 over 1000 simulations. Column 3 gives the 25th percentile of time to infection (in weeks) for devils born in years 5-15, averaged over 1000 simulations. Column 4 gives the 50th percentile of time to infection (in weeks) for devils born in years 5-15, averaged over 1000 simulations. Column 5 gives the 75th percentile of time to infection (in weeks) for devils born in years 5-15, averaged over 1000 simulations. NA is given wherever a quantity could not be computed for any of the 1000 simulations.
R2_Accept-10.csv - Each row corresponds to a single parameter combination. Column 1 gives the average proportion of variance in time to infection explained by devil genotype for infected devils born in years 5-15 over 1000 simulations. Column 2 gives the average proportion of variance in time to infection explained by DFTD genotype for infected devils born in years 5-15 over 1000 simulations. Column 3 gives the average proportion of variance in time to infection explained by the interaction between devil and DFTD genotype for infected devils born in years 5-15 over 1000 simulations. Column 4 gives the average proportion of variance in time from infection to death explained by devil genotype for infected devils born in years 5-15 over 1000 simulations. Column 5 gives the average proportion of variance in infection probability (case-control) for devils born in years 5-15 over 1000 simulations. Column 6 gives the average proportion of variance in time to infection explained by devil genotype, for infected devils born in years 5-15 over 1000 simulations. NA is given wherever a quantity could not be computed for any of the 1000 simulations.
parameters_round_3e-10.csv - Parameter combinations that meet all selection criteria. Each column holds the values of one parameter, and each row gives a single complete set of parameter values.
Rather than describing each remaining data file individually, we describe categories of data files that contain similar types of data. In general, data file names follow this convention: 1. A text name denoting the type of data contained in the file, 2. an appended number and text string denoting which devil-DFTD trait-pairs coevolved in the simulations that produced the data file, as well as the specific script that generated the data. Item 2 is elaborated on in the description of each script. The values for item 1 are described below.
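The naming convention can be made concrete with a small parser. The Python sketch below is illustrative and not part of the repository. The function parse_output_name and the regular expression are assumptions built only from the file names listed in this README, and names that deviate from those patterns are rejected rather than guessed at.

```python
import re

# Data-type prefixes named in this README (item 1 of the convention),
# sorted longest first so that "MuGenoT" is tried before "MuGeno".
DATA_TYPES = sorted(
    [
        "Coexist", "PopExtinct", "TumorExtinct", "StopTimes",
        "FinalSize", "InitialSize", "PopMat", "PrevMat",
        "MuGeno", "MuGenoT", "MuGenoN",
        "SigmaGeno", "SigmaGenoT", "SigmaGenoN",
    ],
    key=len,
    reverse=True,
)

# <data type><optional trait-pair number 1-3><suffix starting with _ or ->.csv
FILE_PATTERN = re.compile(
    r"^(?P<data_type>" + "|".join(DATA_TYPES) + r")"
    r"(?P<trait_pair>[123]?)"
    r"(?P<suffix>[_-].+)\.csv$"
)


def parse_output_name(filename):
    """Split an output file name into (data type, trait-pair number, suffix).

    The trait-pair number is returned as a string and is empty when the
    file name carries no number (e.g. the _Full-s and -nev outputs).
    """
    match = FILE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized output file name: {filename}")
    return (
        match.group("data_type"),
        match.group("trait_pair"),
        match.group("suffix"),
    )
```

For example, parse_output_name("StopTimes2_grid-s-1.csv") returns ("StopTimes", "2", "_grid-s-1"). Note that the meaning of the number depends on the suffix: for -loo files it identifies the trait-pair that was not permitted to coevolve.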
The first class of data files records a single value for each simulation x parameter combination. Each column represents a different parameter combination and each row within a column represents a single simulation.
Coexist - Each entry is either TRUE, indicating that both devils and DFTD survived to the end of the simulation, or FALSE, indicating that either devils or DFTD became extirpated during the simulation.
PopExtinct - Each entry is either TRUE, indicating that the devil population survived to the end of the simulation, or FALSE, indicating that devils were extirpated during the simulation.
StopTimes - Each entry gives the time (in weeks) at which either devils or DFTD became extirpated and the simulation halted. NA if devils and DFTD coexisted to the end of the simulation.
TumorExtinct - Each entry is either TRUE, indicating that DFTD survived to the end of the simulation, or FALSE, indicating that DFTD was extirpated during the simulation.
FinalSize - Each entry gives the final devil population size at the end of the simulation. NA is given if the devil population was extirpated before the end of the simulation.
InitialSize - Each entry gives the devil population size at the start of the simulation.
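Because each column of these files is a parameter combination and each row a simulation, a column-wise proportion of TRUE entries gives an outcome frequency per parameter combination, following the TRUE/FALSE encoding described above. The Python sketch below is illustrative and not part of the repository. The function name fraction_true_by_column is hypothetical, and the sketch assumes that the CSV has a header row and that a leading row-name column, if present, is flagged by the caller. The example data are made up.

```python
import csv
import io


def fraction_true_by_column(handle, has_row_names=False):
    """Return {column label: fraction of TRUE entries}, skipping NA cells.

    Assumes the first CSV row is a header. Set has_row_names=True if the
    file was written with a leading column of row names.
    """
    reader = csv.reader(handle)
    header = next(reader)
    start = 1 if has_row_names else 0
    labels = header[start:]
    true_counts = [0] * len(labels)
    totals = [0] * len(labels)
    for row in reader:
        for i, cell in enumerate(row[start:]):
            value = cell.strip().upper()
            if value not in ("TRUE", "FALSE"):
                continue  # NA or empty: not counted
            totals[i] += 1
            if value == "TRUE":
                true_counts[i] += 1
    return {
        label: (count / total if total else None)
        for label, count, total in zip(labels, true_counts, totals)
    }


# Made-up example: two parameter combinations, four simulations each.
example = io.StringIO(
    "V1,V2\n"
    "TRUE,FALSE\n"
    "TRUE,TRUE\n"
    "FALSE,FALSE\n"
    "TRUE,FALSE\n"
)
result = fraction_true_by_column(example)
```

Here result maps V1 to 0.75 and V2 to 0.25. To use a real file, pass an open file handle (for example, one opened with newline="") in place of the StringIO object.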
The second class of data files records population variables over time for 1000 simulations at a single parameter combination. Each column represents a different simulation and each row within a column represents a single time point within that simulation. The output variables were recorded every year (52 time steps) rather than every time step to avoid producing very large output files. If devil or DFTD extirpation occurs in a given simulation, NAs are recorded for all time points following extirpation (except in “PopMat” files, which continue to record devil population size).
Note: The script that produces these outputs (Evolutionary_Dynamic_Sims_Driver_1.1.R) runs some of the simulations at multiple parameter values, resulting in data files whose rows are a multiple of 1000. The first parameter combination represents the mean of all selected parameters and corresponds to the data used in the main manuscript. The simulations at the extra parameter combinations are part of a robustness check that was never implemented. These extraneous results are discarded in the Combined-Results-Code-Cleaned.R script that generates the main results.
PopMat - Each entry gives the population size.
PrevMat - Each entry gives the DFTD prevalence rate.
MuGeno - Each entry gives the mean genotype for a given devil trait.
MuGenoN - Each entry gives the mean net phenotype for a given devil-DFTD trait pair. Net phenotype is defined as devil phenotype - DFTD phenotype and is defined only for infected devils.
MuGenoT - Each entry gives the mean genotype for a given DFTD trait.
SigmaGeno - Each entry gives the genotypic variance for a given devil trait.
SigmaGenoN - Each entry gives the variance in net phenotype for a given devil-DFTD trait pair.
SigmaGenoT - Each entry gives the genotypic variance for a given DFTD trait.
Latent genotypes and phenotypes are unitless.
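As an illustration of how the time-series files can be summarized, the Python sketch below averages each recorded time point across simulations while ignoring the NA entries that follow extirpation. It is not part of the repository. The function name mean_trajectory is hypothetical, and the sketch assumes a header row of simulation labels and no row-name column. The example values are made up.

```python
import csv
import io


def mean_trajectory(handle):
    """Average each time point (row) across simulations (columns).

    NA and empty cells are ignored. A time point with no usable values
    yields None. Assumes the first CSV row is a header.
    """
    reader = csv.reader(handle)
    next(reader)  # skip the assumed header row of simulation labels
    means = []
    for row in reader:
        values = [float(c) for c in row if c.strip() not in ("NA", "")]
        means.append(sum(values) / len(values) if values else None)
    return means


# Made-up example: three simulations recorded at three time points.
example = io.StringIO(
    "sim1,sim2,sim3\n"
    "10,20,30\n"
    "20,NA,40\n"
    "NA,NA,NA\n"
)
trajectory = mean_trajectory(example)
```

Here trajectory is [20.0, 30.0, None]. Consecutive rows are 52 time steps (one year) apart. Averages taken after some simulations have ended describe only the surviving simulations, so they should be read alongside the corresponding StopTimes file.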
This dataset contains the R and C++ code for the individual-based model used in the study "Coevolution promotes the coexistence of Tasmanian devils and a fatal, transmissible cancer". It also contains the R scripts for parameterizing the model and running the model in each analysis used in the study, the output files for all of these scripts, and the R script for generating the figures used in the study. See the README for further details.