The implications of lineage-specific rates for divergence time estimation
Data files
Dec 02, 2019 version files 70.80 KB
- ALT_inference_loop.Rev (149 B)
- ALT_simple_heterogeneity_script.Rev (2.48 KB)
- ALT_simulation_study_version_five.py (2.39 KB)
- ALT_three_taxon_tree.tre (26 B)
- COMP_dirichlet_heterogeneity.Rev (4.26 KB)
- COMP_inference_loop_dirichlet.Rev (202 B)
- COMP_inference_loop.Rev (181 B)
- COMP_simulation_loop.R (4.22 KB)
- COMP_simulation_study_version_five.py (3.44 KB)
- COMP_standard_simple_heterogeneity_script.Rev (2.39 KB)
- DOU_heterogeneity_script_rate_prior_variance_doubled.Rev (2.35 KB)
- DOU_inference_loop.Rev (248 B)
- DOU_simple_clock_script.Rev (1.96 KB)
- DOU_simple_heterogeneity_script.Rev (2.30 KB)
- DOU_simulation_loop.R (3.46 KB)
- DOU_simulation_study_version_five.py (2.95 KB)
- inference_loop.Rev (307 B)
- QUAD_heterogeneity_script_rate_prior_variance_doubled.Rev (2.35 KB)
- QUAD_inference_loop.Rev (248 B)
- QUAD_simple_clock_script.Rev (1.96 KB)
- QUAD_simple_heterogeneity_script.Rev (2.30 KB)
- QUAD_simulation_loop.R (3.46 KB)
- QUAD_simulation_study_version_five.py (2.76 KB)
- READ_ME.txt (651 B)
- simple_clock_script.Rev (1.59 KB)
- simple_heterogeneity_script.Rev (1.96 KB)
- simulate_sequences.R (2.20 KB)
- simulate_tree.R (562 B)
- simulation_study_version_five.py (2.24 KB)
- SISTER_GROUP_COMPARISON_OF_TREE.R (2.90 KB)
- SUMM_inference_loop.Rev (185 B)
- SUMM_simple_clock_script.Rev (1.96 KB)
- SUMM_simple_heterogeneity_script.Rev (2.30 KB)
- SUMM_simulation_loop.R (4.71 KB)
- SUMM_simulation_study_version_five.py (3.17 KB)
Jan 08, 2020 version files 254.37 KB
Abstract
Rate variation adds considerable complexity to divergence time estimation in molecular phylogenies. Here, we evaluate the impact of lineage-specific rates, which we define as among-branch rate variation that acts consistently across the entire genome. We compare their impact to that of residual rates, defined as among-branch rate variation that shows a different pattern at each sampled locus, and gene-specific rates, defined as variation in the average rate across all branches at each sampled locus. We show that lineage-specific rates lead to erroneous divergence time estimates, regardless of how many loci are sampled. Further, we show that stronger lineage-specific rates lead to increasing error. This contrasts with residual rates and gene-specific rates, where sampling more loci significantly reduces error. If divergence times are inferred in a Bayesian framework, we highlight that error caused by lineage-specific rates significantly reduces the probability that the 95% highest posterior density includes the correct value, and leads to sensitivity to the prior. Use of a more complex rate prior, which has recently been proposed to model rate variation more accurately, does not affect these conclusions. Finally, we show that the scale of lineage-specific rates used in our simulation experiments is comparable to that of an empirical data set for the angiosperm genus Ipomoea. Taken together, our findings demonstrate that lineage-specific rates cause error in divergence time estimates, and that this error is not overcome by analyzing genomic-scale multilocus data sets.
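
The three kinds of rate variation defined in the abstract can be illustrated with a small Python sketch. This is not the code from the scripts listed above. It assumes, purely for illustration, that the rate on a branch at a locus is the product of a gene-specific multiplier, a lineage-specific multiplier, and a residual multiplier, and that the gene-specific and residual multipliers are lognormal with mean 1. The function names and parameters (`simulate_branch_rates`, `gene_sigma`, `residual_sigma`) are hypothetical.

```python
import random
from statistics import mean


def unit_mean_lognormal(rng, sigma):
    """Draw a positive multiplier whose expected value is 1."""
    if sigma == 0:
        return 1.0
    # A lognormal with mu = -sigma^2 / 2 has mean exp(mu + sigma^2 / 2) = 1.
    return rng.lognormvariate(-0.5 * sigma * sigma, sigma)


def simulate_branch_rates(lineage_rates, n_loci, gene_sigma, residual_sigma, seed=0):
    """Return an n_loci x n_branches table of relative rates.

    Assumed (illustrative) model for the rate on branch b at locus l:
        rate[l][b] = gene_rate[l] * lineage_rates[b] * residual[l][b]
    - gene_rate[l]     : one draw per locus, shared by all branches
    - lineage_rates[b] : fixed per branch, identical at every locus
    - residual[l][b]   : an independent draw for every locus/branch pair
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_loci):
        gene_rate = unit_mean_lognormal(rng, gene_sigma)
        rates.append([
            gene_rate * lineage * unit_mean_lognormal(rng, residual_sigma)
            for lineage in lineage_rates
        ])
    return rates


def mean_rate_per_branch(rates):
    """Average each branch's rate across all simulated loci."""
    return [mean(column) for column in zip(*rates)]


if __name__ == "__main__":
    lineage_rates = [0.5, 1.0, 2.0]  # hypothetical lineage-specific multipliers
    for n_loci in (10, 100, 10000):
        rates = simulate_branch_rates(lineage_rates, n_loci, 0.3, 0.3, seed=1)
        print(n_loci, [round(r, 3) for r in mean_rate_per_branch(rates)])
```

Under these assumptions, averaging over more loci shrinks the noise from the gene-specific and residual multipliers, but each branch's mean rate settles near its lineage-specific multiplier rather than near a common value. This mirrors the abstract's point that sampling more loci does not remove the effect of lineage-specific rates.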