Dryad

Scalable Bayesian divergence time estimation with ratio transformations

Data files

Jun 02, 2023 version, files: 1.19 GB

Abstract

Divergence time estimation is crucial to provide temporal signals for dating biologically important events, from species divergence to viral transmissions in space and time. With the advent of high-throughput sequencing, recent Bayesian phylogenetic studies have analyzed hundreds to thousands of sequences. Such large-scale analyses challenge divergence time reconstruction because they require inference on highly correlated internal node heights, which often becomes computationally infeasible. To overcome this limitation, we explore a ratio transformation that maps the original N - 1 internal node heights into a space of one height parameter and N - 2 ratio parameters. To make the analyses scalable, we develop a collection of linear-time algorithms to compute the gradient and Jacobian-associated terms of the log-likelihood with respect to these ratios. We then apply Hamiltonian Monte Carlo sampling with the ratio transform in a Bayesian framework to learn the divergence times in four pathogenic viruses (West Nile virus, rabies virus, Lassa virus and Ebola virus) and the coralline red algae. Our method both resolves a mixing issue in the West Nile virus example and improves inference efficiency by at least 5-fold for the Lassa and rabies virus examples as well as for the algae example. Our method now also makes it computationally feasible to incorporate mixed-effects molecular clock models for the Ebola virus example, confirms the findings from the original study and reveals clearer multimodal distributions of the divergence times of some clades of interest.
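
The ratio transformation described in the abstract can be illustrated with a minimal Python sketch. It makes a simplifying assumption that is not stated in the abstract: all tips are sampled at the same time (height 0), so each non-root internal node's ratio is simply its height divided by its parent's height. Datasets with serially sampled tips need a more careful definition that this sketch does not cover. The function names and the dictionary-based tree representation are hypothetical and chosen for illustration only; they are not taken from the deposited files.

```python
import math

# Illustrative sketch only. ASSUMPTION: all tips have height 0, so
# ratio(node) = height(node) / height(parent(node)) lies in (0, 1).
# `parent` maps each non-root internal node to its parent internal node.

def heights_to_ratios(parent, heights, root):
    """Map N - 1 internal node heights to (root height, N - 2 ratios)."""
    ratios = {
        node: h / heights[parent[node]]
        for node, h in heights.items()
        if node != root
    }
    return heights[root], ratios

def ratios_to_heights(parent, root_height, ratios, root):
    """Inverse map: rebuild every internal node height from the ratios."""
    heights = {root: root_height}

    def resolve(node):
        # Memoised walk towards the root; each node is computed once.
        if node not in heights:
            heights[node] = ratios[node] * resolve(parent[node])
        return heights[node]

    for node in ratios:
        resolve(node)
    return heights

def log_jacobian(parent, heights, root):
    """Log |det J| of the map (root height, ratios) -> heights.

    Under the assumption above the Jacobian is triangular in pre-order,
    with d height(node) / d ratio(node) = height(parent(node)).
    """
    return sum(math.log(heights[parent[n]]) for n in heights if n != root)

# Hypothetical 4-tip tree with three internal nodes: root -> a -> b.
parent = {"a": "root", "b": "a"}
heights = {"root": 10.0, "a": 4.0, "b": 1.0}

root_height, ratios = heights_to_ratios(parent, heights, "root")
recovered = ratios_to_heights(parent, root_height, ratios, "root")
```

In this toy tree the three heights become one root height and two ratios, and the ratios are free to vary independently in (0, 1) without violating the ordering constraint that a node must be younger than its parent. The log-Jacobian term is what a sampler working in ratio space would add to the log-posterior density to account for the change of variables.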