Estimating the time since admixture from phased and unphased molecular data
Janzen, Thijs (2021), Estimating the time since admixture from phased and unphased molecular data, Dryad, Dataset, https://doi.org/10.5061/dryad.xwdbrv1c5
After admixture, recombination breaks down genomic blocks of contiguous ancestry. The breakdown of these blocks forms a new 'molecular clock' that ticks at a much faster rate than the mutation clock, enabling accurate dating of admixture events in the recent past. However, existing theory on the breakdown of these blocks, or the accumulation of delineations between blocks, so-called 'junctions', has mostly been limited to using regularly spaced markers on phased data. Here, we present an extension to the theory of junctions using the Ancestral Recombination Graph that describes the expected number of junctions for any distribution of markers along the genome. Furthermore, we provide a new framework to infer the time since admixture using unphased data. We demonstrate both the phased and unphased methods on simulated data and show that our new extensions have improved accuracy with respect to previous methods, especially for smaller population sizes and more ancient admixture times. Lastly, we demonstrate the applicability of our method on three empirical datasets, including lab crosses of yeast (Saccharomyces cerevisiae) and two case studies of hybridization in swordtail fish and Populus trees.
This repository contains all code used for the manuscript, including code to simulate the underlying data, analyze the empirical data, visualize the results, and reproduce the figures from the main text. The code is organized per figure, and each folder is named according to the associated figure. See associated Zenodo Related Works for code.
Each folder typically contains 3 files:
data.zip contains the underlying (often simulated) data
simulate.R contains the scripts used to generate data.zip
plot_figure.R contains the code to summarise the simulated data and create the figure found in the main text.
In some cases, additional scripts can be found to analyze the data, in particular for the three empirical datasets.
The raw data used in the empirical data analysis is not included here, but can be found in the cited references or via the additional README files in the relevant folders.
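As an illustration of the per-figure layout described above, the following minimal sketch shows how one folder might be used to reproduce a figure. The folder name "Figure_1" is a hypothetical placeholder (the actual folders are named after their figures), and the sketch assumes that plot_figure.R reads its input via paths relative to its own folder, which may not hold for every script.

```r
# Hypothetical folder name; substitute the folder of the figure to reproduce.
figure_dir <- "Figure_1"

# Extract the bundled (often simulated) data into the same folder.
unzip(file.path(figure_dir, "data.zip"), exdir = figure_dir)

# Run the plotting script. chdir = TRUE temporarily switches the working
# directory to the script's folder, so relative paths inside the script
# resolve against that folder (assumption: the script uses relative paths).
source(file.path(figure_dir, "plot_figure.R"), chdir = TRUE)
```

To regenerate the data instead of using data.zip, simulate.R in the same folder can be sourced in the same way; depending on the figure, this may take considerably longer than plotting.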
Furthermore, the code for the junctions R package was amended as well. The package can be installed either from CRAN using install.packages("junctions"), or from the local tar file included in this Dryad repository.
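The two installation routes mentioned above could look as follows. This is a sketch only: the tar file name "junctions.tar.gz" is a placeholder, since the exact file name is not given here, and should be replaced by the name of the file in this repository.

```r
# Option 1: install the released version from CRAN.
install.packages("junctions")

# Option 2: install from the local source archive in this repository.
# "junctions.tar.gz" is a placeholder file name (assumption).
install.packages("junctions.tar.gz", repos = NULL, type = "source")

# Load the package to confirm that the installation succeeded.
library(junctions)
```

Installing from the local archive gives the version used for the manuscript, whereas the CRAN version may have been updated since.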
For completeness, the Supplementary file available via the journal's website has been added here as well.