Data from: Amazonian rivers are leaky barriers to gene flow in forest understory birds
Data files
May 28, 2024 version files 66.77 KB
Abstract
Ever since Alfred Russel Wallace’s nineteenth-century observation that related terrestrial species are often separated on opposing riverbanks, major Amazonian rivers have been recognized as key drivers of speciation. However, rivers are dynamic entities whose widths and courses may vary through time. It thus remains unknown how effective rivers are at reducing gene flow and promoting speciation over long timescales. We fit demographic models to genomic sequence to reconstruct the history of gene flow in three pairs of avian taxa fully separated by different Amazonian rivers, and whose geographic ranges do not make contact in headwater regions. Models with gene flow were best fit, but still supported an initial period without any gene flow which ranged from 187,000 to over 959,000 years, suggesting that rivers are capable of initiating speciation through long stretches of allopatric divergence. Allopatry was followed by either bursts or prolonged episodes of gene flow that retarded genomic differentiation but did not homogenize populations. Our results support Amazonian rivers as key barriers that promoted speciation and the buildup of species richness, but they also suggest that river barriers are often leaky, with genomic divergence accumulating slowly due to episodes of substantial gene flow.
README: Data from: Amazonian rivers are leaky barriers to gene flow in forest understory birds
https://doi.org/10.5061/dryad.z34tmpgnr
Our paper estimates the role of gene flow between three pairs of avian taxa separated by Amazonian rivers. For each taxon pair, we fit 15 demographic models to the site frequency spectrum using fastSIMCOAL2 v27 (Excoffier et al. 2013). Here we include the site frequency spectrum files (MODEL_MSFS.obs), demographic model description files (MODEL.tpl) and parameter files (MODEL.est) required to run fastSIMCOAL2 for model testing and parameter estimation.
Description of the data and file structure
The zipped directory "Site_Frequency_Spectra.zip" contains three folders, each named after one of the three species analyzed. Inside each are two folders that contain the site frequency spectrum file (MODEL_MSFS.obs) for a two-population model (located in the directory “2POP”) and a three-population model (“3POP”). In the two-population model, the SFS contains only the two focal sister taxa; in the three-population model, it also contains their sister species.
The zipped directory "FastSIMCOAL2_MODELS.zip" contains all the model files needed to define demographic models in fastSIMCOAL2. This directory contains two subdirectories: "Two_population_models" contains the two-population demographic models shown in Figure 3 of the paper, and "Three_population_models" contains the three-population models used in Table S1. Each model is defined by a file with a .tpl extension, which specifies the model, and a file with an .est extension, which specifies the values of the parameters named in the .tpl file.
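As a convenience, the pairing of model files described above can be checked with a short script. The following is a minimal sketch, assuming the archive has been unzipped and that each model's two files share a base name (MODEL.tpl and MODEL.est, as in the run command). The function name pair_model_files is hypothetical and not part of the dataset.

```python
from pathlib import Path


def pair_model_files(model_dir):
    """Map each model to its (.tpl, .est) pair, skipping incomplete models."""
    model_dir = Path(model_dir)
    pairs = {}
    # Search subdirectories too (e.g. "Two_population_models").
    for tpl in sorted(model_dir.rglob("*.tpl")):
        est = tpl.with_suffix(".est")
        if est.is_file():
            # Key by path relative to model_dir, without the extension,
            # so models with the same name in different folders stay distinct.
            key = tpl.relative_to(model_dir).with_suffix("").as_posix()
            pairs[key] = (tpl, est)
    return pairs
```

Calling pair_model_files on the unzipped directory returns a dictionary keyed by relative model path. A .tpl file without a matching .est file is left out, which makes missing files easy to spot.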
FastSIMCOAL2 populations in the .tpl files are numbered beginning with 0. Across the three species complexes analyzed, population 0 refers to Dendrocincla fuliginosa rufoolivacea, which is found south of the Amazon River and east of the Tapajos/Teles Pires river system; an undescribed subspecies of Xiphorhynchus spixii found west of the Xingu River; and an undescribed subspecies of Willisornis vidua nigrigula found west of the lower Tapajos River. Population 1 refers to Dendrocincla fuliginosa fuliginosa, which is found north of the Amazon River; an undescribed subspecies of Xiphorhynchus spixii found east of the Xingu River; and an undescribed subspecies of Willisornis vidua nigrigula found east of the Tapajos/Teles Pires river system. Population 2 refers to the sister species of populations 0 and 1: Dendrocincla [fuliginosa] atrirostris (which we consider to be a biological and phylogenetic species that is distinct from D. fuliginosa but which has not officially been split); Xiphorhynchus elegans; and Willisornis guttatus. Site frequency spectra have the same population order as the fastSIMCOAL2 files.
.tpl files have parameters as follows:
NPOP0_TIPS: Effective population size for population 0
NPOP1_TIPS: Effective population size for population 1
NPOP2_TIPS: Effective population size for population 2
ANCSIZE: Effective population size for the ancestral population
M01A, M10A, M02A, M20A, etc.: Migration rate parameters between pairs of populations. For example, M01A is the per capita migration rate from population 0 to population 1 going backward in time (i.e., towards the point of coalescence). Equivalently, it is the rate from population 1 to population 0 forward in time.
TIME3, TIME2, etc.: The timing (in generations from the present going back in time) of switches between migration rate matrices.
TDIV, TDIV1, TDIV2: Divergence times. TDIV is the divergence time between the two taxa of each pair in our two-population models; TDIV2 is the same divergence in our three-population models. TDIV1 is the divergence between the common ancestor of each taxon pair and its sister species in our three-population models.
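The direction convention for migration parameters and the generation-based time units are easy to misread. The sketch below gives two small helpers for interpreting estimates. It assumes that migration parameter names follow the pattern "M", two single-digit population indices, and an optional trailing letter; treating that letter as a label for the migration matrix is our assumption. The function names are hypothetical, and the generation time is not given in this README, so it must be supplied by the user.

```python
import re

# Assumed naming pattern: "M", two population indices, optional letter label.
_MIG_NAME = re.compile(r"^M(\d)(\d)([A-Z]?)$")


def describe_migration(name):
    """Return the direction of movement encoded by a name such as 'M01A'."""
    match = _MIG_NAME.match(name)
    if match is None:
        raise ValueError(f"not a migration parameter name: {name!r}")
    i, j = int(match.group(1)), int(match.group(2))
    return {
        "backward": (i, j),  # lineages move i -> j going back in time
        "forward": (j, i),   # migrants move j -> i going forward in time
        "label": match.group(3),
    }


def generations_to_years(generations, generation_time_years):
    """Convert a time in generations (TIME*, TDIV*) to years."""
    return generations * generation_time_years
```

For example, describe_migration("M01A")["forward"] is (1, 0), meaning that forward in time the parameter describes movement from population 1 into population 0.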
.est files define sampling distributions for each of the model parameters in the .tpl file. For additional details see the manual for fastSIMCOAL2.
Code/Software
The command used to run fastsimcoal2 (Excoffier et al. 2013) on each model is:
fasc27093 -t MODEL.tpl -e MODEL.est --msfs --multiSFS --maxlhood --brentol 0.001 -M -q -x -n 200000 --numloops 60 -c2 -B2
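To run the same command over many models, the argument list can be built programmatically. This is a minimal sketch that reproduces the flags above unchanged. The function names are hypothetical, the executable name is taken from the command as written and may differ on your system, and we assume that the matching MODEL_MSFS.obs file has been copied next to the .tpl and .est files.

```python
import subprocess


def build_fsc_command(model, executable="fasc27093"):
    """Return the argument list for one model, mirroring the command above."""
    return [
        executable,
        "-t", f"{model}.tpl",
        "-e", f"{model}.est",
        "--msfs", "--multiSFS", "--maxlhood",
        "--brentol", "0.001",
        "-M", "-q", "-x",
        "-n", "200000",
        "--numloops", "60",
        "-c2", "-B2",
    ]


def run_model(model, workdir, executable="fasc27093"):
    """Run one model inside workdir; raises CalledProcessError on failure."""
    cmd = build_fsc_command(model, executable)
    return subprocess.run(cmd, cwd=workdir, check=True)
```

Passing the arguments as a list avoids shell quoting issues, and run_model sets the working directory so that relative file names resolve to the model folder.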
References
Excoffier, L., Dupanloup, I., Huerta-Sánchez, E., Sousa, V. C., & Foll, M. (2013). Robust demographic inference from genomic and SNP data. PLoS Genetics, 9(10), e1003905. doi: 10.1371/journal.pgen.1003905
Methods
These are model files used for fastSIMCOAL2.