Data from: The genomic view of diversification

Cite this dataset

Marin, Julie; Achaz, Guillaume; Crombach, Anton; Lambert, Amaury (2020). Data from: The genomic view of diversification [Dataset]. Dryad.


The process of species diversification is traditionally summarized by a tree, the species tree, whose reconstruction from molecular data is hindered by frequent conflicts between gene genealogies. Here, we argue that instead of seeing these conflicts as nuisances, we can exploit them to inform the diversification process itself. We adopt a gene-based view of diversification to model the ubiquitous presence of gene flow among diverging lineages, one of the most important processes explaining disagreements among gene trees. We propose a new framework for modeling the joint evolution of gene and species lineages that relaxes the hierarchy between the species tree and gene trees inherent to the standard view, as embodied in a popular model known as the multi-species coalescent (MSC). We implement this framework in two alternative models, called the gene-based diversification (GBD) models: (1) GBD-forward, which follows all evolving genomes through time, and (2) GBD-backward, which is based on coalescent theory. Both feature four parameters tuning colonization, gene flow, genetic drift and genetic differentiation. We propose a quick inference method based on differences between gene trees. Applied to two empirical datasets prone to gene flow, we find better support for the GBD-backward model than for the MSC model. Along with the increasing awareness of the extent of gene flow, this work shows the importance of considering the richer signal contained in genomic histories, rather than in the mere species tree, to better understand the complex evolutionary history of species.
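To give a concrete sense of what "differences between gene trees" can mean, here is a minimal Python sketch that counts the clades not shared by two rooted gene trees (a rooted Robinson-Foulds-style distance). This measure is an assumption chosen for illustration and is not necessarily the one used by the authors' inference method. The nested-tuple tree encoding, the function names and the three toy trees are hypothetical.

```python
def clades(tree):
    """Return the clades of a rooted tree as a set of frozensets of leaf names.

    Assumed encoding (hypothetical): a leaf is a string, an internal node is
    a tuple of child subtrees.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        # Leaves under an internal node are the union of its children's leaves.
        leaves = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaves)
        return leaves

    leaves_below(tree)
    return found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Toy gene trees on the same four taxa (made-up data).
gene_tree_1 = ((("A", "B"), "C"), "D")
gene_tree_2 = ((("A", "C"), "B"), "D")
gene_tree_3 = (("A", "B"), ("C", "D"))

print(clade_distance(gene_tree_1, gene_tree_2))
print(clade_distance(gene_tree_1, gene_tree_3))
print(clade_distance(gene_tree_2, gene_tree_3))
```

A distance of zero means the two genealogies have the same topology. Larger values indicate more conflict between gene trees, which is the kind of signal the abstract proposes to exploit rather than discard.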

Usage notes

Instructions to use the GBD-backward and GBD-forward models are given in the file.