Describing, understanding and predicting the spatial distribution of genetic diversity is a central issue in biological sciences. In river landscapes, it is generally predicted that neutral genetic diversity should increase downstream, but there have been few attempts to test and validate this assumption across taxonomic groups. Moreover, it remains unclear which evolutionary processes may generate this apparent spatial pattern of diversity. Here, we quantitatively synthesized published results from diverse taxa living in river ecosystems, and we performed a meta-analysis to show that a downstream increase in intraspecific genetic diversity (DIGD) constitutes a general spatial pattern of biodiversity that is repeatable across taxa. We further demonstrated that DIGD was stronger for strictly waterborne dispersing species than for overland dispersing species. However, for a restricted data set focusing on fishes, there was no evidence that DIGD was related to particular species traits. We then searched for general processes underlying DIGD by simulating genetic data in dendritic-like river systems. Simulations revealed that each of the three processes we considered (downstream-biased dispersal, increase in habitat availability downstream and upstream-directed colonization) might generate DIGD. Using random forest models, we identified from simulations a set of highly informative summary statistics that discriminate among the processes causing DIGD. Finally, combining these discriminant statistics and approximate Bayesian computation on a set of twelve empirical case studies, we hypothesized that DIGD was most likely due to the interaction of two of these three processes and that, contrary to expectation, it was not solely caused by downstream-biased dispersal.
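As an illustration of the model-choice step mentioned in the abstract, the following is a minimal sketch of rejection-based approximate Bayesian computation in Python. It is not the pipeline used in the article: the function names, the Euclidean distance on standardized statistics and the tolerance value are all assumptions made for this example. The proportions it returns approximate posterior model probabilities only if every model contributes the same number of simulations (equal priors).

```python
import math
from collections import Counter


def column_means_and_sds(rows):
    """Return per-column means and standard deviations of equal-length rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = []
    for j in range(k):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        sds.append(math.sqrt(var) or 1.0)  # constant column: avoid dividing by 0
    return means, sds


def abc_model_choice(sim_stats, sim_labels, observed, tolerance=0.01):
    """Rejection ABC for model choice (hypothetical helper, illustration only).

    sim_stats  : list of summary-statistic vectors, one per simulation
    sim_labels : model name attached to each simulation
    observed   : summary-statistic vector of the empirical data set
    tolerance  : fraction of simulations retained (assumed value)
    """
    means, sds = column_means_and_sds(sim_stats)

    def scale(vector):
        return [(x - m) / s for x, m, s in zip(vector, means, sds)]

    obs = scale(observed)
    # Distance between each simulation and the observed data, with its label.
    distances = [
        (math.dist(scale(row), obs), label)
        for row, label in zip(sim_stats, sim_labels)
    ]
    distances.sort(key=lambda pair: pair[0])

    # Keep the closest simulations and count how many came from each model.
    n_keep = max(1, int(len(distances) * tolerance))
    accepted = Counter(label for _, label in distances[:n_keep])
    return {label: count / n_keep for label, count in accepted.items()}
```

With the simulated summary statistics of the candidate models stacked into `sim_stats` and the statistics of one empirical case study passed as `observed`, the returned dictionary gives the share of accepted simulations per model.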

Simulated data from the gene-flow model

Parameter values and summary statistics for simulations generated under the gene-flow model

Data_gene-flow_model.txt

Simulated data from the habitat availability model

Parameter values and summary statistics for simulations generated under the habitat availability model

Data_habitat-availability_model.txt

Simulated data from the colonization model

Parameter values and summary statistics for simulations generated under the colonization model

Data_colonization_model.txt

Simulated data from the gene-flow / habitat model

Parameter values and summary statistics for simulations generated under the gene-flow / habitat model

Data_gene-flow-habitat_model.txt

Simulated data from the gene-flow / colonization model

Parameter values and summary statistics for simulations generated under the gene-flow / colonization model

Data_gene-flow-colonization_model.txt

Simulated data from the habitat / colonization model

Parameter values and summary statistics for simulations generated under the habitat / colonization model

Data_habitat-colonization_model.txt

Simulated data from the gene-flow / habitat / colonization model

Parameter values and summary statistics for simulations generated under the gene-flow / habitat / colonization model

Data_gene-flow-habitat-colonization_model.txt

Simulated data from the NULL model

Parameter values and summary statistics for simulations generated under the NULL model

Data_NULL_model.txt
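The eight files above can be read into a single labelled table before any downstream analysis. The sketch below assumes that each file is a whitespace-delimited text table with one header row and numeric columns only; this layout is an assumption, not something stated in this record, so check the files (and the readme) before relying on it. The function name and the model labels are hypothetical; the file names are those listed above.

```python
import io

# File names as listed in this record; the dictionary keys are labels chosen
# for this example only.
MODEL_FILES = {
    "gene-flow": "Data_gene-flow_model.txt",
    "habitat": "Data_habitat-availability_model.txt",
    "colonization": "Data_colonization_model.txt",
    "gene-flow/habitat": "Data_gene-flow-habitat_model.txt",
    "gene-flow/colonization": "Data_gene-flow-colonization_model.txt",
    "habitat/colonization": "Data_habitat-colonization_model.txt",
    "gene-flow/habitat/colonization": "Data_gene-flow-habitat-colonization_model.txt",
    "NULL": "Data_NULL_model.txt",
}


def read_simulation_table(handle, model):
    """Parse one simulation table into a list of dicts tagged with its model.

    Assumes (unverified) a header row followed by whitespace-delimited
    numeric values, one simulation per line.
    """
    header = handle.readline().split()
    records = []
    for line_number, line in enumerate(handle, start=2):
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        if len(fields) != len(header):
            raise ValueError(
                f"line {line_number}: expected {len(header)} fields, "
                f"found {len(fields)}"
            )
        record = {name: float(value) for name, value in zip(header, fields)}
        record["model"] = model
        records.append(record)
    return records


def read_all_models(directory="."):
    """Stack the eight tables into one list of labelled records."""
    records = []
    for model, filename in MODEL_FILES.items():
        with open(f"{directory}/{filename}", encoding="utf-8") as handle:
            records.extend(read_simulation_table(handle, model))
    return records
```

Keeping the model label on every record makes it straightforward to split the stacked table again, for example into the statistics and labels expected by a classifier or by a rejection step.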

Scripts and data for meta-analyses

Scripts and data used to perform the meta-analyses

Scripts_meta-analyses_v2.zip

Scripts for simulating data under the eight models

This file contains the .est and .par input files used to simulate the genetic datasets analysed in this article. There are eight pairs of .est/.par files, each associated with one of the eight models presented in the article. The event at 40,000 generations before present (i.e., going backwards in time, all genes in the network were sent back to a single deme at this date) was used to make coalescence times uniform across simulations and models. Please check the readme file for additional guidance.

Scripts_par_est_MODELS.zip

Demes IDs, and equivalences article vs. scripts

(i) Figure showing the ID of each deme according to its spatial position in the network. This figure helps readers determine which deme each summary statistic in the simulated datasets shared in DRYAD corresponds to. (ii) Table reporting the equivalences between the parameter names used in the article and those used in the .est and .par scripts shared in DRYAD (i.e., the scripts that were used to simulate genetic data with ABCSampler and SIMCOAL 2). We also report the description of each parameter and the prior parameter values we used.

Demes_and_parameter_equivalences.docx

Data from: Evolutionary processes driving spatial patterns of intra-specific genetic diversity in river ecosystems

