
Population collapse in viviparid gastropods of the Lake Victoria ecoregion started before the Last Glacial Maximum

Cite this dataset

Van Bocxlaer, Bert et al. (2020). Population collapse in viviparid gastropods of the Lake Victoria ecoregion started before the Last Glacial Maximum [Dataset]. Dryad. https://doi.org/10.5061/dryad.q83bk3jg2

Abstract

For the purpose of reproducibility, we provide here the datasets and R script supporting the analyses of the paper “Population collapse in viviparid gastropods of the Lake Victoria ecoregion started before the Last Glacial Maximum” by Van Bocxlaer et al. This paper was accepted for publication in Molecular Ecology on 31 July 2020. In this study, we examine the population structure of the clade of Bellamya gastropods that occupies the Lake Victoria ecoregion, with the aim of relating past environmental change to demography and diversification dynamics. The datasets provided here include 1) an alignment of a fragment of the gene cytochrome c oxidase subunit 1 for 60 specimens; 2) genotype data for 321 individuals from 39 localities for 15 microsatellite loci (total dataset); and 3) a regrouped genotype dataset (282 specimens from 21 localities for 15 microsatellite loci), which was used for some analyses in our study. Analyses were performed with various programs as reported in our paper. Here we provide input and result files (33 files in total) for these analyses, complemented with an R script that allows most of our analyses to be reproduced.

Methods

Methodological information is provided in the Material and Methods section of “Population collapse in viviparid gastropods of the Lake Victoria ecoregion started before the Last Glacial Maximum” by Van Bocxlaer et al., accepted for publication in Molecular Ecology.

Usage notes

Here we briefly describe the 33 provided files. In what follows, we refer to “Population collapse in viviparid gastropods of the Lake Victoria ecoregion started before the Last Glacial Maximum” by Van Bocxlaer et al., accepted for publication in Molecular Ecology, as “our paper”.

VanBocxlaer_et_al_Bellamya_LakeVictoriaEcoregion_Rscripts.R

This file contains all R scripts used to analyze the viviparid data with respect to the aims of our paper. It also contains full session information to maximize reproducibility.

Bellamya_COI_alignment.fas

This dataset contains the alignment for 60 specimens of a 593-basepair fragment of the gene cytochrome c oxidase subunit 1 that was used for network analysis as reported in our paper. The sequence codes provided in this file correspond to those reported in Table S1 of our paper. This table also links to NCBI GenBank accession numbers.
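A quick sanity check on an alignment file of this kind is to confirm the number of sequences and that they all share one length. The Python sketch below illustrates this on toy FASTA text; it is not part of the R script in this dataset, and the function names `read_fasta` and `alignment_length` as well as the toy records are assumptions made for illustration.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict: sequence code -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The header line (without '>') is used as the sequence code.
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            # Sequences may wrap over several lines; collect the pieces.
            records[name].append(line)
    return {code: "".join(parts) for code, parts in records.items()}


def alignment_length(records):
    """Return the common sequence length, or raise if lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


# Toy records (hypothetical, not taken from Bellamya_COI_alignment.fas).
toy = ">seqA\nACGT\nAC\n>seqB\nAC-TAC\n"
records = read_fasta(toy)
print(len(records), alignment_length(records))
```

Applied to the real file (read with `open(...).read()`), one would expect 60 records of length 593 if the file matches the description above.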

Bellamya_COI_labels.txt

This file links the sequence codes used in the abovementioned fasta file with population codes and information on waterbodies/genetic clusters as reported in our paper.

Bellamya_TotalDataset_RawGenotypeTable.txt

This file contains our total microsatellite dataset, i.e. genotype information for 321 individuals from 39 localities genotyped for 15 microsatellite loci, in simple tabular format.

Bellamya_TotalDataset_GENEPOP.txt

This file contains our total microsatellite dataset, i.e. genotype information for 321 individuals from 39 localities genotyped for 15 microsatellite loci, in GenePop format, which allows easy conversion to other formats.
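For readers unfamiliar with the GenePop layout, the following Python sketch reads the general structure: a title line, locus names (one per line or comma-separated), then `Pop` blocks in which each individual is a label, a comma, and one genotype per locus. It is a minimal illustration on toy data, not the conversion route used in our paper; the names `parse_genepop` and `split_alleles`, the toy loci and the toy genotypes are all assumptions.

```python
def parse_genepop(text):
    """Return (title, loci, pops); pops is a list of lists of (label, genotypes)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]
    loci, pops = [], []
    for ln in lines[1:]:
        if ln.lower() == "pop":
            pops.append([])          # a 'Pop' line starts a new population
        elif not pops:
            # Before the first 'Pop': locus names, one per line or comma-separated.
            loci.extend(name.strip() for name in ln.split(",") if name.strip())
        else:
            # Individual line: label , genotype genotype ...
            label, _, genotypes = ln.rpartition(",")
            pops[-1].append((label.strip(), genotypes.split()))
    return title, loci, pops


def split_alleles(genotype):
    """Split a diploid genotype such as '101103' into its two allele codes."""
    half = len(genotype) // 2
    return genotype[:half], genotype[half:]


# Toy input (hypothetical; two loci, two populations).
toy = """Toy example, not real data
LocA
LocB
Pop
ind1 , 101103 000000
ind2 , 103103 200204
Pop
ind3 , 101101 204204
"""
title, loci, pops = parse_genepop(toy)
print(loci, [len(p) for p in pops], split_alleles(pops[0][0][1][0]))
```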

Bellamya_RegroupedDataset_GENEPOP.txt

This file contains our regrouped microsatellite dataset, i.e. genotype information for 282 individuals from 21 localities genotyped for 15 microsatellite loci, in GenePop format, which allows easy conversion to other formats.

Bellamya_TotalData_GenePop_input.gen

Most analyses reported in our paper have been performed with 13 instead of 15 microsatellite loci, because two loci displayed substantial evidence for null alleles. This file contains our total microsatellite dataset excluding these two microsatellite loci (Bel_L50 and Bel_L45). The data are provided in GenePop format, which allows easy conversion to other formats.
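Excluding loci amounts to dropping the matching columns from every individual's genotype list. The Python sketch below shows one way to do this on toy data; it is an illustration only and not the procedure by which this file was produced. The locus names `Bel_L50` and `Bel_L45` come from the description above, whereas `drop_loci`, `LocA`, `LocB` and the genotypes are hypothetical.

```python
def drop_loci(loci, genotypes_by_individual, excluded):
    """Remove the excluded loci from the locus list and from each individual."""
    keep = [i for i, name in enumerate(loci) if name not in excluded]
    kept_loci = [loci[i] for i in keep]
    kept_genotypes = {
        ind: [genos[i] for i in keep]
        for ind, genos in genotypes_by_individual.items()
    }
    return kept_loci, kept_genotypes


# Toy data: four loci, one individual (hypothetical values).
loci = ["Bel_L45", "LocA", "Bel_L50", "LocB"]
genotypes = {"ind1": ["101101", "120122", "000000", "200204"]}

kept_loci, kept = drop_loci(loci, genotypes, excluded={"Bel_L50", "Bel_L45"})
print(kept_loci, kept)
```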

Bellamya_TotalData_GenePop_input_II.gen

This file is identical to “Bellamya_TotalData_GenePop_input.gen”, except that null alleles are coded as ‘999’ instead of ‘000’.
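A recoding of this kind has to be done allele by allele rather than by plain text substitution, because a string such as `100010` contains the characters `000` without containing a `000` allele. The Python sketch below illustrates the idea; it assumes three-digit allele codes (suggested by the ‘000’ and ‘999’ codes above, but not verified against the files), and `recode_missing` is a hypothetical helper, not the routine used to create this file.

```python
def recode_missing(genotype, old="000", new="999", width=3):
    """Replace allele code `old` by `new`, working in chunks of `width` digits."""
    alleles = [genotype[i:i + width] for i in range(0, len(genotype), width)]
    return "".join(new if allele == old else allele for allele in alleles)


print(recode_missing("000000"))   # both alleles recoded
print(recode_missing("101000"))   # only the second allele recoded
print(recode_missing("100010"))   # alleles 100 and 010 are left untouched

# A naive text replacement would corrupt the last genotype:
print("100010".replace("000", "999"))
```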

Bellamya_TotalData_MorphoID.txt

This file contains morphological identifications for all specimens in our total microsatellite dataset (with rows in the same order as in Bellamya_TotalData_GenePop_input.gen). A more complete overview of these data is provided in Table S2 of our paper.

Bellamya_RegroupedData_GenePop_input.gen

Most analyses reported in our paper have been performed with 13 instead of 15 microsatellite loci, because two loci displayed substantial evidence for null alleles. This file contains our regrouped microsatellite dataset excluding these two microsatellite loci (Bel_L50 and Bel_L45). The data are provided in GenePop format, which allows easy conversion to other formats.

Bellamya_RegroupedData_GenePop_input_II.gen

This file is identical to “Bellamya_RegroupedData_GenePop_input.gen”, except that null alleles are coded as ‘999’ instead of ‘000’.

Bellamya_RegroupedData_3POP_GenePop_input.gen

This file contains our regrouped microsatellite dataset (321 individuals from 39 localities genotyped for 13 microsatellite loci), now organized in 3 gene pools following the results of our analyses of spatial patterns of genetic structure.

Bellamya_All_Loci_LD.txt

This file contains our data on pairwise linkage disequilibrium among loci. The associated p-values have not yet been corrected for multiple testing (this correction is performed in our R script).

Bellaymya_Fstat_Fst_13loci.txt

This file contains pairwise population FST values for our regrouped dataset (282 individuals from 21 localities genotyped for 13 microsatellite loci) in tabular format.

Bellamya_Pairwise_FST_pvalues.txt

This file contains pairwise population FST values as a list, with the associated p-values. These p-values have not yet been corrected for multiple testing (this correction is performed in our R script).

Bellamya_Pairwise_FST_pvalues_geodist.txt

This file contains pairwise population FST values as a list, with the associated p-values (corrected for the false discovery rate) and the geographic distance between the localities of each pairwise comparison.
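To make concrete what a false discovery rate correction does to a list of p-values, the Python sketch below implements the Benjamini-Hochberg adjustment, which matches what R's `p.adjust(p, method = "BH")` computes. Whether our R script uses exactly this FDR procedure is not stated here, so treat the choice of method as an assumption; `bh_adjust` and the toy p-values are illustrative only.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (hypothetical, not from the FST tables).
toy_p = [0.01, 0.04, 0.03, 0.005]
print([round(p, 4) for p in bh_adjust(toy_p)])
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing with the raw ones.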

Bellamya_LocalityData.txt

This file contains Table S5 of our paper together with some additional information for the purpose of preparing figures in R.

Bellamya_LVG_Fi_data.txt

This file contains data on individual inbreeding coefficients (Fi) for localities within the Lake Victoria gene pool with associated information on how many morphospecies were found at each locality (‘M’ for multiple, ‘S’ for single). This information has been extracted from Table S5 of our paper.

STRUCTURE_39pop_321ind_13loci_InputData.txt

This file contains our total microsatellite dataset in the format required by the software STRUCTURE.

Structure3Groups_assignementFile_Indivs_FINAL.txt

This file contains the assignment probabilities for each individual in our total microsatellite dataset, as obtained from STRUCTURE analysis with 3 genetic clusters.
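Assignment probabilities of this kind are often summarized by giving each individual its most probable cluster and flagging individuals whose highest probability is low. The Python sketch below shows that summary on toy values. The 0.8 cutoff, the function name `assign_clusters` and the toy probabilities are assumptions for illustration; they are not the criteria used in our paper, and the layout of the actual file is not reproduced here.

```python
def assign_clusters(memberships, threshold=0.8):
    """Map each individual to (best cluster, flagged?) from its probabilities.

    Clusters are numbered from 1. An individual is flagged when its highest
    assignment probability is below `threshold` (a hypothetical cutoff).
    """
    result = {}
    for ind, probs in memberships.items():
        best = max(range(len(probs)), key=lambda k: probs[k])
        result[ind] = (best + 1, probs[best] < threshold)
    return result


# Toy assignment probabilities for 3 clusters (hypothetical values).
toy = {
    "ind1": [0.95, 0.03, 0.02],
    "ind2": [0.40, 0.55, 0.05],
}
print(assign_clusters(toy))
```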

SPAGeDi-RegroupedDataset_AllPops_GeneticInputData.txt

This file contains our regrouped microsatellite dataset (282 individuals from 21 localities, genotyped for 13 microsatellite loci) in the format required for analyses with the software SPAGeDi.

SPAGeDi-RegroupedDataset_AllPops_GeoDistances_indiv.txt

This file contains geographical distances for each pairwise comparison between two individuals in our regrouped microsatellite dataset (282 individuals from 21 localities, genotyped for 13 microsatellite loci), used by the software SPAGeDi.

SPAGeDi-RegroupedDataset_LVG_GeneticInputData.txt

This file contains the genotypic input dataset for analyses with the software SPAGeDi on the populations of the Lake Victoria gene pool (212 individuals from 16 localities, genotyped for 13 microsatellite loci).

SPAGeDi-RegroupedDataset_LVG_GeoDistances_indiv.txt

This file contains geographical distances for each pairwise comparison between two individuals in the Lake Victoria gene pool from our regrouped dataset (212 individuals from 16 localities, genotyped for 13 microsatellite loci), used by the software SPAGeDi.
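The size of such pairwise tables follows from the number of individuals: n individuals give n(n-1)/2 unordered pairs. The short Python sketch below works this out for the sample sizes quoted above. It assumes each pair is counted once and self-comparisons are excluded, which may not match how the SPAGeDi distance files are laid out; `n_pairs` is a hypothetical helper.

```python
from math import comb


def n_pairs(n):
    """Number of unordered pairs among n items, i.e. n * (n - 1) / 2."""
    return comb(n, 2)


print(n_pairs(282))  # individuals in the regrouped dataset
print(n_pairs(212))  # individuals in the Lake Victoria gene pool
print(n_pairs(21))   # localities in the regrouped dataset
```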

Bellamya_SPAGeDi_Total_Output_GeoDist_13loci.txt

This file contains the output summary of all our analyses with the software SPAGeDi.

Bellamya_InputBayesass_migration_13loci.txt

This file contains the genetic input data for conversion to the input format required by the software BayesAss.

Bayesass_InputFile.txt

This file contains the genetic input data in the format required by the software BayesAss. Results of the BayesAss analyses are reported in Table S7 of our paper.

LVictoria_pop1_Final.mss

Input dataset for Bellamya gastropods of the Lake Victoria gene pool for analysis with the software DIYABC. Given that DIYABC treats null alleles as missing data, all 15 microsatellite loci were included. This dataset is a subset compiled from the regrouped dataset.

LVictoria_pop1_Final_noADM.mss

Input dataset for Bellamya gastropods of the Lake Victoria gene pool for analysis with the software DIYABC, excluding individuals that show signs of admixture. Given that DIYABC treats null alleles as missing data, all 15 microsatellite loci were included. This dataset is a subset compiled from the regrouped dataset.

VictoriaNile_pop1_Final.mss

Input dataset for Bellamya gastropods of the Lake Kyoga gene pool for analysis with the software DIYABC. Given that DIYABC treats null alleles as missing data, all 15 microsatellite loci were included. This dataset is a subset compiled from the regrouped dataset.

LAlbertWhiteNile_pop1_Final.mss

Input dataset for Bellamya gastropods of the Lake Albert gene pool for analysis with the software DIYABC. Given that DIYABC treats null alleles as missing data, all 15 microsatellite loci were included. This dataset is a subset compiled from the regrouped dataset.

LAlbertWhiteNile_pop1_Final_noADM.mss

Input dataset for Bellamya gastropods of the Lake Albert gene pool for analysis with the software DIYABC, excluding individuals that show signs of admixture. Given that DIYABC treats null alleles as missing data, all 15 microsatellite loci were included. This dataset is a subset compiled from the regrouped dataset.

ABC_robustness_migration.txt

This file contains the results of our analysis of how migration affects approximate Bayesian computations. Each row represents the results for a single simulated dataset (33,000 datasets in total); these results are summarized in Figure S7 of our paper.

Funding

Agence Nationale de la Recherche, Award: ANR-JCJC-EVOLINK

Deutsche Forschungsgemeinschaft, Award: DFG AL 1076/5-2, DFG AL 1076/6-2

Conseil Régional des Hauts-de-France, Award: CPER-CLIMIBIO

Ministry of Higher Education and Research, Award: CPER-CLIMIBIO

European Funds for Regional Economic Development, Award: CPER-CLIMIBIO
