The role of historical biogeography in shaping colour morph diversity in the common wall lizard
Data files
Mar 26, 2024 version files 10.80 GB
- bioclim.zip
- bottleneck_output.csv
- coordinates.csv
- gen_diversities.csv
- matrices.zip
- Migration.csv
- morph_composition.csv
- PCA_Env_loadings.csv
- pop_names.csv
- Pyrenees.tif
- rad1_ldpruned.dat
- rad1_ldpruned.GT.FORMAT
- rad1_ldpruned.INFO
- rad1_ldpruned.recode.vcf
- README.md
Abstract
The maintenance of polymorphisms often depends on multiple selective forces, but less is known about the role of stochastic or historical processes in maintaining variation. The common wall lizard (Podarcis muralis) is a colour polymorphic species in which local colour morph frequencies are thought to be modulated by natural and sexual selection. Here, we used genome-wide single-nucleotide polymorphism data to investigate the relationships between morph composition and population biogeography at a regional scale, by comparing morph composition with patterns of genetic variation of 54 populations sampled across the Pyrenees. We found that genetic divergence was explained by geographic distance but not by environmental features. Differences in morph composition were associated with genetic and environmental differentiation, as well as differences in sex ratio. Thus, variation in colour morph frequencies could have arisen via historical events and/or differences in the permeability to gene flow, possibly shaped by the complex topography and environment. In agreement with this hypothesis, colour morph diversity was positively correlated with genetic diversity and rates of gene flow, and inversely correlated with the likelihood of bottlenecks. Concurrently, we did not find conclusive evidence for selection in the two colour loci. As an illustration of these effects, we observed that populations with higher proportions of the rarer yellow and yellow-orange morphs had higher genetic diversity. Our results suggest that processes involving a decay in overall genetic diversity, such as reduced gene flow and/or bottleneck events, have an important role in shaping population-specific morph composition via non-selective processes.
README: The role of historical biogeography in shaping colour morph diversity in the common wall lizard
https://doi.org/10.5061/dryad.4xgxd25j0
Description of the data and file structure
The dataset associated with this publication includes the following files:
"R_scripts_historical_biogeography.zip" (archived in Zenodo: https://doi.org/10.5281/zenodo.10810050). This file includes six R scripts required to reproduce the results of this study.
Scripts:
1_Get_genetic_diversity_indices.R: This file includes the code to calculate the genetic diversity indices.
2_Obtain_distance_matrices.R: This file contains the code to produce the FST, topographic and environmental distance matrices to be employed in further analyses. It also includes the code to obtain the distance matrices for sex ratio and morph composition indices.
3_MMRRs.R: Contains the code to carry out the multiple matrix regression with randomization (MMRR) analysis to test the effect of topographic, environmental and sex ratio distances on the FST. It also includes the code to test the effect of genetic, environmental, topographic and sex ratio distances on morph composition indices.
4_Diversity_correlations.R: Calculates and corrects the p-values of the correlations between genetic diversity and morph composition indices (Shannon index, Richness and Evenness index). It also includes the code to explore how the proportion of each colour morph varies across the Pyrenees.
5_Effect_of_migration_and_bottlenecks.R: Tests the effect of gene flow and recent bottleneck events on genetic diversity.
6_Tests_of_selection.R: Calculates a null distribution of FST values between populations. Then, genotypic frequencies are obtained from the phenotypic frequencies (colour morph proportions) in order to calculate a simulated FST for the orange and yellow colour loci. These data are then employed to test whether the orange and yellow loci are under selection.
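Script 3 relies on multiple matrix regression with randomization (MMRR). As a rough illustration of the permutation logic only, the sketch below is a single-predictor simplification written in Python, not the authors' R code and not a full MMRR. All function names are hypothetical. It unfolds the lower triangle of two distance matrices, computes their Pearson correlation, and builds a null distribution by shuffling the rows and columns of the response matrix together.

```python
import random


def lower_triangle(matrix):
    """Unfold the strictly lower triangle of a square matrix into a list."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permute_matrix(matrix, order):
    """Reorder rows and columns with the same permutation of populations."""
    return [[matrix[i][j] for j in order] for i in order]


def matrix_permutation_test(response, predictor, n_perm=999, seed=1):
    """Return (observed r, two-sided permutation p-value).

    Assumption: one predictor matrix only; MMRR proper fits several
    predictors at once with a multiple regression.
    """
    rng = random.Random(seed)
    x = lower_triangle(predictor)
    observed = pearson(lower_triangle(response), x)
    n = len(response)
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        r = pearson(lower_triangle(permute_matrix(response, order)), x)
        if abs(r) >= abs(observed):
            hits += 1
    # Count the observed arrangement as one member of the null distribution.
    return observed, (hits + 1) / (n_perm + 1)
```

Shuffling whole populations, rather than individual matrix cells, keeps the dependence among pairwise distances intact, which is the reason this family of tests permutes rows and columns jointly.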
"matrices.zip": Contains the matrices already calculated using the scripts described above.
"bioclim.zip": Includes the dataset with the bioclimatic data.
pop_names.csv: File with population names.
coordinates.csv: Coordinates of each of the sampled localities.
gen_diversities.csv: Genetic diversities of each locality.
morph_composition.csv: Contains the proportion of each colour morph, sex ratio, as well as indices of morph diversity. This dataset includes the following variables: Pop (population), lat and lon (latitude and longitude), sex ratio (SR), W, WO, O, Y and YO (number of white, white-orange, orange, yellow and yellow-orange individuals), followed by the same variables for males only (Wm, WOm, Om, Ym and YOm) and for females only (Wf, WOf, Of, Yf and YOf). The following columns correspond to the overall proportion of each colour morph (propW, propWO, propO, propY, propYO), the same proportions for males (propWm, propWOm, propOm, propYm, propYOm) and for females (propWf, propWOf, propOf, propYf, propYOf). These are followed by the richness (i.e., number of colour morphs) for the whole sample (Riq), only for males (Riqm) and only for females (Riqf), and the same for the Shannon index (Sha, Sham and Shaf) and the Evenness index (Even, Evenm and Evenf).
PCA_Env_loadings.csv: PCA loadings from the bioclimatic variables, employed to calculate the environmental matrix.
Migration.csv: Information on the levels of gene flow between populations obtained via EEMS.
bottleneck_output.csv: Contains the p-values indicating whether populations have undergone recent bottleneck events (calculated using the software "BOTTLENECK").
Pyrenees.tif: Digital Elevation Model (DEM) raster required to calculate the topographic distances.
rad1_ldpruned.dat: FSTAT file used for BOTTLENECK software and for the tests of selection.
rad1_ldpruned.GT.FORMAT: File containing genotype information to obtain genetic diversity indices.
rad1_ldpruned.INFO: File containing allelic frequencies employed for the test of selection.
rad1_ldpruned.recode.vcf: VCF file required to calculate FST matrix.
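The morph diversity columns in morph_composition.csv (proportions, Riq, Sha, Even) can be related to the raw counts (W, WO, O, Y, YO) with a short calculation. The Python sketch below is illustrative and is not taken from the authors' R scripts. It assumes the Shannon index uses natural logarithms and that evenness is Pielou's J (Shannon index divided by the log of richness); neither choice is stated in this README, and the function name is hypothetical.

```python
import math

# Morph count columns as named in morph_composition.csv.
MORPHS = ("W", "WO", "O", "Y", "YO")


def morph_indices(counts):
    """Return (proportions, richness, shannon, evenness) for one population.

    `counts` maps morph codes to numbers of individuals.
    Assumptions: natural-log Shannon index and Pielou's evenness.
    """
    total = sum(counts.get(m, 0) for m in MORPHS)
    if total == 0:
        raise ValueError("population has no scored individuals")
    props = {m: counts.get(m, 0) / total for m in MORPHS}
    # Richness: number of morphs observed at least once.
    richness = sum(1 for p in props.values() if p > 0)
    # Shannon index: morphs with zero frequency contribute nothing.
    shannon = sum(-p * math.log(p) for p in props.values() if p > 0)
    # Evenness is undefined for a single morph; report 0.0 in that case.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return props, richness, shannon, evenness
```

Rows read with `csv.DictReader` would need their count fields converted to integers before being passed to `morph_indices`. Small differences from the values stored in the file are possible if the authors used a different logarithm base or evenness definition.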
To run the scripts, all data files and scripts should be within the same folder. The files matrices.zip and bioclim.zip can be regenerated with the scripts provided, but because some of the scripts that produce these data take some time to run, the already calculated matrices.zip and bioclim.zip outputs are also provided. To use them without recalculating them, please decompress them and move them to the folder containing the rest of the data and scripts.
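The decompression step can be done with any archive tool. As one possible approach, the Python sketch below extracts both archives directly into the working folder. The function name is hypothetical, and the internal layout of the archives is not described in this README: if they contain a top-level directory, the extracted files will still need to be moved up next to the scripts.

```python
import zipfile
from pathlib import Path


def unpack_archives(folder, names=("matrices.zip", "bioclim.zip")):
    """Extract the named archives into `folder` and list what was extracted.

    Archives that are not present in `folder` are skipped silently.
    """
    folder = Path(folder)
    extracted = []
    for name in names:
        archive = folder / name
        if not archive.exists():
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(folder)
            extracted.extend(zf.namelist())
    return extracted
```

Calling `unpack_archives(".")` from the folder that holds the data files and scripts returns the member names of each archive, which makes it easy to check whether anything landed in a subfolder.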
Samples from Angostrina (sample codes: DB30356, DB30362, DB30375, DB30378, DB30384, DB30392, DB30395, DB30403, DB304010 and DB30472) were produced in a previous study (Aguilar et al., 2022; https://doi.org/10.1111/jeb.13990) and can be downloaded from NCBI (Accession number: PRJNA741485). The rest of the samples were generated in this study and are available at NCBI (Accession number: PRJNA1083934).
Methods
See the associated publication for details.