Data from: Modeling multilocus selection in an individual-based, spatially-explicit landscape genetics framework

Data files

Nov 26, 2019 version files 543.36 MB

1LocusModel_v2Rescaled_Global_WFTriangleAA_1540583012_Figure4A.zip (9.06 MB)
1LocusModel_v2Rescaled_Global_WFTriangleaa_1540583012_Figure4B.zip (9.14 MB)
1LocusModel_v2Rescaled_Global_WFTriangleAa_1540583012Figure4C.zip (13.20 MB)
1LocusModel_WFValidation_mean4_rescaledGlobal_1540505908_Figure3A.zip (218.42 MB)
2LocusModel_WFValidation_mean4_rescaledGlobal_1540503873_Figure3B.zip (217.71 MB)
MLocusModel_SelectionDispersalLevels_Figure6.zip (5.97 MB)
MLocusModelX3_IBD1000_1540768392_Figure5.zip (69.85 MB)

Abstract

We implemented multilocus selection in a spatially-explicit, individual-based framework that enables multivariate environmental gradients to drive selection in many loci as a new module for the landscape genetics programs, CDPOP and CDMetaPOP. Our module simulates multilocus selection using a linear additive model, providing a flexible platform to evaluate a wide range of genotype-environment associations. Importantly, the module allows simulation of selection in any number of loci under the influence of any number of environmental variables. We validated the module with individual-based selection simulations under Wright-Fisher assumptions (Figure 3 and data provided here). We then evaluated results for simulations under a simple landscape selection model (Figure 4 and data provided here). Next, we simulated individual-based multilocus selection across a complex selection landscape with three loci linked to three different environmental variables (Figure 5 and data provided here). Finally, we demonstrated how the program can be used to simulate multilocus selection under varying selection strengths across different levels of gene flow in a landscape genetics framework (Figure 6 and data provided here). This new module provides a valuable addition to the study of landscape genetics, allowing for explicit evaluation of the contributions and interactions between gene flow and selection-driven processes across complex, multivariate environmental and landscape conditions.

Simulated datasets are provided for Figures 3-6 of the manuscript.

Figure 3: Simulations for the single (A) and double (B) diallelic locus selection models. For the single diallelic locus selection model, we used F = X₁(b₁₁₁A₁₁₁ + b₁₁₂A₁₁₂) and set the average effects to b₁₁₁ = 10 and b₁₁₂ = -10. For the double diallelic locus selection model, we used F = X₁([b₁₁₁A₁₁₁ + b₁₁₂A₁₁₂] + [b₁₂₁A₁₂₁ + b₁₂₂A₁₂₂]) and set b₁₁₁ = 10, b₁₁₂ = -10, b₁₂₁ = 10, and b₁₂₂ = -10. X₁ was a uniform spatial selection surface with all values equal to 1. Each simulated dataset contains 50 replicates. For both the single and double diallelic locus selection model simulations, the Wright-Fisher model was assumed (i.e., random mating, sexual reproduction with both female and male with replacement, offspring randomly disperse until a constant population is reached that has an equal sex ratio, no mutation, and non-overlapping generations) with one exception: each mated pair produced 2 offspring to ensure a constant population size. We simulated individual genetic exchange across 100 non-overlapping generations among 1000 randomly spatially located individuals in a 1024 x 1024 gridded landscape for each selection model. All simulated populations contained an additional 50 diallelic neutral loci.
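As a rough illustration of the single-locus linear additive model described for Figure 3, the following Python sketch evaluates F for the three possible genotypes. It assumes that each A term is the count (0, 1, or 2) of the corresponding allele carried by an individual; that reading, and the function and variable names, are illustrative assumptions rather than the CDPOP implementation.

```python
import numpy as np

# Average effects from the single-locus description: b111 = 10, b112 = -10.
b = np.array([10.0, -10.0])

# Assumption: each A term is the number of copies (0, 1, or 2) of that allele.
genotypes = {
    "AA": np.array([2, 0]),
    "Aa": np.array([1, 1]),
    "aa": np.array([0, 2]),
}

def linear_predictor(x, allele_counts, effects):
    """Return F = x * sum_k(b_k * A_k) for one locus (hypothetical helper)."""
    return x * float(np.dot(effects, allele_counts))

X1 = 1.0  # uniform selection surface: every cell has the value 1
F = {name: linear_predictor(X1, counts, b) for name, counts in genotypes.items()}
```

Under these assumptions the two homozygotes receive values of equal magnitude and opposite sign, and the heterozygote receives 0. How F is translated into survival or mortality is not described here, so the sketch stops at the linear term.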

Figure 4: Three simulation datasets of a single spatially-variable selection landscape and a single diallelic locus. These datasets were produced with the same simulation parameters as the Figure 3 datasets, but with the uniform spatial selection surface replaced by a spatially-variable selection surface. Using a 1024 x 1024 gridded raster, we created a categorical landscape that included an upper triangle with values of 1, a lower triangle with values of -1, and diagonal cells with a value of 0. The three simulated datasets varied how the genotypes were initialized by starting the simulations with (i) only AA, (ii) only aa, and (iii) random assignment. Each dataset includes 50 replicates.
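A categorical surface of this kind can be sketched with NumPy as follows. The sketch assumes that "upper" and "lower" refer to the cells above and below the main diagonal of the array; the orientation used in the actual raster is not stated here, and the variable names are illustrative.

```python
import numpy as np

SIZE = 1024  # raster dimensions given in the description

ones = np.ones((SIZE, SIZE), dtype=np.int8)

# Cells above the main diagonal get 1, cells below get -1,
# and the diagonal itself stays 0 (assumed orientation).
landscape = np.triu(ones, k=1) - np.tril(ones, k=-1)
```

Multiplying this surface by the single-locus term means the sign of F flips across the diagonal, so opposite homozygotes are favored in the two triangles.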

Figure 5: Simulation dataset of a complex landscape and three loci. These data considered three environmental variables that affect fitness, as shown in Figure 1 of the manuscript, with three loci and two alleles per locus operating in the selection process. The first environmental variable was the categorical landscape described above for the Figure 4 datasets. The second environmental variable was a gradient landscape with continuous values ranging from 1 to -1 from north to south. The third environmental variable represented a fragmented landscape with equal proportions of values of 1 (e.g., favored habitat) and -1 (e.g., non-favored habitat), created in the program QRULE. For simplicity, we set b₁₁₁ = 10 and b₁₁₂ = -10 for the first locus and environmental variable, X₁; b₂₂₁ = 10 and b₂₂₂ = -10 for the second locus and environmental variable, X₂; and b₃₃₁ = 10 and b₃₃₂ = -10 for the third locus and environmental variable, X₃, where X₁, X₂, and X₃ are the diagonal, gradient, and habitat variables shown in Figure 1, respectively. The remaining betas were set to 0. Unlike the previous panmictic movement simulations, we restricted movement of the individuals in these simulations to follow an inverse-square probability function constrained to a maximum threshold of 25% of the entire landscape. We initialized all genotypes randomly at the start of the simulations. Three replicates are included.
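The restricted-movement rule can be illustrated with a small sketch of an inverse-square probability function with a distance cutoff. The exact functional form used by the simulation program is not given here, so this is only one plausible reading: weights proportional to 1/d² up to a maximum distance, with the 25% threshold interpreted as a fraction of the landscape side length. The function name, the example distances, and that interpretation are all assumptions.

```python
import numpy as np

def inverse_square_probs(distances, max_distance):
    """Normalize 1/d^2 weights over destinations within max_distance.

    Hypothetical helper: destinations at distance 0 or beyond the
    threshold receive probability 0.
    """
    d = np.asarray(distances, dtype=float)
    weights = np.zeros_like(d)
    reachable = (d > 0) & (d <= max_distance)
    weights[reachable] = 1.0 / d[reachable] ** 2
    total = weights.sum()
    return weights / total if total > 0 else weights

LANDSCAPE_SIZE = 1024
# Assumption: the 25% threshold is taken relative to the side length.
max_distance = 0.25 * LANDSCAPE_SIZE

# Three illustrative candidate destinations at distances 10, 20, and 300.
probs = inverse_square_probs([10.0, 20.0, 300.0], max_distance)
```

In this example the destination at distance 300 lies beyond the cutoff of 256 and is excluded, while the nearer of the two remaining destinations is four times as likely as the farther one.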

Figure 6: Multilocus selection simulations under three selection strengths (strong, moderate, and weak) and three dispersal scenarios (5%, 10%, and 15% of the landscape). These simulations include 1000 loci with two alleles per locus: 100 loci under selection in response to a single environmental variable and 900 neutral loci. We used a gradient landscape with continuous values ranging from 1 to -1 from north to south. We set the effect sizes of the first l = 1, 2, …, 20 loci to b₁ₗ₁ = 0.15 and b₁ₗ₂ = -0.15, of the following l = 21, 22, …, 50 loci to b₁ₗ₁ = 0.10 and b₁ₗ₂ = -0.10, and of the last l = 51, 52, …, 100 loci to b₁ₗ₁ = 0.05 and b₁ₗ₂ = -0.05 (reflecting “strong” [n=20], “moderate” [n=30], and “weak” [n=50] selection, respectively). We increased the population size to 5000 for these simulations within the same 1024 x 1024 simulation landscape used previously. We restricted movement of the individuals in these final simulations to follow an inverse-square probability function constrained to maximum thresholds of 5%, 10%, and 15% of the entire landscape. We initialized all genotypes randomly at the start of the simulations and ran the simulations for 200 generations, using the first 100 generations as a ‘burn-in’ period during which no selection was operating.
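The effect-size layout for the 100 selected loci can be written compactly as arrays. This sketch assumes the selected loci come first in locus order and that neutral loci can be represented by effect sizes of 0; both choices, and the variable names, are illustrative and are not taken from the simulation input files.

```python
import numpy as np

# Numbers of loci in each selection class, as given in the description.
n_strong, n_moderate, n_weak = 20, 30, 50
n_neutral = 900

# Effect sizes for the first allele at each selected locus.
b_l1 = np.repeat([0.15, 0.10, 0.05], [n_strong, n_moderate, n_weak])
# The second allele at each locus has the opposite effect.
b_l2 = -b_l1

# Assumption: neutral loci follow the selected loci and have zero effect.
effects_all = np.concatenate([b_l1, np.zeros(n_neutral)])
```

Indexing is zero-based in the arrays, so locus l in the text corresponds to position l - 1.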
