Title: The efficacy of selection may increase or decrease with selfing depending upon the recombination environment
Authors: Shelley A. Sianta, Stephan Peischl, David A. Moeller, Yaniv Brandvain
Contact: ssianta@umn.edu
Biorxiv doi: https://doi.org/10.1101/2021.05.20.445016

****************************************************
Goal of study:

This study investigates how selfing rate, recombination, and mutation interact in a multilocus genome to affect the accumulation of deleterious mutations in populations. We used individual-based forward simulations in SLiM. Populations consisted of 10,000 diploid individuals, each of which had a 45 Mb genome split into six chromosomes.

Genomes experienced four "types" of mutations: (1) fully to partially recessive (h = 0, 0.1, or 0.25) mutations with moderate to severe fitness effects (s = 0.015, 0.3, 0.9), and (2-4) three additive mutation types (h = 0.5) with fitness effects that roughly span the nearly neutral boundary (s = 0.0005, s = 0.00025, s = 0.00005). We also conducted a set of simulations with just the three additive mutation types to assess the effect that recessive variation has on the efficacy of selection against additive mutations. We also varied the mutation rate and the relative recombination rate (i.e., the per-base-pair recombination rate divided by the per-base-pair mutation rate).

We post-processed raw SLiM outputs in R with the "Calculate_HomHetLoad.R" and "Run_Calc_HomHetLoad.R" scripts to generate our main data file, "HomHetLoad.csv". The allele frequency spectra (afs_unfolded.csv) and neutral diversity (pi.csv) were calculated from SLiM tree sequences using msprime (https://tskit.dev/msprime/docs/latest/intro.html) and tskit (https://tskit.dev/tskit/docs/stable/introduction.html#).
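The relative recombination rate defined above is a plain ratio of two per-base-pair rates. The following is a minimal sketch of that relationship in Python. The function names and the numeric values are illustrative assumptions and are not taken from the study's SLiM scripts.

```python
import math


def recombination_rate_per_bp(mutation_rate_per_bp, rrr):
    """Per-base-pair recombination rate implied by a relative recombination rate (RRR)."""
    return rrr * mutation_rate_per_bp


def relative_recombination_rate(recomb_rate_per_bp, mutation_rate_per_bp):
    """RRR = per-base-pair recombination rate / per-base-pair mutation rate."""
    return recomb_rate_per_bp / mutation_rate_per_bp


# Illustrative values only (ASSUMPTION: not the parameter values used in the study).
mu = 1e-8
r = recombination_rate_per_bp(mu, rrr=0.1)

# Converting back recovers the RRR we started from.
assert math.isclose(relative_recombination_rate(r, mu), 0.1)
```

Holding the RRR fixed while changing the mutation rate therefore rescales the recombination rate by the same factor.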
****************************************************
****************************************************
Data Files:
****************************************************
****************************************************
(numbers correspond to column descriptions)

*****************************
+ HomHetLoad.csv - summary of the prevalence and fitness effects of mutations in each simulation. Each row is the output from one simulation.
*****************************
Note: mutation types are labelled as m1, m2, m3 and m4:
m1: h = c(0, 0.1, 0.25), s = c(0.015, 0.3, 0.9)
m2: h = 0.5, s = 0.0005
m3: h = 0.5, s = 0.00025
m4: h = 0.5, s = 0.00005

(1) pop.size - size of the population in the simulation
(2) selfing.rate - selfing rate of the population in the simulation
(3) sim.type - whether the simulation has both recessive and additive mutations or additive mutations only
(4) U.per.type - the genome-wide mutation rate per mutation type
(5) RRR - the relative recombination rate
(6) m1.s - selection coefficient of the recessive mutation type (m1)
(7) m1.h - dominance coefficient of the recessive mutation type (m1)
(8) rep - replicate run of the simulation
(9) m1_het.fitness - multiplicative fitness at loci heterozygous for m1 mutations
(10) m2_het.fitness - multiplicative fitness at loci heterozygous for m2 mutations
(11) m3_het.fitness - multiplicative fitness at loci heterozygous for m3 mutations
(12) m4_het.fitness - multiplicative fitness at loci heterozygous for m4 mutations
(13) m1_no.het.muts - number of loci that are heterozygous for m1 mutations
(14) m2_no.het.muts - number of loci that are heterozygous for m2 mutations
(15) m3_no.het.muts - number of loci that are heterozygous for m3 mutations
(16) m4_no.het.muts - number of loci that are heterozygous for m4 mutations
(17) m1_hom.fitness - multiplicative fitness at loci homozygous for m1 mutations
(18) m2_hom.fitness - multiplicative fitness at loci homozygous for m2 mutations
(19) m3_hom.fitness - multiplicative fitness at loci homozygous for m3 mutations
(20) m4_hom.fitness - multiplicative fitness at loci homozygous for m4 mutations
(21) m1_no.hom.muts - number of loci that are homozygous for m1 mutations
(22) m2_no.hom.muts - number of loci that are homozygous for m2 mutations
(23) m3_no.hom.muts - number of loci that are homozygous for m3 mutations
(24) m4_no.hom.muts - number of loci that are homozygous for m4 mutations
(25) m1_homAndFixed.fitness - multiplicative fitness at loci homozygous for m1 mutations, including fixed mutations
(26) m2_homAndFixed.fitness - multiplicative fitness at loci homozygous for m2 mutations, including fixed mutations
(27) m3_homAndFixed.fitness - multiplicative fitness at loci homozygous for m3 mutations, including fixed mutations
(28) m4_homAndFixed.fitness - multiplicative fitness at loci homozygous for m4 mutations, including fixed mutations
(29) m1_no.homAndFixed.muts - number of loci that are homozygous for m1 mutations, including fixed mutations
(30) m2_no.homAndFixed.muts - number of loci that are homozygous for m2 mutations, including fixed mutations
(31) m3_no.homAndFixed.muts - number of loci that are homozygous for m3 mutations, including fixed mutations
(32) m4_no.homAndFixed.muts - number of loci that are homozygous for m4 mutations, including fixed mutations
(33) total.fitness.het.hom - multiplicative fitness across all loci
(34) total.fitness.het.homAndFixed - multiplicative fitness across all loci, including fixed mutations
(35) m1_prev.het.hom - average number of m1 mutations per individual
(36) m2_prev.het.hom - average number of m2 mutations per individual
(37) m3_prev.het.hom - average number of m3 mutations per individual
(38) m4_prev.het.hom - average number of m4 mutations per individual
(39) m1_prev.het.homAndFixed - average number of m1 mutations per individual, including fixed mutations
(40) m2_prev.het.homAndFixed - average number of m2 mutations per individual, including fixed mutations
(41) m3_prev.het.homAndFixed - average number of m3 mutations per individual, including fixed mutations
(42) m4_prev.het.homAndFixed - average number of m4 mutations per individual, including fixed mutations
(43) function.time.taken.min - time, in minutes, taken to process the raw output in R
(44) m1_n.muts.poly - number of independent m1 mutations that are segregating in the population
(45) m1_avg.freq.poly - average frequency of independent m1 mutations that are segregating in the population
(46) m1_n.muts.fixed - number of independent m1 mutations that are fixed in the population
(47) m2_n.muts.poly - number of independent m2 mutations that are segregating in the population
(48) m2_avg.freq.poly - average frequency of independent m2 mutations that are segregating in the population
(49) m2_n.muts.fixed - number of independent m2 mutations that are fixed in the population
(50) m3_n.muts.poly - number of independent m3 mutations that are segregating in the population
(51) m3_avg.freq.poly - average frequency of independent m3 mutations that are segregating in the population
(52) m3_n.muts.fixed - number of independent m3 mutations that are fixed in the population
(53) m4_n.muts.poly - number of independent m4 mutations that are segregating in the population
(54) m4_avg.freq.poly - average frequency of independent m4 mutations that are segregating in the population
(55) m4_n.muts.fixed - number of independent m4 mutations that are fixed in the population

*****************************
+ pi.csv
*****************************
Each row is the value of pi from one simulation.

(1) file - the file name, which contains the simulation parameter values
(2) pi - value of pi for that simulation

*****************************
+ afs_unfolded.csv
*****************************
Each row is the unfolded allele frequency spectrum from one simulation.
The R script SelfingNe_ms_final.R assigns the column names below to this data frame.

(1) file (V1 in csv) - the file name, which contains the simulation parameter values
(2) toss (V2 in csv) - zeroth entry of the AFS, which counts alleles or branches not seen in the samples but that are polymorphic among the rest of the samples of the tree sequence
(3-202) X1 - X200 (V3-V202 in csv) - counts of mutations that fall in different allele frequency classes, from singletons (X1) to mutations that are fixed (X200)

*****************************
+ Haplotypes/Full_outputs/
*****************************
These four files are the raw SLiM outputs used to make the haplotype clustering diagrams in "SelfingNe_ms_final.R".

****************************************************
****************************************************
Scripts
****************************************************
****************************************************
+ SelfingNe_ms_final.R: reads the data files and produces all figures and statistics in the manuscript
+ Calculate_HomHetLoad.R: R function that processes raw SLiM simulation output
+ Run_Calc_HomHetLoad.R: R script that runs "Calculate_HomHetLoad.R" across all simulations
+ SLiM_scripts: all SLiM scripts, including random seeds, used to run the simulations in this study
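
The column layout described for afs_unfolded.csv (V1 = file, V2 = toss, V3-V202 = X1-X200) can be reproduced outside of R. The sketch below uses only the Python standard library. It assumes the CSV has a single header row and no extra row-name column, which should be checked against the actual file. The file name and counts in the demo are made up for illustration.

```python
import csv
import io

N_CLASSES = 200  # X1 (singletons) through X200 (fixed), as listed above


def afs_column_names(n_classes=N_CLASSES):
    """Column names in the order given above: file, toss, X1..Xn."""
    return ["file", "toss"] + [f"X{i}" for i in range(1, n_classes + 1)]


def relabel_row(values, n_classes=N_CLASSES):
    """Map one raw CSV row (V1, V2, ...) onto the descriptive column names."""
    names = afs_column_names(n_classes)
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} columns, got {len(values)}")
    row = {"file": values[0]}
    for name, value in zip(names[1:], values[1:]):
        row[name] = float(value)  # AFS entries are counts
    return row


# Tiny in-memory stand-in with 3 frequency classes (HYPOTHETICAL data;
# the real file has 200 classes, i.e. 202 columns).
demo = "V1,V2,V3,V4,V5\nsim_a.trees,0,12,5,1\n"
reader = csv.reader(io.StringIO(demo))
next(reader)  # skip the V1..V5 header row (ASSUMED to be present)
rows = [relabel_row(r, n_classes=3) for r in reader]
```

To read the real file, replace the `io.StringIO(demo)` stand-in with an opened afs_unfolded.csv and drop the `n_classes=3` override.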