Main model fits and substitution rate predictions for: A quantitative genetic model of background selection in humans

Published Jan 29, 2024 on Dryad. https://doi.org/10.5061/dryad.qnk98sfnv

Abstract

Across the human genome, there are large-scale fluctuations in genetic diversity caused by the indirect effects of selection. This can be thought of as a “linked selection signal” that reflects the impact of selection varying according to the placement of functional regions and recombination rates along the genome. Previous work has shown that negative selection against the steady influx of new deleterious mutations into conserved regions is the predominant mode of selection in humans. However, the theoretical model that underpins these results, classic Background Selection theory, is only applicable when new mutations are so deleterious that they cannot fix in the population. Here, we develop a statistical method based on a quantitative genetics view of linked selection, which models the effects of weak draft created according to how polygenic additive fitness variance is distributed along the genome. Using a recent model that jointly predicts the equilibrium fitness variance and substitution rates due to both strongly and weakly deleterious mutations, we estimate the distribution of fitness effects (DFE) and mutation rate across three human populations. While our model can accommodate weaker selection, we initially find evidence across three human populations of very strong selection against deleterious mutations, consistent with previous work. However, the corresponding predicted substitution rates for conserved regions are unreasonably low and disagree with observed rates. We hypothesize this could be due to selected sites experiencing a further diminished population size due to selective interference. When we account for this in our method, we find evidence of weakly deleterious mutations in conserved regions, which brings the predicted substitution rate into agreement with observations. However, these models lead to implausibly large mutation rate estimates.
Overall, while our model of the genomic linked selection signal brings us a step towards uniting population and quantitative genetic selection models with the substitution process, our work suggests considerable uncertainty remains about the processes generating fitness variance in humans.

Data files

Abstract

Usage Notes

Files

Model Fits

Files Produced by Sims

Files Produced by Notebooks

`main_fits.ipynb`

`diversity_data.ipynb`

`region_simulations.ipynb`

`substitution.ipynb`

`method_evaluation.ipynb`

Predicted B' maps for the CADD 6% model

Methods

`main_fits.ipynb`

`diversity_data.ipynb`

`region_simulations.ipynb`

`substitution.ipynb`

`method_evaluation.ipynb`