Data from: Sequence entropy of folding and the absolute rate of amino acid substitutions

Cite this dataset

Goldstein, Richard A.; Pollock, David D. (2018). Data from: Sequence entropy of folding and the absolute rate of amino acid substitutions [Dataset]. Dryad.


Adequate representations of protein evolution should consider how the acceptance of mutations depends on the sequence context in which they arise. However, epistatic interactions among sites in a protein result in heterogeneities in the substitution rate, both temporal and spatial, that are beyond the capabilities of current models. Here we use parallels between amino acid substitutions and chemical reaction kinetics to develop an improved theory of protein evolution. We constructed a mechanistic framework for modelling amino acid substitution rates that uses the formalisms of statistical mechanics, with principles of population genetics underlying the analysis. Theoretical analyses and computer simulations of proteins under purifying selection for thermodynamic stability show that substitution rates and the stabilization of resident amino acids (the ‘evolutionary Stokes shift’) can be predicted from biophysics and the effect of sequence entropy alone. Furthermore, we demonstrate that substitutions predominantly occur when epistatic interactions result in near neutrality; substitution rates are determined by how often epistasis results in such nearly neutral conditions. This theory provides a general framework for modelling protein sequence change under purifying selection, potentially explains patterns of convergence and mutation rates in real proteins that are incompatible with previous models, and provides a better null model for the detection of adaptive changes.

Usage notes