Folding-unfolding asymmetry and a RetroFold computational algorithm
Data files
Dec 19, 2022 version files 5.64 KB
- folding_data.csv
- README.md
- retrofold_data.data.csv
- stability_data.csv
- trp-cage.fasta
- unfolding_data.csv
Abstract
We treat protein folding as molecular self-assembly and unfolding as disassembly. Self-assembly and disassembly (fracture) are opposite non-equilibrium dynamic processes; however, one cannot be converted into the other by simply reversing the time variable. Fracture is typically much faster than self-assembly. Self-assembly is often an exponentially decaying process, since energy relaxes through dissipation, while fracture may proceed at a constant rate because the driving force is opposed by damping. Protein folding typically takes two orders of magnitude longer than unfolding, and modeling it consumes substantial computational resources. Based on energy dissipation rates, we suggest a mathematical transformation of variables that makes it possible to view self-assembly as time-reversed disassembly, so that folding can be studied as reversed unfolding. We investigate molecular dynamics modeling of the folding and unfolding of the short Trp-cage protein. Folding takes about 800 ns, while unfolding (denaturation) takes only about 5.0 ns and therefore requires fewer computational resources to simulate. This “RetroFold” approach can be used to design a novel computational algorithm which, while approximate, is less time-consuming than traditional folding algorithms.
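The contrast between an exponentially relaxing process and a constant-rate process can be illustrated with a toy model. The sketch below is not the transformation used in the RetroFold work, which is not specified here. It only assumes that folding progress follows 1 - exp(-t/tau) and that unfolding progress grows linearly over its total duration, then maps each unfolding time to the folding time at which the same state would be reached when the unfolding trajectory is read backwards. The function names and the value of `tau_ns` are hypothetical; only the 5 ns unfolding duration comes from the abstract.

```python
import math


def folding_progress(t_ns, tau_ns):
    """Fraction of an exponential relaxation completed after t_ns (assumed law)."""
    return 1.0 - math.exp(-t_ns / tau_ns)


def unfolding_progress(t_ns, total_ns):
    """Fraction of a constant-rate process completed after t_ns (assumed law)."""
    return min(max(t_ns / total_ns, 0.0), 1.0)


def unfold_time_to_fold_time(t_unfold_ns, total_ns, tau_ns):
    """Folding time that reaches the state seen at t_unfold_ns of unfolding.

    Reading unfolding in reverse, a state that is a fraction u unfolded
    corresponds to a folding progress of p = 1 - u.
    """
    p = 1.0 - unfolding_progress(t_unfold_ns, total_ns)
    if p >= 1.0:
        # The exponential law only reaches full completion asymptotically.
        return math.inf
    return -tau_ns * math.log(1.0 - p)


if __name__ == "__main__":
    total_ns = 5.0   # unfolding duration quoted in the abstract
    tau_ns = 200.0   # hypothetical relaxation time, for illustration only
    for t in (1.0, 2.5, 4.0, 5.0):
        mapped = unfold_time_to_fold_time(t, total_ns, tau_ns)
        print(f"unfolding t = {t:.1f} ns -> folding t = {mapped:.1f} ns")
```

In this toy model, equal steps in unfolding time map to unequal steps in folding time, which is why a plain sign change of the time variable does not turn one process into the other.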
Methods
We conducted molecular dynamics (MD) simulations of both folding and unfolding using the Trp-cage amino acid sequence, with an extended initial conformation built by the LEaP module of AMBER. The linear conformation of the protein was designed using the Avogadro software. The 3D molecular structure of Trp-cage (PDB ID: 1L2Y), determined by solution Nuclear Magnetic Resonance (NMR) as a set (n = 38) of stable conformations, was obtained from the RCSB Protein Data Bank. The MD simulations included the following phases: minimization (500 cycles), heating (50 ps), and equilibration (production) at 325 K (800 ns) for folding and at 473 K (5 ns) for unfolding, according to standard protocols published elsewhere. The simulations were fully unrestrained and carried out in the canonical ensemble using the SANDER module available for Linux/Unix. The Berendsen thermostat was used for temperature control, and the SHAKE algorithm was used to constrain the lengths of covalent bonds, including those involving hydrogen atoms. The ff99 force field was used because it had previously been employed for similar modeling. Solvation effects were incorporated using the Generalized Born model as implemented in AMBER. The Rosetta crystallographic refinement protocol was used to assess the conformational stability of the folded and unfolded protein conformations obtained from the MD simulations.
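The phases described above could be written as SANDER input files along the following lines. This is a minimal sketch, not the input used to produce the deposited data. The cycle count, durations, and temperatures come from the Methods text; the 2 fs time step, the starting temperature for heating, the specific Generalized Born variant (`igb = 1`), the infinite cutoff, and the SHAKE setting (`ntc = 2`, bonds involving hydrogen) are assumptions. The helper names `n_steps` and `mdin` are hypothetical.

```python
DT_PS = 0.002  # assumed 2 fs time step; not stated in the Methods


def n_steps(duration_ps, dt_ps=DT_PS):
    """Number of MD steps needed to cover duration_ps at the given time step."""
    return round(duration_ps / dt_ps)


def mdin(title, **cntrl):
    """Render a SANDER input file: a title line plus a &cntrl namelist."""
    body = ",\n".join(f"  {key} = {value}" for key, value in cntrl.items())
    return f"{title}\n &cntrl\n{body}\n /\n"


# Implicit solvent, no periodic box. igb = 1 and cut = 999.0 are assumed values.
COMMON = dict(ntb=0, igb=1, cut=999.0)

# ntt = 1 selects Berendsen weak coupling; ntc = 2 / ntf = 2 apply SHAKE to
# bonds involving hydrogen (assumed reading of the Methods).
DYNAMICS = dict(imin=0, dt=DT_PS, ntt=1, ntc=2, ntf=2, **COMMON)

minimize = mdin("Minimization, 500 cycles", imin=1, maxcyc=500, **COMMON)
heat = mdin("Heating, 50 ps", nstlim=n_steps(50), tempi=0.0, temp0=325.0, **DYNAMICS)
fold = mdin("Folding, 800 ns at 325 K", nstlim=n_steps(800_000), temp0=325.0, **DYNAMICS)
unfold = mdin("Unfolding, 5 ns at 473 K", nstlim=n_steps(5_000), temp0=473.0, **DYNAMICS)

if __name__ == "__main__":
    print(unfold)
```

With the assumed time step, the folding run needs 400,000,000 steps and the unfolding run 2,500,000, which matches the 160-fold difference in simulated time between the two runs.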
Usage notes
AMBER software