Data for: Fast sampling of protein conformational dynamics
Data files
Mar 24, 2026 version files 1.35 GB
- Figure-1_S5_FRESEAN-mode-correlations.tar.gz (5.76 MB)
- Figure-2_visualize-modes7_8.tar.gz (1.58 MB)
- Figure-3A_visualize-geometric-CVs.tar.gz (272.15 KB)
- Figure-3B_closed-open.tar.gz (511.74 KB)
- Figure-4_boxplots.tar.gz (165.47 MB)
- Figure-5_S14_average-free-energy-surfaces.tar.gz (255.49 MB)
- Figure-6_average-geometric-free-energy-surfaces.tar.gz (255.41 MB)
- Figure-S10_boxplots-extended.tar.gz (161.14 MB)
- Figure-S11_S12_replica-free-energy-surfaces.tar.gz (63.78 MB)
- Figure-S13_min-free-energy-path.tar.gz (49.61 MB)
- Figure-S15_KRAS-unfolding.tar.gz (50.28 MB)
- Figure-S2_VDoS_aa-vs-cg.tar.gz (33.97 MB)
- Figure-S4_VDoS-1D.tar.gz (300.61 MB)
- Figure-S6_S7_quasi-harmonic-mode-correlations.tar.gz (3.86 MB)
- Figure-S8_S9_principal-component-mode-correlations.tar.gz (596.02 KB)
- README.md (2.06 KB)
- Table-1_free-energy-information.tar.gz (46.74 KB)
Abstract
Protein function often depends on dynamic transitions between conformations rather than just static structures. However, our current ability to characterize or predict such dynamics lags behind recent advances in protein structure prediction. Enhanced sampling methods can speed up molecular dynamics simulations to study protein conformational transitions, but require prior knowledge of key collective motions involved. Here, we demonstrate for a series of proteins of varying complexity that the required information is encoded in anharmonic low-frequency vibrations. Using recently developed methods, we show that this information can be easily extracted from short dynamics simulations without requiring prior knowledge. Combined with enhanced sampling, we correctly predict conformational transitions in all test proteins and generate highly reproducible free energy landscapes. This allows for the rapid generation of accurate protein conformational ensembles, which is critical to unravel the complex relationship between protein sequence, structure, and dynamics.
This repository contains data files generated using classical molecular dynamics and metadynamics simulations of five proteins:
- HEWL: hen egg-white lysozyme
- HIV-1_Pr: HIV-1 Protease
- MCL-1: myeloid cell leukemia 1
- RBP: ribose binding protein
- KRAS: Kirsten rat sarcoma virus
Description of the data and file structure
Each sub-directory contains all data and scripts required to generate a specific figure from the accompanying manuscript. Each sub-directory also contains its own README.md file and a single Mathematica (version 12.0) notebook. The README.md file describes the data structure and the nature of the data files, and explains the operations performed by the notebook, which processes the data files and generates the manuscript figures. In some cases, additional supplementary files are provided as context for the data (as described in the corresponding README.md files).
Each sub-directory has been converted into a compressed tarball file (tar.gz). On a UNIX command line, each directory can be unpacked using the tar -xvzf filename.tar.gz command.
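For users who prefer to unpack all archives in one step, or who are not on a UNIX command line, the following is a minimal sketch using Python's standard tarfile module. The function name unpack_all and the directory arguments are illustrative assumptions and are not part of this repository.

```python
import tarfile
from pathlib import Path


def unpack_all(archive_dir, dest_dir):
    """Extract every *.tar.gz found in archive_dir into dest_dir.

    Returns the names of the archives that were unpacked, in sorted order.
    (Hypothetical helper; not shipped with the data set.)
    """
    archive_dir, dest_dir = Path(archive_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    unpacked = []
    for archive in sorted(archive_dir.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            # Use the safer "data" extraction filter where this Python supports it.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(dest_dir, filter="data")
            else:
                tar.extractall(dest_dir)
        unpacked.append(archive.name)
    return unpacked
```

Calling unpack_all("downloads", "unpacked"), with both directory names chosen by the user, should leave one sub-directory per archive under the destination, equivalent to running the tar command above on each file.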
The following files are present in this repository:
- Figure-1_S5_FRESEAN-mode-correlations.tar.gz
- Figure-2_visualize-modes7_8.tar.gz
- Figure-3A_visualize-geometric-CVs.tar.gz
- Figure-3B_closed-open.tar.gz
- Figure-4_boxplots.tar.gz
- Figure-5_S14_average-free-energy-surfaces.tar.gz
- Figure-6_average-geometric-free-energy-surfaces.tar.gz
- Figure-S2_VDoS_aa-vs-cg.tar.gz (only data for HEWL)
- Figure-S4_VDoS-1D.tar.gz
- Figure-S6_S7_quasi-harmonic-mode-correlations.tar.gz
- Figure-S8_S9_principal-component-mode-correlations.tar.gz (only data for HEWL)
- Figure-S10_boxplots-extended.tar.gz
- Figure-S11_S12_replica-free-energy-surfaces.tar.gz
- Figure-S13_min-free-energy-path.tar.gz (only data for HEWL)
- Figure-S15_KRAS-unfolding.tar.gz
- Table-1_free-energy-information.tar.gz
The identity of the corresponding figures in the accompanying manuscript can be determined from the file names.
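As an illustration of how the file names map to manuscript items, the sketch below extracts the figure or table labels from an archive name. The naming pattern it assumes (a "Figure" or "Table" prefix, a hyphen, then one or more underscore-separated labels such as 1, 3A, or S5, followed by a description) is inferred from the file list above and is not a documented convention; the function name figure_labels is hypothetical.

```python
import re

# A label is an optional "S" (supplementary), digits, and an optional panel letter.
LABEL = re.compile(r"S?\d+[A-Z]?")


def figure_labels(archive_name):
    """Return the manuscript items an archive corresponds to, e.g. ['Figure 1', 'Figure S5'].

    Assumes the inferred pattern <Kind>-<label>[_<label>...]_<description>.tar.gz.
    """
    stem = archive_name
    if stem.endswith(".tar.gz"):
        stem = stem[: -len(".tar.gz")]
    kind, _, rest = stem.partition("-")
    labels = []
    for token in rest.split("_"):
        # Stop at the first token that is not a label; the rest is the description.
        if not LABEL.fullmatch(token):
            break
        labels.append(f"{kind} {token}")
    return labels
```

For example, figure_labels("Figure-5_S14_average-free-energy-surfaces.tar.gz") yields Figure 5 and Figure S14, while names that do not follow the assumed pattern (such as README.md) yield an empty list.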
