Data from: Bioinspired design rules for flipping across the lipid bilayer from systematic simulations of membrane protein segments
Data files
Jul 24, 2025 version files: 195.35 GB
- data.tar.gz (195.35 GB)
- README.md (8.84 KB)
Abstract
The orientation of integral membrane proteins (IMPs) with respect to the membrane is established during protein synthesis and insertion into the membrane. After synthesis, IMP orientation is thought to be fixed due to the thermodynamic barrier for “flipping” protein loops or helices across the hydrophobic core of the membrane in a process analogous to lipid flip-flop. A notable exception is EmrE, a homodimeric IMP with an N-terminal transmembrane helix that can flip across the membrane until flipping is arrested upon dimerization. Understanding the features of the EmrE sequence that permit this unusual flipping behavior would be valuable for guiding the design of synthetic materials capable of translocating or flipping charged groups across lipid membranes. To elucidate the molecular mechanisms underlying flipping in EmrE and derive bioinspired design rules, we employ atomistic molecular dynamics simulations and enhanced sampling techniques to systematically investigate the flipping of truncated segments of EmrE. Our results demonstrate that a membrane-exposed charged glutamate residue at the center of the N-terminal helix lowers the energetic barrier for flipping (from ~12.1 kcal mol⁻¹ to ~5.4 kcal mol⁻¹) by stabilizing water defects and minimizing membrane perturbation. Comparative analysis reveals that the marginal hydrophobicity of this helix, rather than the marginal hydrophilicity of its loop, is the key determinant of flipping propensity. Our results further indicate that interhelical hydrogen bonding upon dimerization inhibits flipping. These findings establish several bioinspired design principles to govern flipping in related materials: (1) marginally hydrophobic helices with membrane-exposed charged groups promote flipping, (2) modulating protonation states of membrane-exposed groups tunes flipping efficiency, and (3) interhelical hydrogen bonding can be leveraged to arrest flipping.
These insights provide a foundation for engineering synthetic peptides, engineered proteins, and biomimetic nanomaterials with controlled flipping or translocation behavior for applications in intracellular drug delivery and membrane protein design.
Author: ByungUk Park
Date: 7/22/2025
The tarball data.tar.gz contains all data from the simulations of truncated peptide systems introduced in this work (https://doi.org/10.1039/D5ME00032G), together with the scripts used to analyze them. The first six folders ('1.Nloop', '2.Npep_E14charge', '3.Cpep', '4.Npep_E14neutral', '5.monomer_E14charge', '6.Hbond_analysis') contain input and output files for the simulations of the different peptide systems (e.g., 'index.ndx', '.mdp', '.top', '.tpr', '.xtc'). Each of these folders follows the same organization, which is explained in detail in the '1.Nloop' section. The remaining two folders ('misc', 'scripts') contain scripts used to analyze the simulation outputs. All simulations were run with Gromacs 2021.5 patched with PLUMED 2.8.
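Because the archive is about 195 GB, it can help to inspect its layout before extracting anything. The following is a minimal sketch that uses only Python's standard library. The function name `top_level_counts` and the `depth` parameter are illustrative and are not part of the dataset. The archive may also wrap the eight folders in an extra root directory, in which case `depth=2` would likely be the more useful setting.

```python
import tarfile
from collections import Counter


def top_level_counts(archive_path, depth=1):
    """Count archive members under each leading folder without extracting them.

    `depth` is the number of leading path components used as the key
    (hypothetical helper; not shipped with the dataset).
    """
    counts = Counter()
    # "r|gz" reads the gzip stream sequentially; nothing is written to disk.
    with tarfile.open(archive_path, mode="r|gz") as tar:
        for member in tar:
            # Drop empty and "." components so "./1.Nloop/x" and "1.Nloop/x" agree.
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if parts:
                counts["/".join(parts[:depth])] += 1
    return counts
```

Calling `top_level_counts("data.tar.gz")` returns a `Counter` keyed by folder name. The whole compressed stream still has to be read once, so expect this to take a while on an archive of this size.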
- Simulation-related files
- Some of them ('.gro', '.ndx', '.top', '.itp', '.mdp', and '.xvg') are plain-text files that can be opened with any text editor, while others ('.tpr', '.xtc') are binary files that require Gromacs to read or run.
- '.tpr': Simulation input (coordinates, parameters, etc.)
- '.ndx': List of indices of atom/residue groups
- '.top': System topology and parameters
- '.itp': Modular components of a full topology (parameters, atom types, bonds, angles, dihedrals, etc.)
- '.mdp': Simulation control parameters (simulation time, temperature, pressure, etc.)
- '.gro': System structure (coordinates, atom type, box size)
- '.xtc': Compressed trajectory
- '.xvg': Plot/graph data generated by Gromacs built-in functions
- Software & package prerequisites (tested versions in parentheses): Gromacs (2021.5), PLUMED (2.8), VMD (1.9), Python (3.8), mdtraj (1.9.7), numpy (1.23.5), pandas (2.0.3)
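The text formats listed above can be parsed without Gromacs. As a minimal sketch, assuming the usual '.xvg' layout in which lines starting with '#' are comments and lines starting with '@' are plotting directives, the hypothetical helper `read_xvg` below separates those header lines from the numeric columns. It is an illustration only and is not one of the scripts distributed in 'scripts'.

```python
def read_xvg(path):
    """Split a Gromacs .xvg text file into header lines and numeric rows.

    Returns (header, rows): `header` is a list of the '#' and '@' lines,
    `rows` is a list of lists of floats, one inner list per data line.
    """
    header, rows = [], []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line[0] in "#@":
                header.append(line)  # comment or plotting directive
                continue
            rows.append([float(value) for value in line.split()])
    return header, rows
```

If numpy is available, `numpy.loadtxt(path, comments=["#", "@"])` should load the same numeric columns as an array.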
1.Nloop
- This directory contains all simulation input and output files for the N-loop system, including '.gro', '.itp', '.mdp', and '.xtc' files. The folder is organized as follows:
- 1.sys_prep: contains initial structure files generated by CHARMM-GUI
- 2.smd: contains input and output files from steered molecular dynamics (SMD) in both the (1) forward (Flip) and (2) backward (Flop) pulling directions
- 3.reus: contains output files from replica exchange umbrella sampling (REUS) simulations
- 1.equilibration: output files for the equilibration step of REUS windows
- 2.production: output files for the production step of REUS windows
- 4.us: present only in the '1.Nloop' directory; contains simple umbrella sampling (US) outputs generated from SMD pulling trajectories (both Flip and Flop)
- 1.equilibration: output files for the equilibration step of US windows
- 2.production: output files for the production step of US windows
- mdp: parameter files ('.mdp') used to generate '.tpr' files for simulations
- 'index.ndx', 'topol.top': index and topology files used to set up and run US simulations
- output_files: contains post-simulation analyses
- reus: weighted histogram analysis method (WHAM) outputs ('.xvg') from REUS using multiple sets of equilibration and production times
- solvent_density: '.csv' files reporting time-averaged number densities of water and phosphorus (P) atoms across windows. Columns correspond to harmonic restraint positions along the z-axis; rows report densities at specific z-distances from the membrane midplane averaged over time and lateral dimensions.
- trajectory_last_confs_concat: contains a concatenated trajectory file ('trajectory.xtc') of the final frame from each REUS window.
- us: (only in '1.Nloop' directory) WHAM results for US windows from both pulling directions:
- 1.forward_pull(Flip): results from US initialized from forward pulling SMD
- 2.backward_pull(Flop): results from US initialized from backward pulling SMD
- toppar: forcefield and topology parameters used for the simulations
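To illustrate how the 'solvent_density' tables described above might be read, here is a minimal sketch that averages the density inside the membrane core for every window, which is one way to spot water defects. The exact file layout is an assumption: the code expects a header row, the z-distance from the midplane in the first column, and one column per restraint position. The function name `core_density_per_window`, the `core_half_width` cutoff, and its units are hypothetical, so check a real file before relying on it.

```python
import csv


def core_density_per_window(csv_file, core_half_width=0.5):
    """Average the density over rows with |z| <= core_half_width, per column.

    `csv_file` is an open text file (or any iterable of CSV lines).
    Assumed layout (not confirmed by the README): header row, z in the
    first column, one density column per restraint position.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    windows = header[1:]  # column labels, assumed to be restraint positions
    sums = [0.0] * len(windows)
    n_rows = 0
    for row in reader:
        if not row:
            continue
        z = float(row[0])
        if abs(z) <= core_half_width:
            n_rows += 1
            for i, value in enumerate(row[1:]):
                sums[i] += float(value)
    if n_rows == 0:
        raise ValueError("no rows fall inside the requested core region")
    return {label: total / n_rows for label, total in zip(windows, sums)}
```

A window whose core-averaged water density is clearly above zero would be a candidate for closer inspection of its trajectory.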
2.Npep_E14charge
- This directory contains all input and output files associated with simulations of the Npep_E14charge system.
- The folder structure and file types follow the same organizational scheme as described for the '1.Nloop' directory. Only subdirectories and files not previously described are detailed below:
- output_files
- helicity: contains input and output files for analyzing time-resolved helicity of the Npep_E14charge peptide across window trajectories
- cv_dat: output files ('.dat') that report time-series helicity scores for each REUS window
- plumed_dat: input files ('.dat') used to compute helicity scores via PLUMED
- 'mcfile': configuration file used to execute the PLUMED helicity analysis
- 'reference.pdb': reference structure used for helicity score calculations
3.Cpep
- This directory contains all input and output files associated with simulations of the Cpep system.
- The folder structure and file types follow the same organizational scheme as described for the '1.Nloop' and '2.Npep_E14charge' directories.
4.Npep_E14neutral
- This directory contains all input and output files associated with simulations of the Npep_E14neutral system.
- The folder structure and file types follow the same organizational scheme as described for the '1.Nloop' and '2.Npep_E14charge' directories.
5.monomer_E14charge
- This directory contains all input and output files for simulations of N-terminal and C-terminal flipping in the EmrE monomer system.
- The folder structure and file types follow the same organizational scheme as described for the '1.Nloop' and '2.Npep_E14charge' directories. Only subdirectories and files not previously described are detailed below:
- c-term_flip: contains all input and output files for simulations of C-terminal flipping in the EmrE monomer
- output_files/h-bond: output files from the interhelical hydrogen bond analysis across REUS windows. Subsequent analyses and visualizations are generated using the 'reus_hbond_anal.ipynb' notebook in the '5.monomer_E14charge' directory.
- n-term_flip: contains all input and output files for simulations of N-terminal flipping in the EmrE monomer
- 'reus_hbond_anal.ipynb': jupyter notebook used to generate plots of average interhelical hydrogen bond counts per window, based on REUS trajectory data (e.g., '5.monomer_E14charge/n-term_flip/output_files/h-bond')
6.Hbond_analysis
- This directory contains results from hydrogen bond (H-bond) analyses between transmembrane helices (TMHs) in EmrE monomer and dimer systems
- input_files: contains simulation trajectories ('.xtc') and structure ('.gro') files for all systems evaluated in this study. File naming reflects:
- System type: 'emre_dimer' or 'emre_monomer'
- E14 protonation state: 'E14-charged' or 'E14-neutral'
- Simulation type: 'npt_500ns' (unbiased MD), 'smd_backward_pull_posres', or 'smd_forward_pull_posres' (steered MD in respective directions)
- output_files: contains results of interhelical H-bond analysis for each system
- For 'emre_dimer', H-bonds are categorized based on the helices involved:
- btwPROA-PROB: interhelical H-bonds between chain A and chain B
- PROA_helix_only: interhelical H-bonds within chain A
- PROB_helix_only: interhelical H-bonds within chain B
- 'hbond_anal.ipynb': jupyter notebook for analyzing the time-averaged number of H-bonds between all helix pairs across different systems. Descriptions of key analysis parameters are provided within the code cells.
misc
- This directory contains scripts used for supplementary analyses, primarily presented in the Supporting Information (SI)
- 1.emre_sequence-based_hydropathy
- 'hydropathy_plot.ipynb': jupyter notebook that generates a hydropathy plot for the EmrE sequence using Kyte-Doolittle hydropathy scores
- 2.system_lipid_number_check
- Subdirectories in the format '###_DMPC': contain simulation files and analysis results for systems with varying numbers of DMPC lipids
- 'lipid_per_area.py': Python script that generates summary ('.dat') and plot ('.eps') files of lipid number per area. To execute, set the 'root_dir' variable in the script to the target system directory and run:
$ python lipid_per_area.py
- 3.helicity_analysis
- 'COLVAR_datfile_anal_helicity.ipynb': jupyter notebook that generates helicity profiles for systematically truncated peptide systems as a function of window index
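For reference, a Kyte-Doolittle hydropathy profile of the kind produced by 'hydropathy_plot.ipynb' in '1.emre_sequence-based_hydropathy' is a sliding-window average of per-residue scores. The sketch below uses the published Kyte-Doolittle scale. The window length of 9, the function name `hydropathy_profile`, and the toy sequence in the usage note are assumptions for illustration and may differ from what the notebook does.

```python
# Kyte-Doolittle hydropathy scale (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return [(center_residue, mean_score), ...] for a sliding window.

    `center_residue` is the 1-based index of the residue at the middle of
    each window; `window` is an assumed default, not taken from the notebook.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    profile = []
    for start in range(len(seq) - window + 1):
        chunk = scores[start:start + window]
        center = start + window // 2 + 1  # convert to 1-based residue index
        profile.append((center, sum(chunk) / window))
    return profile
```

For example, `hydropathy_profile("LLIVEKKDD", window=3)` on this made-up sequence yields one averaged score per window position; positive values indicate hydrophobic stretches and negative values hydrophilic ones.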
scripts
- Scripts for the analyses applied to each evaluated system
- density heatmap: contains scripts that generate number density heatmaps (water and P atoms) for each system
- See the README in this directory for detailed guidelines
- reus: contains scripts that analyze the REUS results
- See the README in this directory for detailed guidelines