Data from: The pathogenic E139D mutation stabilizes a non-canonical active state of the multi-domain phosphatase SHP2
Data files
Jul 18, 2025 version files 5.69 GB
- README.md (3.46 KB)
- scripts.zip (5.85 KB)
- shp2_af2_E139D_1_2500f.pdb (1.72 GB)
- shp2_af2_E139D_1_250f.pdb (172.13 MB)
- shp2_af2_E139D_2_2500f.pdb (1.72 GB)
- shp2_af2_E139D_2_250f.pdb (172.13 MB)
- shp2_af2_E139D_3_2500f.pdb (1.72 GB)
- shp2_af2_E139D_3_250f.pdb (172.13 MB)
- starting-structure.zip (5.28 MB)
Abstract
Dysregulation of the phosphatase SHP2 is implicated in various diseases, including congenital disorders and cancer. SHP2 contains two phosphotyrosine-recognition domains (N-SH2 and C-SH2) and a protein tyrosine phosphatase (PTP) domain. The N-SH2 domain is critical for SHP2 regulation. In the auto-inhibited state, it binds to the PTP domain and blocks the active site, but phosphoprotein engagement destabilizes the N-SH2/PTP domain interaction, thereby exposing the active site. Many disease mutations in SHP2 are at the N-SH2/PTP interface, and they hyperactivate SHP2 by disrupting auto-inhibitory interactions. The activating E139D mutation represents an exception to this mechanism, as it resides in the C-SH2 domain and makes minimal interactions in auto-inhibited and active state crystal structures. As part of this study, we used AlphaFold2 modeling and molecular dynamics simulations to characterize an alternative active conformation of SHP2, in which Glu139 interacts with Arg4 and Arg5 on the N-SH2 domain to stabilize a novel N-SH2/C-SH2 interface. This dataset includes the molecular dynamics trajectories for the E139D mutant in this unique active conformation, as well as Python scripts used to extract select measurements from the trajectories.
Dataset DOI: 10.5061/dryad.0rxwdbscv
Description of the data and file structure
This dataset contains all-atom molecular dynamics trajectories of the tyrosine phosphatase SHP2 carrying the E139D mutation, starting from an active conformational state generated by AlphaFold2. The original AlphaFold2 structural model was made for the wild-type protein; it was reported previously (https://doi.org/10.1038/s41467-025-60641-4) and can be found in a previously published dataset (https://doi.org/10.5061/dryad.83bk3jb18). For this study, we introduced the E139D mutation using PyMOL and then ran three independent 2.5 μs trajectories, all starting from the same structure. Simulations were run using the Amber software package.
Files and variables
File: scripts.zip
Description: This compressed folder contains Python scripts for making various distance and angle measurements using the .pdb format trajectory files. The scripts are all written for batch processing of multiple .pdb files and extracting measurements from sequential states (MD frames) in each file. The scripts utilize the Biopython module to read PDB files.
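As a rough illustration of this kind of per-frame measurement, the sketch below streams a multi-model PDB file and returns one interatomic distance per frame. It is not one of the scripts in scripts.zip: those use Biopython, whereas this sketch uses only the Python standard library so that the 1.72 GB files can be read line by line without loading every frame into memory. The function names (iter_frames, distance_per_frame) are hypothetical, and the sketch assumes that frames are delimited by MODEL/ENDMDL records, that atoms can be identified by residue number and atom name alone (chain IDs are ignored), and that coordinates sit in the standard fixed-width PDB columns.

```python
import math
from typing import Dict, Iterator, List, TextIO, Tuple

Coord = Tuple[float, float, float]
AtomKey = Tuple[int, str]  # (residue number, atom name); chain ID is ignored


def iter_frames(handle: TextIO) -> Iterator[Dict[AtomKey, Coord]]:
    """Yield one {(resseq, atom_name): (x, y, z)} dict per MODEL/ENDMDL block."""
    frame: Dict[AtomKey, Coord] = {}
    for line in handle:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            # Fixed-width PDB columns: atom name 13-16, resSeq 23-26, x/y/z 31-54.
            key = (int(line[22:26]), line[12:16].strip())
            frame[key] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
        elif record == "ENDMDL":
            yield frame
            frame = {}
    if frame:  # no MODEL/ENDMDL records at all: treat the file as one frame
        yield frame


def distance_per_frame(
    handle: TextIO, atom_a: AtomKey, atom_b: AtomKey
) -> List[float]:
    """Distance in angstroms between two atoms, one value per frame."""
    return [math.dist(f[atom_a], f[atom_b]) for f in iter_frames(handle)]
```

A call might look like `distance_per_frame(open("shp2_af2_E139D_1_250f.pdb"), (139, "CG"), (5, "CZ"))`. The residue numbers and atom names here are placeholders; the numbering in the trajectory files may be offset from the native SHP2 numbering and should be checked against the starting structure first.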
File: starting-structure.zip
Description: This compressed folder contains three files that define the starting structure for the simulations (an input coordinate .inpcrd file, a parameter .prmtop file, and a PDB-format coordinate file).
File: shp2_af2_E139D_1_250f.pdb
Description: This is a 250-frame PDB format trajectory file derived from extracting structures every 10 ns from a 2.5 μs simulation of SHP2 E139D in an active conformation generated by AlphaFold2. This is the first replicate simulation.
File: shp2_af2_E139D_2_250f.pdb
Description: This is a 250-frame PDB format trajectory file derived from extracting structures every 10 ns from a 2.5 μs simulation of SHP2 E139D in an active conformation generated by AlphaFold2. This is the second replicate simulation.
File: shp2_af2_E139D_3_250f.pdb
Description: This is a 250-frame PDB format trajectory file derived from extracting structures every 10 ns from a 2.5 μs simulation of SHP2 E139D in an active conformation generated by AlphaFold2. This is the third replicate simulation.
File: shp2_af2_E139D_1_2500f.pdb
Description: This is a 2500-frame PDB format trajectory file derived from extracting structures every 1 ns from a 2.5 μs simulation of SHP2 E139D in an active conformation generated by AlphaFold2. This is the first replicate simulation.
File: shp2_af2_E139D_2_2500f.pdb
Description: This is a 2500-frame PDB format trajectory file derived from extracting structures every 1 ns from a 2.5 μs simulation of SHP2 E139D in an active conformation generated by AlphaFold2. This is the second replicate simulation.
File: shp2_af2_E139D_3_2500f.pdb
Description: This is a 2500-frame PDB format trajectory file derived from extracting structures every 1 ns from a 2.5 μs simulation of SHP2 E139D in an active conformation generated by AlphaFold2. This is the third replicate simulation.
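The file names encode the replicate number and the frame count, so the sampling interval follows from the 2.5 μs simulation length (10 ns for the 250-frame files, 1 ns for the 2500-frame files). The sketch below shows one way to convert a frame index into simulation time. The helper names (parse_name, frame_time_ns) are hypothetical. The descriptions do not state whether the first saved frame corresponds to 0 ns or to one full interval, so the default used here (first frame at one interval) is an assumption that can be overridden.

```python
import re
from typing import Optional, Tuple

SIM_LENGTH_NS = 2500.0  # 2.5 µs per replicate, as stated in the file descriptions


def parse_name(filename: str) -> Tuple[int, int, float]:
    """Return (replicate, n_frames, stride_ns) parsed from a trajectory file name."""
    match = re.fullmatch(r"shp2_af2_E139D_(\d+)_(\d+)f\.pdb", filename)
    if match is None:
        raise ValueError(f"unrecognized trajectory file name: {filename}")
    replicate, n_frames = int(match.group(1)), int(match.group(2))
    return replicate, n_frames, SIM_LENGTH_NS / n_frames


def frame_time_ns(
    filename: str, frame_index: int, first_frame_ns: Optional[float] = None
) -> float:
    """Simulation time (ns) of a zero-based frame index.

    ASSUMPTION: unless first_frame_ns is given, the first saved frame is taken
    to be one stride into the simulation (not t = 0).
    """
    _, n_frames, stride = parse_name(filename)
    if not 0 <= frame_index < n_frames:
        raise IndexError(f"frame index {frame_index} outside 0..{n_frames - 1}")
    start = stride if first_frame_ns is None else first_frame_ns
    return start + frame_index * stride
```

Passing `first_frame_ns=0.0` switches to the other convention if the first frame turns out to be the starting structure.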
Code/software
The molecular dynamics data were generated using the Amber Molecular Dynamics Package and processed using the CPPTRAJ program within AmberTools. Specific measurements were extracted with Python scripts that use the Biopython module. These scripts are in the scripts.zip compressed folder and are included in this dataset along with the trajectory files.
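For readers who want to post-process extracted coordinates themselves, the following sketch shows a generic three-point angle calculation and a per-replicate summary of any per-frame measurement. It is not taken from scripts.zip. The function names (angle_deg, summarize_replicates) are hypothetical, and the choice of which three points to use (for example, specific atoms or domain centroids) is left to the user.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence, Tuple


def angle_deg(
    a: Sequence[float], b: Sequence[float], c: Sequence[float]
) -> float:
    """Angle in degrees at vertex b, formed by the points a-b-c."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("angle is undefined when a point coincides with the vertex")
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point round-off before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def summarize_replicates(
    series_by_replicate: Dict[str, Sequence[float]]
) -> Dict[str, Tuple[float, float]]:
    """Map each replicate label to (mean, sample standard deviation).

    Each series needs at least two values for the standard deviation.
    """
    return {
        label: (mean(values), stdev(values))
        for label, values in series_by_replicate.items()
    }
```

Summarizing each replicate separately, rather than pooling all frames, keeps the three independent trajectories distinguishable when comparing them.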