Data from: Mechanism of expanded DNA recognition in xCas9
Data files
Jan 30, 2025 version files (1.23 GB)
- data_deposition_eLife.zip (1.23 GB)
- README.md (3.83 KB)
Abstract
xCas9 is an evolved variant of the CRISPR-Cas9 genome editing system, engineered to improve specificity and reduce undesired off-target effects. How xCas9 expands the DNA targeting capability of Cas9 by recognizing a series of alternative Protospacer Adjacent Motif (PAM) sequences while ignoring others is unknown. Here, we elucidate the molecular mechanism underlying xCas9's expanded PAM recognition and provide critical insights for expanding DNA targeting. We demonstrate that while wild-type Cas9 enforces stringent guanine selection through the rigidity of its interacting arginine dyad, xCas9 introduces flexibility in R1335, enabling selective recognition of specific PAM sequences. This increased flexibility confers a pronounced entropic preference, which also improves recognition of the canonical TGG PAM. Furthermore, xCas9 enhances DNA binding to alternative PAM sequences during the early evolution cycles, while favouring binding to the canonical PAM in the final evolution cycle. This dual functionality highlights how xCas9 broadens PAM recognition and underscores the importance of fine-tuning the flexibility of the PAM-interacting cleft as a key strategy for expanding the DNA targeting potential of CRISPR-Cas systems. These findings deepen our understanding of DNA recognition in xCas9 and may apply to other CRISPR-Cas systems with similar PAM recognition requirements.
README: xCas9: Molecular dynamics
https://doi.org/10.5061/dryad.0000000dt
Description of the data and file structure
Processed source data are available in text and XLSX formats, organized by figure (Figs. 2 to 5 of the article and their figure supplements).
The corresponding MD trajectories (each replica) are deposited in XTC format, with structures in PDB format. They can be visualized together using programs such as VMD and PyMOL.
The TPR files can be used to conduct or verify analyses using GROMACS.
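Before opening individual trajectories, it can help to see which archive members belong to which figure. The following is a minimal sketch using only the Python standard library. It assumes that the members of `data_deposition_eLife.zip` sit under path components beginning with `Figure_<number>`, as the names in this README suggest; the internal layout of the archive has not been verified here, and the function name `group_members_by_figure` is purely illustrative.

```python
import zipfile
from collections import defaultdict


def group_members_by_figure(archive_path):
    """Map a figure label such as 'Figure_2' to the archive members under it."""
    groups = defaultdict(list)
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            # Assumption: some path component starts with 'Figure_<number>'.
            label = "other"
            for part in name.split("/"):
                if part.startswith("Figure_"):
                    # 'Figure_2-source_data_1' -> 'Figure_2'
                    label = "_".join(part.replace("-", "_").split("_")[:2])
                    break
            groups[label].append(name)
    return dict(groups)
```

Calling `group_members_by_figure("data_deposition_eLife.zip")` on the downloaded archive would return a dictionary keyed by figure, with anything that does not match the assumed naming pattern collected under `"other"`.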
Description of each file or subdirectory:
Figure_2-source_data_1_for_panelB
: Interaction patterns established by R1333 and R1335 with PAM nucleobases (NB), the PAM backbone (BB), and non-PAM nucleotides in SpCas9 bound to TGG (i.e., the wild-type system) and in xCas9 bound to PAM sequences that are recognized (TGG, AAG, GAT) and ignored (CCT, TTA, ATC).
Figure_2-source_data_2_for_panelD
: Frequencies of hydrogen bond formation between the arginine side chains and the PAM NB, the PAM BB, and non-PAM nucleotides.
Figure_2-source_data_3_for_panelE
: Specificity index data (with the standard error of the mean, calculated over four replicates), representing the frequency of hydrogen bond formation between a given arginine and the PAM nucleotides relative to the frequency of forming hydrogen bonds with non-PAM residues.
Figure_2-source_data_4_for_panelC
: Root mean square fluctuations (RMSF) of the R1333 and R1335 side chains in SpCas9 bound to its TGG PAM, compared to xCas9 bound to recognized and ignored PAMs.
Figure_2-source_data_5
: MD trajectories and reference structure files corresponding to the data shown in Figure 2.
Figure_2-supplement_1_source_data
: Frequencies of hydrogen bond formation across four simulation replicates.
Figure_3-source_data_1
: Free energy surface (FES) data describing the preference of R1335 for binding either the G3 nucleobase or E1219 in SpCas9, and the FES for the binding of R1335 to G3 and the DNA backbone in SpCas9 and xCas9.
Figure_3-supplement_1_source_data
: Convergence data (100 to 1000 ns) of well-tempered metadynamics simulations characterizing the preference of R1335 for binding either the G3 nucleobase or E1219 in SpCas9.
Figure_3-supplement_2_source_data
: Convergence data (100 to 1000 ns) of well-tempered metadynamics simulations characterizing the binding of R1335 to G3 and the DNA backbone in SpCas9.
Figure_3-supplement_3_source_data
: Convergence data (100 to 1000 ns) of well-tempered metadynamics simulations characterizing the binding of R1335 to G3 and the DNA backbone in xCas9.
Figure_4-source_data_1
: DNA binding free energy differences (ΔΔG) between SpCas9 and its xCas9_1 to xCas9_3 mutants.
Figure_4-source_data_2
: Alchemical trajectories (used to calculate ΔΔG) and reference structure files.
Figure_5-source_data_1_panelA
: Enthalpic contribution to the ΔΔG of DNA binding for the transition from SpCas9 to xCas9_1 in the presence of the TGG, AAG, and GAT PAM sequences, computed as the average change in the interaction energy (ΔE) between selected amino acid residues and the DNA.
Figure_5-source_data_2_panelB
: Equilibrium MD trajectories of SpCas9 (PDB 4UN3) incorporating the xCas9 mutations, in the presence of the TGG and AAG PAM sequences.
Figure_5-source_data_3_panelB
: Distance (r) between the centers of mass (COM) of the REC3 and HNH domains.
Figure_5-supplement_2_source_data
: Enthalpic contribution to the DNA binding free energy for the xCas9_1 → xCas9_2 transformation.
Figure_5-supplement_3_source_data
: Enthalpic contribution to the DNA binding free energy for the xCas9_2 → xCas9_3 transformation.
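The specificity index in Figure_2-source_data_3_for_panelE is described as a PAM hydrogen-bond frequency relative to a non-PAM frequency, reported with a standard error of the mean over four replicates. The sketch below shows one plausible way to compute such a quantity. It assumes the index is a simple ratio evaluated per replicate and then averaged; the README does not state the exact formula, so this may differ from the authors' procedure. The function names and the input numbers are hypothetical and are not taken from the dataset.

```python
import math
import statistics


def specificity_index(pam_freq, non_pam_freq):
    """Ratio of PAM to non-PAM hydrogen-bond frequency (assumed definition)."""
    if non_pam_freq == 0:
        return math.inf if pam_freq > 0 else math.nan
    return pam_freq / non_pam_freq


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (sample stdev / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates for an SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical per-replicate frequencies for one arginine (not from the dataset).
pam = [0.80, 0.90, 0.60, 0.70]
non_pam = [0.40, 0.30, 0.20, 0.35]

indices = [specificity_index(p, q) for p, q in zip(pam, non_pam)]
mean_index, sem_index = mean_and_sem(indices)
print(f"specificity index = {mean_index:.2f} +/- {sem_index:.2f}")
```

Under this assumed definition, an index above 1 would mean the arginine forms hydrogen bonds with PAM nucleotides more often than with non-PAM residues.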
Methods
The dataset was created using molecular dynamics simulations of the SpCas9 and xCas9 proteins. The data were processed using standard Python libraries and GROMACS.
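As an illustration of this kind of post-processing, the REC3-HNH separation reported in Figure_5-source_data_3_panelB is a distance between two centers of mass. The following pure-Python sketch computes that quantity for a single frame. It is not the authors' script: the function names are hypothetical, the coordinates and masses are toy values, and in practice the atom positions would be read from the deposited XTC and PDB files with a trajectory-reading library or computed directly in GROMACS.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[axis] for xyz, m in zip(coords, masses)) / total
        for axis in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Euclidean distance r between the centers of mass of two atom groups."""
    return math.dist(
        center_of_mass(coords_a, masses_a),
        center_of_mass(coords_b, masses_b),
    )


# Toy frame: two equal-mass atoms in group A, one atom in group B.
group_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
group_b = [(1.0, 3.0, 4.0)]
r = com_distance(group_a, [1.0, 1.0], group_b, [12.0])
print(f"r = {r:.1f}")
```

Applying `com_distance` frame by frame to the atoms of each domain would give a time series of r for each replica.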