RECODE: a programmable guide-free C-to-U RNA editing tool
Data files
Nov 07, 2025 version files 305.36 MB
- chromatograms-1-PPRbackbone.zip (1.81 MB)
- chromatograms-10-RECODEvsRSECUE.zip (20.55 MB)
- chromatograms-11-RECODE-WW2-off-target-KRAS-SMARCA.zip (1.69 MB)
- chromatograms-12-WB-KRAS-PPIB-SMARCA4.zip (2.80 MB)
- chromatograms-13-RESCUE-condition-96-well.zip (3.01 MB)
- chromatograms-14-RECODE-condition-96-well.zip (4.59 MB)
- chromatograms-15-cell-sorting.zip (3.55 MB)
- chromatograms-16-amplicon-sanger-mouse.zip (9.92 MB)
- chromatograms-17-RNA-seq-sanger-human.zip (1.99 MB)
- chromatograms-2-PGcode.zip (48.34 MB)
- chromatograms-3-WWcode.zip (52.32 MB)
- chromatograms-4-gapdh-hepa.zip (1.46 MB)
- chromatograms-5-length-RECODE-PG-WW.zip (17.86 MB)
- chromatograms-6-mutation-Cterm.zip (2.70 MB)
- chromatograms-7-length-RECODE-PG2-WW2.zip (16.64 MB)
- chromatograms-8-mutation_in_PPR_domain_for_RECODE-WW2.zip (64.51 MB)
- chromatograms-9-editing-in-WB-for-CTNNB1.zip (5.93 MB)
- chromatogramsv3.xlsx (408.97 KB)
- EV--RECODE-PG_DESeq2out-M-freqDiff.tsv (2.12 MB)
- EV--RECODE-PG2_DESeq2out-M-freqDiff.tsv (2.16 MB)
- EV--RECODE-WW_DESeq2out-M-freqDiff.tsv (2.21 MB)
- EV--RECODE-WW2_DESeq2out-M-freqDiff.tsv (2.26 MB)
- EV--RECODE-WW2-PPRm1_DESeq2out-M-freqDiff.tsv (2.12 MB)
- EV--RECODE-WW2-PPRm2_DESeq2out-M-freqDiff.tsv (2.12 MB)
- EV--RESCUE_DESeq2out-M-freqDiff.tsv (2.14 MB)
- PBS--PG_I83I_DESeq2out-M-freqDiff.tsv (1.65 MB)
- PBS--WW_F75F_DESeq2out-M-freqDiff.tsv (1.69 MB)
- plasmid-list.xlsx (24.81 KB)
- plasmid-maps.zip (1.79 MB)
- README.md (20.47 KB)
- REPG_S2CTD_CTN6nt_seed_85607_sample_0.cif (226.70 KB)
- REWW_S2CTD_CTN6nt_seed_85607_sample_0.cif (230.14 KB)
- RNA-seq_HEK293T_editing.xlsx (3.41 MB)
- RNA-seq_mouse_editing.xlsx (151 KB)
- Souce-data-images.pdf (16.76 MB)
- tree-P2toDYW-all_al_trim.fa (441.87 KB)
- tree-P2toDYW-all_al_trim.fa.treefile (67 KB)
- values-figures.xlsx (3.71 MB)
Abstract
Programmable RNA cytidine deaminase tools have been developed to convert cytidine-to-uridine (C-to-U) using CRISPR systems with guide RNAs. These tools, however, have limitations such as low editing efficiency, limited targetable sequence flexibility, and off-target RNA editing. Here, we present a novel guide-free C-to-U editing tool, named RECODE (RNA Editor for C-to-U with an Optimized DYW Enzyme), based on the RNA-binding pentatricopeptide repeat proteins, naturally fused to a C-terminal DYW cytidine deaminase domain. The RECODE specificity domain was engineered to enable retargeting, while its length and sequence were optimized to reduce off-target effects. Further optimization of the C-terminal catalytic region increased both the editing activity and the translation of the edited RNA. We showed that RECODE efficiently edits a wide range of targets in human cells, without affecting adjacent cytidines. It achieved over 50% editing efficiency for most sites, except those with an upstream guanine. Furthermore, we showed that RECODE is functional in mice, with high editing efficiency observed in specific tissues such as skeletal muscles using an AAV delivery system, suggesting its therapeutic potential for various diseases.
https://doi.org/10.5061/dryad.cjsxksngv
Description of the data and file structure
Phylogenetic tree
File: tree-P2toDYW-all_al_trim.fa
Description: Fasta format of a multiple amino acid sequence alignment of non-DYW:KP protein sequences identified in hornwort, lycophyte, and fern transcriptomes. This alignment was generated using MAFFT in L-INS-i mode and trimmed using TrimAL.
File: tree-P2toDYW-all_al_trim.fa.treefile
Description: Newick format tree of the non-KP C-terminal domain sequences found in hornwort, lycophyte, and fern transcriptomes (Supplementary Figure 1) generated using tree-P2toDYW-all_al_trim.fa file. The branch support was computed with 1000 bootstrap replicates.
The synthetic DYW:PG and DYW:WW C-terminal domains used in this study were designed on DYW subgroups identified in this tree.
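For readers who want to sanity-check the alignment file before using it, the following is a minimal sketch of a FASTA reader that confirms every record has the same length, as expected for a trimmed multiple sequence alignment. The function names and the two toy records are illustrative assumptions, not part of the dataset.

```python
def read_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {record name: sequence} (names cut at first space)."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip().split(" ")[0]
            records[name] = ""
        elif name is not None:
            records[name] += line  # sequences may span several lines
    return records


def alignment_length(records: dict[str, str]) -> int:
    """Return the shared sequence length, or raise if the lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; not an alignment")
    return lengths.pop()


# Toy two-record alignment (hypothetical, for illustration only).
example = ">seqA\nMK-LV\nAA\n>seqB\nMKTLV\n-A\n"
records = read_fasta(example)
print(alignment_length(records))
```

To apply it to the real file, one could pass the contents of tree-P2toDYW-all_al_trim.fa to `read_fasta`.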
Chromatograms
File: chromatogramsv3.xlsx
Description: Excel file describing the ab1 files used in this study, including the EditR values. The ab1 files for each compressed folder below are described in separate sheets.
Columns in each sheet:
- ID: the name reference corresponding to the ab1 file found in the linked zip folders (see zip folders below).
- Plasmid: plasmid introduced into cells or mice (refer to plasmid-maps.zip and plasmid-list.xlsx for maps and full descriptions).
- sequence/target: the nucleotide sequence recognized and edited by the protein.
- EditR values: U (T peak area), C (C peak area), signi (Y or N, indicating whether the peak is significant), area (manually selected region).
- Orange square: identifies the information that was used to generate the figures in the manuscript.
- Additional information, such as the purpose, is also provided to clarify the experiment or which sample is used.
The editing efficiency of the samples in sheets 2.1 to 3.4 was normalized with the original proteins.
Each sheet number corresponds directly to one zip file below (e.g., sheet '1' corresponds to the 'chromatograms-1' zip file).
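The sheet-to-archive correspondence can be sketched as a small lookup. This assumes the sheet names are exactly the quoted labels used in this README (for example "1", "2.1", "10"); the helper name `zip_for_sheet` is hypothetical.

```python
# Zip archive names as listed in this dataset.
ZIP_FILES = [
    "chromatograms-1-PPRbackbone.zip",
    "chromatograms-2-PGcode.zip",
    "chromatograms-3-WWcode.zip",
    "chromatograms-4-gapdh-hepa.zip",
    "chromatograms-5-length-RECODE-PG-WW.zip",
    "chromatograms-6-mutation-Cterm.zip",
    "chromatograms-7-length-RECODE-PG2-WW2.zip",
    "chromatograms-8-mutation_in_PPR_domain_for_RECODE-WW2.zip",
    "chromatograms-9-editing-in-WB-for-CTNNB1.zip",
    "chromatograms-10-RECODEvsRSECUE.zip",
    "chromatograms-11-RECODE-WW2-off-target-KRAS-SMARCA.zip",
    "chromatograms-12-WB-KRAS-PPIB-SMARCA4.zip",
    "chromatograms-13-RESCUE-condition-96-well.zip",
    "chromatograms-14-RECODE-condition-96-well.zip",
    "chromatograms-15-cell-sorting.zip",
    "chromatograms-16-amplicon-sanger-mouse.zip",
    "chromatograms-17-RNA-seq-sanger-human.zip",
]


def zip_for_sheet(sheet_name: str) -> str:
    """Return the archive for a sheet label such as '2.1' or '10'."""
    number = sheet_name.strip().split(".")[0]  # '2.1' -> '2'
    prefix = f"chromatograms-{number}-"        # trailing '-' keeps '1' apart from '10'
    matches = [name for name in ZIP_FILES if name.startswith(prefix)]
    if len(matches) != 1:
        raise KeyError(f"no unique archive for sheet {sheet_name!r}")
    return matches[0]


print(zip_for_sheet("2.1"))
```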
File: chromatograms-1-PPRbackbone.zip
Description: Compressed folder containing the ab1 files used to generate Figure 1B.
C-to-U editing efficiency in HEK293T cells of PG and WW proteins fused to a PLSv1, PLSv2, or 10 P-motif domains at the N-terminus. The proteins target the exogenous rpoA editing site naturally edited by CLB19. The site is localized on the same RNA coding for the PPR protein, downstream of the editing site. The editing efficiency was analyzed 24h after transfection.
Values obtained with EditR can be found in sheet "1" in chromatogramsv3.xlsx file.
File: chromatograms-2-PGcode.zip
Description: Compressed folder containing the ab1 files used to generate Figure 1C and Supplementary Figure S8.
C-to-U editing efficiency in HEK293T cells of PLSv2-PG on the ten most common PPR code combinations found in PG and WW PPR-like motifs (P2, L2, S2, and E1 motifs), as well as two PPR code combinations (L2[TR] and E1[GS]) from the original PPR-editor. The proteins target the exogenous rpoA editing site, naturally edited by CLB19. The site is localized on the same RNA coding for the PPR protein, downstream of the editing site. We tested the nucleotide preference of each PPR code combination on the four nucleotides by altering the nucleotide sequence at position -6, -5, -4, and -3 (relative to the editing site, 0) when studying the P2, L2, S2, and E1 motifs, respectively. We hypothesized that high editing efficiency for a specific nucleotide with a given PPR code combination suggests a nucleotide preference for the targeted PPR motif using that specific code. Conversely, similar editing efficiencies for multiple nucleotides suggest that the amino acid combination has low or no nucleotide preference. The editing efficiency was analyzed 24h after transfection.
Values obtained with EditR can be found in sheets "2.1", "2.2", "2.3", and "2.4" in chromatogramsv3.xlsx file for P2, L2, S2, and E1 motifs, respectively.
File: chromatograms-3-WWcode.zip
Description: Compressed folder containing the ab1 files used to generate Figure 1D and Supplementary Figure S9.
C-to-U editing efficiency in HEK293T cells of PLSv2-WW on the ten most common PPR code combinations found in PG and WW PPR-like motifs (P2, L2, S2 and E1 motifs) as well as two PPR code combinations (L2[TR] and E1[GS]) from the original PPR-editor. The proteins target the exogenous rpoA editing site, naturally edited by CLB19. The site is localized on the same RNA coding for the PPR protein, downstream of the editing site. We tested the nucleotide preference of each PPR code combination on the four nucleotides by altering the nucleotide sequence at position -6, -5, -4, and -3 (relative to the editing site, 0) when studying the P2, L2, S2, and E1 motifs, respectively. We hypothesized that high editing efficiency for a specific nucleotide with a given PPR code combination suggests a nucleotide preference for the targeted PPR motif using that specific code. Conversely, similar editing efficiencies for multiple nucleotides suggest that the amino acid combination has low or no nucleotide preference. The editing efficiency was analyzed 24h after transfection.
Values obtained with EditR can be found in sheets "3.1", "3.2", "3.3", and "3.4" in chromatogramsv3.xlsx file for P2, L2, S2, and E1 motifs, respectively.
File: chromatograms-4-gapdh-hepa.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure S10.
Editing efficiency of RECODE in mouse Hepa1-6 cells 24h after transfection. RECODE-PG and RECODE-WW variants, containing 16 P-motifs, were designed to target specific cytidines within the endogenous Gapdh mRNA (F75F and I83I sites).
Values obtained with EditR can be found in sheet "4" in chromatogramsv3.xlsx file.
File: chromatograms-5-length-RECODE-PG-WW.zip
Description: Compressed folder containing the ab1 files used to generate Figure 3B-E.
Influence of the PPR domain length (10, 12, 14, and 16 P-motifs) on RECODE-PG and RECODE-WW activity in HEK293T cells. The editing efficiency was analyzed on-target (CTNNB1-T41I) and nine off-target sites (DNAJA1, HAX1, ITCH, PPA1-1, PPA1-2, RBBP8, SF3B2, TSPAN33 and UBE2D3) 48h after transfection.
Values obtained with EditR can be found in sheet "5" in chromatogramsv3.xlsx file.
File: chromatograms-6-mutation-Cterm.zip
Description: Compressed folder containing the ab1 files used to generate Figure 4B and D.
Influence of targeted mutations in the E1 and E2 motifs of RECODE-PG and RECODE-WW on the editing efficiency. RECODE variants target the endogenous CTNNB1-T41I site in HEK293T cells. The editing efficiency was analyzed 48h after transfection.
Values obtained with EditR can be found in sheet "6" in chromatogramsv3.xlsx file.
File: chromatograms-7-length-RECODE-PG2-WW2.zip
Description: Compressed folder containing the ab1 files used to generate Figure 4F, H, J and K.
Influence of the PPR domain length (10, 12, 14, and 16 P-motifs) on RECODE-PG2 and RECODE-WW2 activity in HEK293T cells. The editing efficiency was analyzed on-target (CTNNB1-T41I) and nine off-target sites (DNAJA1, HAX1, ITCH, PPA1-1, PPA1-2, RBBP8, SF3B2, TSPAN33 and UBE2D3) 48h after transfection.
Values obtained with EditR can be found in sheet "7" in chromatogramsv3.xlsx file.
File: chromatograms-8-mutation_in_PPR_domain_for_RECODE-WW2.zip
Description: Compressed folder containing the ab1 files used to generate Figure 5.
Influence of mutations at position 13 in the PPR motifs of RECODE-WW2 on the off-target editing activity. Editing efficiency in HEK293T cells was analyzed 48h after transfection on CTNNB1-T41I endogenous target and on eight off-target sites (DNAJA1, ITCH, PPA1-1, PPA1-2, RBBP8, SF3B2, TSPAN33, and UBE2D3).
Values obtained with EditR can be found in sheet "8" in chromatogramsv3.xlsx file.
File: chromatograms-9-editing-in-WB-for-CTNNB1.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure 7B.
The editing efficiency in HEK293T cells of RECODE variants on the endogenous CTNNB1-T41I editing site was tested over time (24, 48, and 72h) and compared to RESCUE-S.
Values obtained with EditR can be found in sheet "9" in chromatogramsv3.xlsx file.
File: chromatograms-10-RECODEvsRSECUE.zip
Description: Compressed folder containing the ab1 files used to generate Figure 9A and Supplementary Figure S14.
The editing efficiency in HEK293T cells of four RECODE variants was analyzed on a wide range of targets and compared to RESCUE-S. The targets were selected to represent all four nucleotides at position -6 to +5 relative to the editing site. The editing efficiency was analyzed 48h after transfection.
Values obtained with EditR can be found in sheet "10" in chromatogramsv3.xlsx file.
File: chromatograms-11-RECODE-WW2-off-target-KRAS-SMARCA.zip
Description: Compressed folder containing the ab1 files used to generate Figure 9B.
Editing efficiency in HEK293T cells of RECODE-WW2 variants targeting endogenous KRAS-Q25X and SMARCA4-P88L sites. We analyzed the ability of mutations at position 13 in the PPR motifs to decrease the off-target editing on the target mRNA molecule, 48h after transfection.
Values obtained with EditR can be found in sheet "11" in chromatogramsv3.xlsx file.
File: chromatograms-12-WB-KRAS-PPIB-SMARCA4.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure S15.
Editing efficiency in HEK293T cells of RECODE variants and RESCUE-S on the endogenous KRAS-D30D and SMARCA4-P88L sites 48h after transfection.
Values obtained with EditR can be found in sheet "12" in chromatogramsv3.xlsx file.
File: chromatograms-13-RESCUE-condition-96-well.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure S2A.
Screening conditions for a 96-well plate editing assay in HEK293T cells. Published optimal conditions for RESCUE-S targeting the CTNNB1-T41I editing site resulted in low editing efficiency in our study. To determine the optimal protein:gRNA ratio for a 96-well plate assay, additional conditions were tested by varying the amounts of one or both plasmids. The editing efficiency was analyzed 48h after transfection.
Values obtained with EditR can be found in sheet "13" in chromatogramsv3.xlsx file.
File: chromatograms-14-RECODE-condition-96-well.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure S2B.
Screening conditions for a 96-well plate editing assay in HEK293T cells. The editing efficiency 48h after transfection of four amounts of transfected plasmids was analyzed for four RECODE variants.
Values obtained with EditR can be found in sheet "14" in chromatogramsv3.xlsx file.
File: chromatograms-15-cell-sorting.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure S3.
To test whether sorting transfected cells before RNA extraction significantly increases the editing efficiency for RECODE and RESCUE-S targeting the CTNNB1-T41I site, we analyzed the editing efficiency of cells transfected with RECODE and RESCUE-S plasmids that include an IRES-GFP sequence downstream of the stop codon of the editor. The editing efficiency was analyzed 48h after transfection of HEK293T cells.
Values obtained with EditR can be found in sheet "15" in chromatogramsv3.xlsx file.
File: chromatograms-16-amplicon-sanger-mouse.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure S4A.
To ensure the reliability of analyzing the editing efficiency of the RECODE variants by direct Sanger sequencing, we compared the editing efficiency obtained by this approach with the values obtained by amplicon sequencing with the same samples. The study included three groups of three 8-week-old male WT mice (FVB/NJcl): a control substance group (PBS group) and two test substance groups (AAV9-RECODE-WW[F75F] and AAV9-RECODE-PG[I83I]). The editing at the endogenous Gapdh-F75F and Gapdh-I83I sites was analyzed eight weeks after AAV administration.
The ab1 files in this zip file correspond to the Sanger sequencing. The amplicon-seq raw data are available in the bioproject PRJNA1200284.
Values obtained with EditR can be found in sheet "16" in chromatogramsv3.xlsx file.
File: chromatograms-17-RNA-seq-sanger-human.zip
Description: Compressed folder containing the ab1 files used to generate Supplementary Figure S4B.
To ensure the reliability of analyzing the editing efficiency of the RECODE variants by direct Sanger sequencing, we compared the editing efficiency obtained by this approach with the values obtained by RNA-sequencing with the same samples. The study included the editing efficiency 48h after HEK293T transfection by RECODE variants and RESCUE-S targeting the endogenous CTNNB1-T41I site.
The ab1 files in this zip file correspond to the Sanger sequencing. The RNA-seq raw data are available in the bioproject PRJNA1200275.
Values obtained with EditR can be found in sheet "17" in chromatogramsv3.xlsx file.
RNA-seq analysis - in vivo
Three groups of three 8-week-old male WT mice (FVB/NJcl) were administered either a control substance (PBS) or two test substances (AAV9-RECODE-WW[F75F] and AAV9-RECODE-PG[I83I]). The off-target effects (editing and gene expression) were analyzed in quadratus femoris (a skeletal muscle sample) 8 weeks after administration.
File: RNA-seq_mouse_editing.xlsx
Description: Summary of in vivo off-target RNA-seq analysis.
The file includes two sheets, each corresponding to the cytidine edited by one of the two RECODE variants.
Column information includes:
- Chromosome, Position, strand: localisation of the editing site
- NC-freq-mean: mean of editing efficiency in three PBS control mice
- RECODE-W-freq-mean or RECODE-P-freq-mean: mean of editing efficiency in three AAV9-RECODE-WW[F75F] or AAV9-RECODE-PG[I83I] mice, respectively.
- Freq-diff: difference of editing efficiency between the mean of RECODE mice and PBS mice.
- sequence: nucleotide sequence 39 nucleotides upstream and downstream of the editing site
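As a rough illustration of how the Freq-diff column relates to the two mean columns, the sketch below computes the difference between the mean editing frequency of three treated replicates and three controls. The replicate values are invented for the example, and whether the spreadsheet stores fractions or percentages is not assumed here.

```python
from statistics import mean


def freq_diff(treated_freqs: list[float], control_freqs: list[float]) -> float:
    """Mean editing frequency of treated samples minus that of controls."""
    return mean(treated_freqs) - mean(control_freqs)


# Hypothetical replicate values (three RECODE mice, three PBS mice).
treated = [0.5, 0.6, 0.7]
control = [0.0, 0.1, 0.2]
print(round(freq_diff(treated, control), 3))
```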
File: PBS--PG_I83I_DESeq2out-M-freqDiff.tsv
Description: Summary of in vivo differential gene expression analysis (RECODE-PG vs PBS).
Column information includes:
- geneID: gene identifier
- baseMean: average of the normalized count values across all samples
- log2FoldChange: log2 ratio of gene expression (RECODE treated vs. PBS control)
- lfcSE: standard error of the log2 fold change
- stat: value of the test statistic for the gene
- pvalue: p-value of the test for the transcript
- padj: p-value adjusted for multiple testing for the transcript
- position, freqDiffMax: for genes with evidence of editing (cf. RNA-seq_mouse_editing.xlsx file), the position and editing frequency of the first edited site found in that gene (empty if no editing detected)
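The DESeq2 output tables can be read with the standard library alone. The sketch below assumes the header row uses exactly the column names listed above, and the cutoffs (padj < 0.05, |log2FoldChange| >= 1) are illustrative choices rather than thresholds stated for this study.

```python
import csv
import io


def significant_genes(tsv_text: str, padj_cutoff: float = 0.05,
                      min_abs_lfc: float = 1.0) -> list[str]:
    """Return geneIDs passing hypothetical padj and fold-change cutoffs."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        try:
            padj = float(row["padj"])
            lfc = float(row["log2FoldChange"])
        except (TypeError, ValueError):
            continue  # skip rows where DESeq2 reported NA or the cell is empty
        if padj < padj_cutoff and abs(lfc) >= min_abs_lfc:
            hits.append(row["geneID"])
    return hits


# Toy table with invented values, for illustration only.
example = (
    "geneID\tbaseMean\tlog2FoldChange\tlfcSE\tstat\tpvalue\tpadj\n"
    "geneA\t100\t2.0\t0.3\t6.7\t1e-9\t1e-6\n"
    "geneB\t50\t0.2\t0.3\t0.7\t0.5\t0.8\n"
    "geneC\t5\t-3.0\t1.0\t-3.0\t0.003\tNA\n"
)
print(significant_genes(example))
```

For a real file, the text could be obtained with `open(path, newline="").read()`.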
File: PBS--WW_F75F_DESeq2out-M-freqDiff.tsv
Description: Summary of in vivo differential gene expression analysis (RECODE-WW vs PBS).
For the column information, refer to the description above.
RNA-seq analysis - in vitro
The C-to-U off-target editing of six RECODE variants (RECODE-PG, RECODE-PG2, RECODE-WW, RECODE-WW2, RECODE-WW2-PPRm1, RECODE-WW2-PPRm2) and RESCUE-S targeting the CTNNB1-T41I site, and the difference in gene expression compared to cells transfected with an empty vector, were analyzed 48h after transfection of HEK293T cells.
File: RNA-seq_HEK293T_editing.xlsx
Description: Summary of in vitro off-target RNA-seq analysis (HEK293T cells).
Each of the seven sheets corresponds to one RECODE variant or RESCUE-S.
Column information includes:
- Chromosome, Position, strand: localisation of the editing site
- NC-freq-mean: mean of editing efficiency in three empty vector samples.
- RECODE-freq-mean or RESCUE-freq-mean: mean of editing efficiency in three biological replicates.
- Freq-diff: difference of editing efficiency between the mean of RECODE/RESCUE and empty vector transfected cells.
- sequence: nucleotide sequence 39 nucleotides upstream and downstream of the editing site.
Note: In the sheet corresponding to RESCUE-S, a 'substitution' column is included. This column differentiates between the C-to-U editing activity (CT) and the A-to-I editing activity (AG), as RESCUE-S catalyzes both reactions.
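The sequence column holds the editing site with 39 nucleotides on each side, that is, 79 nucleotides when the site is far from either end. A minimal sketch of extracting such a window is given below; it assumes a 0-based index into a plain sequence string, which may differ from the coordinate convention of the Position column.

```python
def flanking_window(sequence: str, site_index: int, flank: int = 39) -> str:
    """Return the site plus up to `flank` nucleotides on each side (0-based index)."""
    if not 0 <= site_index < len(sequence):
        raise IndexError("site_index is outside the sequence")
    start = max(0, site_index - flank)             # truncated near the 5' end
    end = min(len(sequence), site_index + flank + 1)  # truncated near the 3' end
    return sequence[start:end]


# Toy sequence with a single C at index 50 (illustration only).
toy = "A" * 50 + "C" + "G" * 50
window = flanking_window(toy, 50)
print(len(window), window[39])
```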
File: EV--RECODE-PG_DESeq2out-M-freqDiff.tsv
Description: Summary of in vitro differential gene expression analysis (RECODE-PG vs empty vector).
Column information includes:
- geneID: gene identifier
- baseMean: average of the normalized count values across all samples
- log2FoldChange: log2 ratio of gene expression (RECODE vs. empty vector transfected cells)
- lfcSE: standard error of the log2 fold change
- stat: value of the test statistic for the gene
- pvalue: p-value of the test for the transcript
- padj: p-value adjusted for multiple testing for the transcript
- position, freqDiffMax: for genes with evidence of editing (cf. RNA-seq_HEK293T_editing.xlsx file), the position and editing frequency of the first edited site found in that gene (empty if no editing detected)
File: EV--RECODE-PG2_DESeq2out-M-freqDiff.tsv
Description: Summary of in vitro differential gene expression analysis (RECODE-PG2 vs empty vector).
For the column information, refer to the description for the "EV--RECODE-PG_DESeq2out-M-freqDiff.tsv" file.
File: EV--RECODE-WW_DESeq2out-M-freqDiff.tsv
Description: Summary of in vitro differential gene expression analysis (RECODE-WW vs empty vector).
For the column information, refer to the description for the "EV--RECODE-PG_DESeq2out-M-freqDiff.tsv" file.
File: EV--RECODE-WW2_DESeq2out-M-freqDiff.tsv
Description: Summary of in vitro differential gene expression analysis (RECODE-WW2 vs empty vector).
For the column information, refer to the description for the "EV--RECODE-PG_DESeq2out-M-freqDiff.tsv" file.
File: EV--RECODE-WW2-PPRm1_DESeq2out-M-freqDiff.tsv
Description: Summary of in vitro differential gene expression analysis (RECODE-WW2-PPRm1 vs empty vector).
For the column information, refer to the description for the "EV--RECODE-PG_DESeq2out-M-freqDiff.tsv" file.
File: EV--RECODE-WW2-PPRm2_DESeq2out-M-freqDiff.tsv
Description: Summary of in vitro differential gene expression analysis (RECODE-WW2-PPRm2 vs empty vector).
For the column information, refer to the description for the "EV--RECODE-PG_DESeq2out-M-freqDiff.tsv" file.
File: EV--RESCUE_DESeq2out-M-freqDiff.tsv
Description: Summary of in vitro differential gene expression analysis (RESCUE-S vs empty vector).
For the column information, refer to the description for the "EV--RECODE-PG_DESeq2out-M-freqDiff.tsv" file.
Gel and blots
File: Souce-data-images.pdf
Description: Uncropped and unedited images corresponding to the protein gels, blots, and REMSA included in the manuscript.
Information on the samples in each lane and the corresponding figure reference is provided for each gel and blot.
Prediction models
File: REPG_S2CTD_CTN6nt_seed_85607_sample_0.cif
Description: Structural prediction model of the C-terminal domain (S2-E1-E2-DYW) of RECODE-PG in complex with CTNNB1-T41I target RNA obtained with Protenix.
File: REWW_S2CTD_CTN6nt_seed_85607_sample_0.cif
Description: Structural prediction model of the C-terminal domain (S2-E1-E2-DYW) of RECODE-WW in complex with CTNNB1-T41I target RNA obtained with Protenix.
Values - figures
File: values-figures.xlsx
Description: Values used to generate the graphs. Each sheet corresponds to a figure.
Plasmids
File: plasmid-list.xlsx
Description: Excel file listing the plasmids used in the study.
The plasmid maps are provided in the compressed folder plasmid-maps.zip.
File: plasmid-maps.zip
Description: Compressed folder containing the plasmid maps in GenBank format for the plasmids listed in the plasmid-list.xlsx table.
Alignment of C-terminal domains and construction of a phylogenetic tree
To design the C-terminal domain, the non-DYW:KP proteins identified in hornwort, lycophyte, and fern transcriptomes (Ichinose et al., 2022) were aligned using MAFFT in L-INS-i mode (v7.407) (Katoh and Standley, 2013). Alignments were then trimmed using TrimAL (v1.4.rev15) (Capella-Gutiérrez et al., 2009) with a minimum conservation threshold and a gap threshold of 20%. A tree was built for the trimmed alignment using IQ-TREE (v.2.0.3) (Minh et al., 2020) with the JTT+F+I+G4 substitution model, selected using the in-built automated test (Kalyaanamoorthy et al., 2017). The branch support was computed with 1000 bootstrap replicates. The tree was visualized using iTOL (Letunic and Bork, 2016).
Transfection
For the design of the PG and WW C-terminal domains and the analysis of several PPR code combinations (at the exogenous rpoA editing site), HEK293T cells were plated in 24-well plates (Thermo Fisher Scientific) at a density of approximately 8.0 × 10^4 cells/well 24 hrs prior to transfection. A mixture containing 500 ng of transfection plasmid, Opti-MEM® I Reduced Serum Medium (Thermo Fisher Scientific), and 1.5 µL of FuGENE® HD Transfection Reagent (Promega) was prepared in a total volume of 25 µL per well. This mixture was incubated at room temperature for 10 minutes before being added to the cells. After transfection, the cells were incubated at 37°C for 24 hrs.
For the development of RECODE proteins (at the endogenous CTNNB1-T41I site), HEK293T cells were plated in 24-well plates (Thermo Fisher Scientific) at a density of approximately 1.0 × 10^5 cells/well 24 hrs prior to transfection. A mixture containing 700 ng of RECODE plasmid, Opti-MEM® I Reduced Serum Medium (Thermo Fisher Scientific), and 2.5 µL of FuGENE® HD Transfection Reagent (Promega) was added to the cells. After transfection, the cells were incubated at 37°C for 48 hrs.
For the comparison between RECODE and RESCUE-S, HEK293T cells were plated in 96-well plates coated with poly-L-lysine (IWAKI) at approximately 10^4 cells/well 24 hrs prior to transfection. 140 ng RECODE plasmid or 90 ng Cas-ADARRESCUE-S / 180 ng gRNA plasmids were transfected with 0.5 µL of FuGENE® HD Transfection Reagent (Promega). The amounts of plasmid DNA for RECODE and RESCUE-S are based on Supplementary Figure S2. After transfection, the cells were incubated at 37°C for 48 hrs.
For western blotting and RNA-seq analysis, 700 ng RECODE plasmid or 450 ng Cas-ADARRESCUE-S / 900 ng gRNA plasmids were transfected with 2.5 µL of FuGENE® HD Transfection Reagent (Promega) into cells plated in 24-well plates and incubated at 37°C for 48 hrs.
Hepa1-6 cells were seeded in 24-well plates at approximately 1.0 × 10^5 cells/well the day before transfection. 500 ng RECODE plasmid was transfected with Lipofectamine 3000 (Thermo Fisher Scientific), and the cells were incubated for 24 hrs at 37°C.
Analysis of in vitro RNA editing by Sanger sequencing
Total RNA was extracted from transfected cells according to the manufacturer’s instructions using the Maxwell® RSC simplyRNA Tissue Kit (Promega). cDNA was synthesized with 300 ng of RNA, ReverTra Ace® (TOYOBO), and random primers. The editing site was PCR-amplified using PrimeSTAR Max DNA® polymerase (TaKaRa) with 1 µL of cDNA and primers. PCR products were purified with ExoSAP-IT™ Express PCR Cleanup Reagent (Thermo Fisher Scientific) and sequenced (Azenta) using forward or reverse primers specific to the gene. RNA editing was quantified from raw sequencing chromatograms using EditR (EditR software: http://baseeditr.com) (Kluesner et al., 2018). Editing efficiency was determined as the ratio of thymidine (T) to the sum of T and cytidine (C), based on the percentage of their respective peak areas after trimming low-quality data (P-value cutoff: 0.01). Areas not significantly different from background noise were assigned a value of 0.
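The efficiency calculation described above can be sketched as follows. This is an illustrative reading of the text, not the EditR implementation: the function name, argument names, and example peak areas are assumptions, and the significance flags stand in for the Y/N values reported by EditR.

```python
def editing_efficiency(t_area: float, c_area: float,
                       t_significant: bool = True,
                       c_significant: bool = True) -> float:
    """Percent editing = 100 * T / (T + C), with non-significant areas set to 0."""
    t = t_area if t_significant else 0.0  # peak not above background -> 0
    c = c_area if c_significant else 0.0
    total = t + c
    if total <= 0:
        raise ValueError("neither peak is above background; efficiency undefined")
    return 100.0 * t / total


# Hypothetical peak areas, for illustration only.
print(editing_efficiency(30, 70))                       # both peaks significant
print(editing_efficiency(30, 70, t_significant=False))  # T peak is background
```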
In vivo study
The study using wild-type mice was outsourced to SHIN NIPPON BIOMEDICAL LABORATORIES, LTD., which approved the study protocol. The breeding and use of animals was carried out in accordance with the animal experiment regulations of the outsourcing company. The mice were kept under the non-SPF condition and were maintained on a 12-hour light/dark cycle at the controlled room temperature of 21.2–23.0°C with humidity of 40–57% during experiments. The FVB/NJcl strain mice used in this study were purchased from CLEA Japan Inc.
AAV9-packaged RECODE constructs (RECODE-WW[F75F] and RECODE-PG[I83I]) were prepared by VectorBuilder. The study included three groups of three 8-week-old male WT mice (FVB/NJcl): a control substance group (PBS group) and two test substance groups (AAV9-RECODE-WW[F75F] and AAV9-RECODE-PG[I83I]). Mice in each group received a single intravenous dose of the test substance at 3 × 10^14 vg/kg, or an equal volume of PBS as the control substance. Eight weeks after administration, the mice were euthanized by exsanguination under isoflurane inhalation anaesthesia (2.0–4.0%; Isoflurane Inhalation Anesthetic Solution “VTRS,” Mylan EPD LLC), and various tissues were collected and flash-frozen in liquid nitrogen.
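To make the per-kilogram dose concrete, the sketch below converts it into vector genomes per animal. The 25 g body mass is a hypothetical value chosen only for the arithmetic; individual body weights are not given in this README.

```python
DOSE_VG_PER_KG = 3e14  # dose stated above, in vector genomes per kg body weight


def vg_per_mouse(body_mass_g: float, dose_vg_per_kg: float = DOSE_VG_PER_KG) -> float:
    """Vector genomes administered to one animal of the given body mass (grams)."""
    return dose_vg_per_kg * body_mass_g / 1000.0  # grams -> kilograms


# Hypothetical 25 g mouse: 3e14 vg/kg * 0.025 kg = 7.5e12 vg.
print(f"{vg_per_mouse(25.0):.2e}")
```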
RNA extraction from in vivo samples
Pieces of organ (10 mg) were homogenized in 250 µL of 1-Thioglycerol/Homogenization solution (supplied with the Maxwell RSC simplyRNA Tissue Kit; Promega) using a BioMasher tube (Funakoshi). The homogenate was centrifuged at 14,000 rpm for 15 min at 4°C, and 200 µL of supernatant was collected. Lysis buffer (200 µL) was added before extracting total RNA using the Maxwell RSC simplyRNA Tissue Kit (Promega).
Analysis of transcriptome-wide expression changes
The NGS libraries were prepared from total RNA (as described in the ‘RNA extraction from in vivo samples’ section) using poly(A) enrichment, and Illumina NovaSeq 6000 sequencing was performed by Azenta. Reads were mapped using STAR (v2.7.10) (Dobin et al., 2013) to a customized reference sequence: GRCm39 (Ensembl) plus each AAV9-RECODE sequence for the in vivo study, and GRCh38.105 (Ensembl) plus the RECODE sequences for the in vitro study. Alignment parameters included: --quantMode TranscriptomeSAM --outFilterType BySJout --outFilterMultimapNmax 1 --outSAMstrandField intronMotif --outSAMattributes All. Estimated counts for all transcripts were generated using RSEM (v1.2.28) (Li and Dewey, 2011) following STAR alignment (as described in the ‘Read mapping’ section). Genes were then pre-filtered to include only those with a CPM (counts per million) > 0.5 in at least half of the samples. Differential gene expression analysis was subsequently performed using DESeq2 (v1.46.0) (Love et al., 2014) with default parameters. MA plots were generated by plotting the log2 fold change against the mean expression (baseMean).
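The CPM pre-filter can be sketched as below. It is a simplified illustration: library sizes are taken as the column sums of the count table passed in, which is an assumption, since the exact library-size definition used before DESeq2 is not stated here. The gene names and counts are invented.

```python
def cpm_filter(counts: dict[str, list[float]], min_cpm: float = 0.5,
               min_fraction: float = 0.5) -> list[str]:
    """Keep genes with CPM > min_cpm in at least min_fraction of the samples."""
    n_samples = len(next(iter(counts.values())))
    # Library size per sample = sum of counts over all genes (assumption).
    library_sizes = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    kept = []
    for gene, row in counts.items():
        passing = sum(
            1 for count, size in zip(row, library_sizes)
            if size > 0 and count * 1e6 / size > min_cpm
        )
        if passing >= min_fraction * n_samples:
            kept.append(gene)
    return kept


# Toy count table with four samples (hypothetical values).
toy_counts = {
    "geneA": [999_990, 999_990, 999_990, 999_990],
    "geneB": [10, 10, 0, 0],   # above threshold in 2 of 4 samples -> kept
    "geneC": [0, 0, 10, 0],    # above threshold in 1 of 4 samples -> dropped
}
print(cpm_filter(toy_counts))
```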
Off-target analysis by RNA-seq
PCR duplicates were removed from the alignment generated by STAR (as described in the ‘Read mapping’ section) using the Picard MarkDuplicates tool (http://broadinstitute.github.io/picard/). RNA editing candidate sites were identified using REDItools (v1.3) (Picardi and Pesole, 2013) with the following parameters: -t 17 -e -d -l -U [AG, TC, CT, GA] -G path/to/gtf -p -u -m 30 -T 6-0 -W -v 10 -n 0 -g 2 -s 1, removing the sites with a read count of less than 10 and a minimum mapping quality score of less than 30. Base frequencies were counted at all positions in the transcript sequences. Positions with significant differences in base frequency compared to the reference were identified using Fisher’s exact test with Benjamini-Hochberg correction (p-value < 0.01).
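The Benjamini-Hochberg step named above can be sketched in plain Python as follows. This illustrates the standard procedure and is not the code used in the study. The input p-values are invented, and the Fisher's exact test that would produce them is not shown.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four candidate sites.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.01 for p in adj]  # cutoff quoted in the text above
print([round(p, 4) for p in adj], significant)
```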
Protein analysis for in vitro samples
Total protein was extracted with RIPA buffer (ATTO), then adjusted to 1 mg/mL. Western blotting was performed with the Abby Simple Western system (ProteinSimple) according to the manufacturer’s instructions using the 12-230 kDa Separation Module, RePlex Module, and the Anti-Mouse or Anti-Rabbit Detection Module (ProteinSimple). Primary antibodies against β-catenin (sc-59737, Santa Cruz Biotechnology; 1:250), KRAS (12063-1-AP, Proteintech; 1:200) and BRG1/SMARCA4 (21634-1-AP, Proteintech; 1:100) were used. Antibodies were diluted with Can Get Signal Solution 1 (TOYOBO). Total protein was detected using the Total Protein labeling reagent (ProteinSimple). The peak area values of target proteins were normalized to the total protein values using Compass software (ProteinSimple).
Protein analysis for in vivo samples
Total proteins were extracted from 10 mg of a piece of organ with 120 µL RIPA buffer (ATTO) supplemented with 1% Protease Inhibitor Cocktail. After homogenization in a BioMasher tube (Funakoshi) and centrifugation at 14,000 rpm for 15 min at 4°C, 80 µL supernatant and 20 µL 5× SDS buffer were mixed and heated at 95°C for 5 min. Total protein and standard protein were quantified using the Protein Quantification Assay Kit (Thermo Fisher Scientific) with 0.05 g/ml Ionic Detergent Compatibility Reagent. The protein concentration of the sample was adjusted to 3 mg/mL with 1× SDS buffer, and protein samples were analysed using the Abby Simple Western system (ProteinSimple). Anti-HA monoclonal antibody (clone C29F4, Cell Signaling Technology; 1:500) and the Anti-Rabbit Detection Module (ProteinSimple) were used to detect RECODE protein. Anti-GAPDH monoclonal antibody (clone 4F8-2, Thermo Fisher Scientific; 1:2000) and the Anti-Mouse Detection Module (ProteinSimple) were used to detect Gapdh protein. Antibodies were diluted with Can Get Signal Solution 1 (TOYOBO). Total protein was detected with Total Protein labeling reagent (ProteinSimple) after detection of Gapdh protein on the same blot. These total protein values were then used to normalize both Gapdh and RECODE proteins, the latter of which were run on a separate blot using the same samples. The signal intensity (area) of the bands corresponding to RECODE or Gapdh protein was quantified and normalized to total protein using Compass software (ProteinSimple).
Protein expression and purification
The DNA sequences encoding His6-SUMO tag (synthesized by Eurofins Genomics) and RECODE’s C-terminal domain were amplified by PCR using primers with Esp3I restriction site. The resulting DNA fragments were subcloned into a modified pET21b+PA vector by Golden Gate assembly with Esp3I. The PPR domain was subsequently cloned using BpiI (as described in ‘Design of RECODE’). The His6-SUMO-RECODE fusion protein was expressed in E. coli Rosetta 2 (DE3) at 15ºC overnight following induction with 0.1 mM IPTG and 0.4 mM ZnSO4. The cells were harvested by centrifugation, and the pellets were resuspended in lysis buffer (50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 20 mM imidazole, and 1 mM dithiothreitol) supplemented with 5 mg/ml lysozyme, then stored at -80ºC until use. The cells were disrupted by sonication followed by centrifugation to remove cell debris. The soluble fraction was applied to Ni-NTA beads (Qiagen) and thoroughly washed with lysis buffer. The SUMO fusion protein was eluted with lysis buffer containing 500 mM imidazole and then cleaved overnight with 5 µg of Ulp1 protease and dialyzed against buffer (50 mM Tris-HCl, pH 8.0, 200 mM NaCl, and 1 mM dithiothreitol). The cleaved protein was subsequently loaded onto a HiTrap Q column (Cytiva), and peak fractions containing the target protein were pooled. Further purification was performed using a Superdex 200 10/300 GL column (Cytiva), equilibrated with 50 mM Tris-HCl, pH 8.0, 200 mM NaCl, and 0.5 mM TCEP. The purified protein was concentrated using an Amicon Ultra centrifugal filter unit (Merck).
RNA electrophoretic mobility shift assay (REMSA)
Cy3- or Cy5-labeled synthetic 31-nt RNA oligonucleotides, CTNNB1-T41I (5’-Cy3-UCUGGAAUCCAUUCUGGUGCCACUACCACAG, the editing site is underlined) and SMARCA4-P88L (5’-Cy5-CAUGAGAAGGGCAUGUCGGACGACCCGCGCU, negative control), were used for REMSA. RNA probes (10 nmol) and purified protein at the indicated concentrations (0, 0.5, 1, 2, 5, 10, 20, 50, 100, and 200 nM) were mixed in 10 µL reaction buffer (20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.5% NP40, 1 mM EDTA, pH 8.0 and 1 mM dithiothreitol) to a final reaction volume of 20 µL. The reactions were incubated at 25ºC for 20 min, after which 2 µL of 80% glycerol was added. A 10-µL aliquot of each reaction was loaded onto a 5-20% e-PAGEL (ATTO) and electrophoresed in 1 × TBE buffer at 4ºC. Gels were imaged using an iBright Imaging System (Invitrogen), and the fraction of oligonucleotide bound was quantified with Image Lab software (Bio-Rad).
Prediction of the E1-E2-DYW domain in complex with RNA
The structure of the complex between the E1-E2-DYW domain of RECODE-PG and RECODE-WW targeting the CTNNB1-T41I site and its target (5’-CUACC-3’, with the editing site at the second cytosine) was predicted using Protenix (v0.6.0) on the server (https://protenix-server.com/login) with default parameters (ByteDance AML AI4Science Team et al., 2025). The resulting models were then visualized using PyMOL (https://github.com/schrodinger/pymol-open-source).
