Data from: Creation of de novo cryptic splicing for ALS/FTD precision medicine
Data files
Sep 16, 2024 version files 6.59 GB
- 96_well_and_rupert.tar.gz (2.23 GB)
- all_data_just_relevant.zip (1.66 GB)
- bsbs.tar.gz (1.65 GB)
- C9D_R1_001.fastq.gz (14.63 MB)
- C9D_R2_001.fastq.gz (13.98 MB)
- C9N_R1_001.fastq.gz (11.88 MB)
- C9N_R2_001.fastq.gz (10.27 MB)
- fastas.tar.gz (8.25 KB)
- gluc_splicing.tar.gz (338.66 MB)
- growthcomp_dream3.tar.gz (69.99 MB)
- new_triple_cryptic_cre.tar.gz (95.64 MB)
- pemax.tar.gz (388.44 MB)
- r3_and_cp1.tar.gz (34.26 MB)
- README.md (14.94 KB)
- scripts.zip (34.58 KB)
- spinal_cord.tar.gz (44.61 MB)
- triple_cryptic_cre.tar.gz (27.30 MB)
Abstract
A system enabling the expression of therapeutic proteins specifically in diseased cells would be transformative, providing greatly increased safety and the possibility of pre-emptive treatment. Here we describe “TDP-REG”, a precision medicine approach primarily for amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), which exploits the cryptic splicing events that occur in cells with TDP-43 loss-of-function (TDP-LOF) in order to drive expression specifically in diseased cells. In addition to modifying existing cryptic exons for this purpose, we develop a deep-learning-powered algorithm for generating customisable cryptic splicing events, which can be embedded within virtually any coding sequence. By placing part of a coding sequence within a novel cryptic exon, we tightly couple protein expression to TDP-LOF. Protein expression is activated by TDP-LOF in vitro and in vivo, including TDP-LOF induced by cytoplasmic TDP-43 aggregation. In addition to generating a variety of fluorescent and luminescent reporters, we use this system to perform TDP-LOF-dependent genomic prime editing to ablate the UNC13A cryptic donor splice site. Furthermore, we design a panel of tightly gated, autoregulating vectors encoding a TDP-43/Raver1 fusion protein, which rescue key pathological cryptic splicing events. In summary, we combine deep-learning and rational design to create sophisticated splicing sensors, resulting in a platform that provides far safer therapeutics for neurodegeneration, potentially even enabling preemptive treatment of at-risk individuals.
Methods
Data visualization
All data visualization was performed using R. A markdown script containing code for generating all figures is available at 10.5281/zenodo.11576269.
Cell culture
SK-N-BE(2) and SH-SY5Y cells were grown in DMEM/F12 (Thermo Fisher Scientific) with 10% FBS (Gibco; Thermo Fisher Scientific). HEK293T cells were grown in DMEM Glutamax (Thermo Fisher Scientific) with 10% FBS (Gibco; Thermo Fisher Scientific).
Transfections were performed with Lipofectamine 3000 (Thermo Fisher Scientific), using 20 μl of Lipofectamine and 20 μl of P3000 reagent per microgram of DNA diluted in Opti-MEM (Thermo Fisher Scientific), following the manufacturer's protocol.
A clonal SK-N-BE(2) line expressing a doxycycline-inducible shRNA against TDP-43 was generated by transducing cells with SmartVector lentivirus (V3IHSHEG_6494503), followed by selection with puromycin (1 μg ml−1) for one week.
Polyclonal piggyBac lines were generated by co-transfecting the relevant piggyBac vector (backbone from Addgene plasmid #175271) with a vector expressing hyperactive piggyBac transposase (1). A 3:1 ratio of transposase vector to expression vector was used. Selection was performed for at least two weeks in 10 μg/ml blasticidin; a control transfection without the transposase expression vector was performed in parallel, to ensure total cell death of transiently transfected cells after selection.
Western blotting
Adherent cells were washed with PBS, then lysed in RIPA buffer (25 mM Tris-HCl buffer pH 7.5, 150 mM NaCl, 1% NP-40, 1% sodium deoxycholate, 0.1% sodium dodecyl sulphate). DNA was sheared via sonication using a Bioruptor Pico device. Samples were loaded onto NuPAGE 4-12% Bis-Tris gels (Thermo Fisher Scientific) and transferred to a methanol-activated PVDF membrane using a Mini Trans-Blot Cell (BioRad). Membranes were blocked in 5% fat-free powdered milk in PBS-T buffer (0.2% Tween-20). Primary and secondary incubations were 90 min at room temperature, or overnight at 4°C. Chemiluminescence signal was detected by adding HRP (horseradish peroxidase) substrate (Cytiva, RPN2109). All antibodies used are listed in Supplementary Table S1.
For His-tag pulldown, the Dynabeads His-Tag Isolation and Pulldown kit (Thermo Fisher Scientific, 10103D) was used as per the manufacturer’s instructions. Cells were lysed using lysis buffer (50 mM sodium phosphate pH 8.0, 1% Triton X-100, 50 mM NaCl in distilled water with cOmplete EDTA-free protease inhibitor (Roche)), then combined with an equal volume 2X Binding/Wash Buffer (50 mM sodium phosphate pH 8.0, 600 mM NaCl and 0.02% Tween-20 in distilled water). Following this, the solution was mixed with Dynabeads magnetic beads and incubated at room temperature for 5 min. After placing the sample on a magnet for 2 min, the supernatant was aspirated and discarded. Subsequently, four washes were performed using 1X Binding/Wash Buffer (diluted to 1x with distilled water), ensuring each wash included thorough resuspension of the beads. Finally, the protein was eluted with His-Elution Buffer (300 mM imidazole, 50 mM sodium phosphate pH 8.0, 300 mM NaCl and 0.01% Tween-20 in distilled water). The eluted sample was mixed with NuPage LDS Sample buffer (4x) (Thermo Fisher Scientific) and the western blot was performed as described above.
Quantifications of STMN2 were performed using the ImageJ/Fiji gel analysis tool. Quantifications from three polyclonal lines for each construct were used. The ratio of STMN2 to tubulin was first calculated for each lane (‘normalized STMN2’). The ratio of normalized STMN2 (the ratio of ratios) between the untreated sample and the TDP-43 knockdown sample for each line was then calculated.
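The ratio-of-ratios calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used for the study: the function names and the example densitometry values are hypothetical, and placing the knockdown sample in the numerator is an assumption, since the text does not state the direction of the final ratio.

```python
def normalized_stmn2(stmn2_band, tubulin_band):
    """Return STMN2 band intensity normalized to the tubulin loading control."""
    if tubulin_band <= 0:
        raise ValueError("tubulin intensity must be positive")
    return stmn2_band / tubulin_band


def ratio_of_ratios(untreated, knockdown):
    """Compare normalized STMN2 between conditions for one polyclonal line.

    Each argument is a (stmn2_band, tubulin_band) pair of densitometry values.
    Assumption: the knockdown sample is the numerator.
    """
    return normalized_stmn2(*knockdown) / normalized_stmn2(*untreated)


# Hypothetical values (arbitrary units) for three polyclonal lines.
polyclonal_lines = [
    {"untreated": (1000.0, 2000.0), "knockdown": (250.0, 2000.0)},
    {"untreated": (900.0, 1800.0), "knockdown": (450.0, 1800.0)},
    {"untreated": (1200.0, 2400.0), "knockdown": (600.0, 1200.0)},
]
ratios = [
    ratio_of_ratios(line["untreated"], line["knockdown"])
    for line in polyclonal_lines
]
```

Normalizing to tubulin within each lane before comparing conditions means that differences in loading between the untreated and knockdown lanes cancel out.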
Cloning
dsDNA fragments were ordered from IDT as GBlocks or EBlocks. PCRs were performed using high-fidelity DNA polymerases (Phusion HF 2x Master Mix or Q5 2x Master Mix; NEB). Plasmid backbones were linearised either via inverse PCR or restriction enzyme digestion. Gibson assembly was performed with 2x HiFi Assembly Master Mix (NEB). Transformation of DNA was performed with Stbl3 bacteria (ThermoFisher Scientific); for the transformation of Gibson assembly products, the reaction mixtures were first purified using SPRI beads to avoid toxicity (Mag-bind TotalPure NGS; Omega-Bio-tek). Ligations were performed using T4 DNA ligase (NEB), following phosphorylation with T4 PNK kinase (NEB). All PCR products that used a plasmid as a template were treated with DpnI (NEB) before downstream steps. All relevant sequences were confirmed either by Sanger sequencing (Source Bioscience or Genewiz) or Nanopore sequencing (Plasmidsaurus or Full Circle). The full plasmid sequences of all plasmids generated in this study are available in Supplementary Table S4.
pPB-EF1a-MegaGate-DD-Blast, which was used as the backbone for piggyBac vectors, was a gift from George Church (Addgene plasmid # 175271 ; http://n2t.net/addgene:175271 ; RRID:Addgene_175271) (2). pCMV-PEmax, which was used as the basis for CE-containing Prime Editing vectors, was a gift from David Liu (Addgene plasmid # 174820 ; http://n2t.net/addgene:174820 ; RRID:Addgene_174820) (3); a Tri-Flag-Tagged version was generated via Gibson assembly. pU6-tevopreq1-GG-acceptor, which was used for cloning pegRNAs, was a gift from David Liu (Addgene plasmid # 174038 ; http://n2t.net/addgene:174038 ; RRID:Addgene_174038) (4). A modified version of the 12QN plasmid was generated via Gibson assembly with the same amino acid sequence as published (5).
For the AARS1-based plasmids (TDP-REGv1), the cryptic exon sequence, plus short flanking sequences near each of the four relevant splice sites, including the TG-repeat region expected to confer TDP-43-mediated regulation, were fused to create a shortened minigene (hg38 chromosome 16 coordinates: 70276506-70276425, 70272982-70272716 and 70272121-70271940, all on the reverse genomic strand). An ‘AA’ dinucleotide was added within the UG-repeat to enable gene synthesis. An extra adenosine was added to the cryptic exon to enable frame-shifting and several point mutations were added to reduce the creation of unintended splice sites, as predicted by SpliceAI.
Cell imaging and quantification
Cells were prepared as described above. 96 well plates were seeded with 10,000 cells per well, then transfected as described above using 100 ng of DNA per well. After 52 hours, cells were imaged using an Incucyte microscope (Sartorius). Images were then analyzed using CellProfiler (6): briefly, red objects were identified using an adaptive threshold (“Robust Background” method), then the total intensity of red signal within these objects was calculated. The mean and standard deviation of the four images for each well, followed by the means of each condition (each construct +/- doxycycline) were calculated, and the ratio +/- doxycycline was calculated.
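The summarization steps above (per-well statistics over four images, per-condition means, then the +/- doxycycline ratio) might look like the following sketch. The function names and input layout are hypothetical, and the values would come from the CellProfiler output rather than being typed in by hand.

```python
from statistics import mean, stdev


def well_summary(image_intensities):
    """Mean and standard deviation of total red intensity across one well's images."""
    return mean(image_intensities), stdev(image_intensities)


def condition_mean(wells):
    """Mean of the per-well means for one construct in one condition."""
    return mean([mean(images) for images in wells])


def dox_ratio(wells_plus_dox, wells_minus_dox):
    """Ratio of mean red signal with doxycycline to mean red signal without."""
    return condition_mean(wells_plus_dox) / condition_mean(wells_minus_dox)


# Hypothetical input: each inner list holds the four image intensities of one well.
plus_dox = [[8.0, 8.0, 8.0, 8.0], [12.0, 12.0, 12.0, 12.0]]
minus_dox = [[1.0, 1.0, 1.0, 1.0], [3.0, 3.0, 3.0, 3.0]]
fold_change = dox_ratio(plus_dox, minus_dox)
```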
Quantification of cryptic AARS1 expression in published RNA-sequencing
Publicly available cell line data were aligned using the pipeline described in (7): briefly, samples were aligned to the GRCh38 genome build using STAR (v2.7.0f) (8) with gene models from GENCODE v31 (9). NYGC ALS consortium RNA-seq data were processed and categorized according to TDP-43 proteinopathy as previously described (7, 10, 11). Counts for specific junctions were tallied by parsing the STAR splice junction output tables using bedtools (12). The splice junction parsing pipeline is implemented in Snakemake version 5.5.4 and is available at: https://github.com/frattalab/bedops_parse_star_junctions. To quantify the PSI of the AARS1 cryptic exon, we extracted from these tables the counts of spliced reads mapping to the following coordinates:
chr16 70272882 70276486 AARS1_novel_acceptor -
chr16 70271972 70272796 AARS1_novel_donor -
chr16 70271972 70276486 AARS1_annotated -
We calculated the percent spliced in (PSI) as:
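PSI = inclusion reads / (inclusion reads + exclusion reads) x 100, where inclusion reads span the AARS1_novel_acceptor and AARS1_novel_donor junctions and exclusion reads span the AARS1_annotated junction.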
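As an illustration of a PSI calculation from the three junction counts listed above, the sketch below averages the two cryptic junction counts to estimate inclusion and treats the annotated junction as exclusion. This weighting is an assumption made for the example (it is one common convention); it is not taken from the source, and the function name is hypothetical.

```python
def percent_spliced_in(novel_acceptor, novel_donor, annotated):
    """Percent spliced in (0-100) for a cassette-type cryptic exon.

    Assumption: inclusion is estimated as the mean of the two cryptic junction
    counts, so that an included exon (two junctions) is weighted equally to a
    skipped exon (one junction).
    """
    inclusion = (novel_acceptor + novel_donor) / 2
    total = inclusion + annotated
    if total == 0:
        return None  # no informative reads at this locus
    return 100 * inclusion / total


# Hypothetical spliced-read counts for one sample.
counts = {"AARS1_novel_acceptor": 10, "AARS1_novel_donor": 30, "AARS1_annotated": 20}
psi = percent_spliced_in(
    counts["AARS1_novel_acceptor"],
    counts["AARS1_novel_donor"],
    counts["AARS1_annotated"],
)
```

Under this assumption the expression is algebraically the same as (acceptor + donor) / (acceptor + donor + 2 x annotated) x 100.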
SpliceNouveau algorithm
Briefly, the algorithm takes a number of parameters as input; as a minimum, the amino acid sequence to be encoded, and the type and position of the cryptic exon/cryptic splice site, are required. The algorithm then initializes a coding sequence for the given amino acid sequence (or uses an initial sequence provided by the user), and uses SpliceAI (13) to assess its predicted splicing behavior, which is compared to the desired, "ideal" set of splicing predictions. The ideal splicing predictions vary depending on the type of vector being designed (according to the user command supplied). In all cases, the constitutive splice sites would ideally have very high scores (close to 100%), whereas the cryptic splice sites could vary from low scores (1-10%) to very high scores (close to 100%) depending on the desired level of cryptic splicing. For a single intron vector with alternative splicing, it may be beneficial to ‘balance’ the relative strengths of the cryptic splice site and the competing splice site. To reduce the risk of off-target splicing, the ideal score at all other positions in the sequence is 0%. Mutated versions of the sequence are then created (mutations within the coding sequence are always synonymous) and their splicing predictions are calculated. The mutant sequences with the highest "fitness" are retained and used as the basis for subsequent rounds of in silico mutagenesis; fitness measures how closely a sequence's splicing predictions match the ideal predictions, and is calculated as the negative sum of absolute differences between the desired and predicted splicing scores at each position. As such, the process resembles a "directed evolution" approach but is performed in silico. For single intron designs (excluding intron retention), a competitor splice site is also generated, at a suitable position determined by the algorithm.
Additionally, for single introns with alternative 3' splice sites, the coding sequence upstream of the competitor can be automatically modified with synonymous codons to create a high density of pyrimidines, forming an alternative polypyrimidine tract within the coding sequence, to help generate the competitor splice site. To enable TDP-43-mediated repression of the presumed cryptic splice sites, high densities of UG dinucleotides were specified near the relevant splice sites in all cases. The full sequences of all resulting vectors are available in Supplementary Table S4.
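The fitness measure and selection step described above can be sketched as follows. This is a toy illustration, not the SpliceNouveau code: `toy_predict` is a hypothetical stand-in that merely flags "GT" dinucleotides and has nothing to do with SpliceAI, and the function names are invented for the example.

```python
def fitness(predicted, ideal):
    """Negative sum of absolute differences between predicted and ideal scores."""
    if len(predicted) != len(ideal):
        raise ValueError("score vectors must have the same length")
    return -sum(abs(p - i) for p, i in zip(predicted, ideal))


def select_fittest(candidates, predict, ideal, keep):
    """Return the `keep` candidate sequences whose predictions best match `ideal`."""
    ranked = sorted(
        candidates,
        key=lambda seq: fitness(predict(seq), ideal),
        reverse=True,  # fitness is at most 0; values closer to 0 are better
    )
    return ranked[:keep]


def toy_predict(seq):
    """Hypothetical placeholder predictor: score 1.0 wherever 'GT' starts."""
    return [1.0 if seq[i:i + 2] == "GT" else 0.0 for i in range(len(seq))]


# Ideal: a single donor-like site at index 2 and zero everywhere else.
ideal_scores = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
survivors = select_fittest(["AAGTAA", "GTGTAA"], toy_predict, ideal_scores, keep=1)
```

In a full design loop, the survivors would be mutated synonymously, re-scored, and selected again over many rounds.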
Screening of synonymous Cas9-encoding variants
A long ssDNA oligo containing degenerate bases at the relevant codon wobble positions was ordered from IDT as an Ultramer (“cas9_ultramer”). This oligo was then converted to dsDNA via PCR. The AARS1-based reporter was linearized via inverse PCR, deleting the region corresponding to the cas9_ultramer sequence. The two PCR products were combined with Gibson assembly. The Gibson assembly product was purified using SPRI beads and the whole region relevant to splicing (the candidate CE, its flanking introns, and their flanking exonic sequences) was then amplified via PCR. In parallel, pTwist-CMV was linearised using a primer containing a random barcode. The barcoded linearized vector and PCR-amplified library of candidate CE sequences were then combined via Gibson assembly (HiFi Assembly Master Mix; NEB). Following purification with SPRI beads, the mixture was transformed into Stbl3 bacteria (Thermo-Fisher Scientific), which, following recovery, was transferred directly to ampicillin-LB (without a plating step).
Cas9_ultramer: 5’-GTGTGTGTGTCACCCAGRCTNTCNCGNAARCTNATHAAYGGNATHCGNGAYAARCARTCNGGNAARACNATHCTNGAYTTYCTNAARTCNGAYGGNTTYGCNAAYCGNAAYTTYATGCARCTNATHCAYGAYGAYTCNCTNACNTTYAARGARGAYATHCARAARGCNCARGTATGCATCACCCCC-3’
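To illustrate why degenerate bases at wobble positions yield a library of synonymous variants, the sketch below expands IUPAC codes into concrete codons and checks which amino acids they encode. It covers only the four degenerate codes that appear in the oligo above (R, Y, H, N), uses the standard genetic code, and all names are hypothetical; it does not attempt to infer the reading frame of the oligo.

```python
from itertools import product

# IUPAC nucleotide codes, restricted to those used in the oligo.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "H": "ACT", "N": "ACGT",
}

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(codon):
    """List every concrete codon matched by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in codon))]


def encoded_amino_acids(codon):
    """Set of amino acids encoded across all expansions of a degenerate codon."""
    return {CODON_TABLE[c] for c in expand_codon(codon)}


def library_size(sequence):
    """Number of distinct sequences represented by a degenerate sequence."""
    size = 1
    for base in sequence:
        size *= len(IUPAC[base])
    return size
```

For example, `encoded_amino_acids("CGN")` contains only arginine, so every member of the library at that codon is synonymous, while `library_size` gives the theoretical diversity of a degenerate stretch.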
The library of plasmids was purified, then transfected into SK-N-BE(2) cells with or without doxycycline-inducible TDP-43 knockdown. RNA was purified then reverse transcribed using a reverse transcription primer featuring a UMI (unique molecular identifier); this was then amplified via PCR using primers with Illumina-compatible overhangs. The resulting PCR products were sequenced via Illumina sequencing. The reads were analyzed using custom Python and R scripts (code available at 10.5281/zenodo.11576269).
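UMIs allow PCR duplicates to be collapsed before isoforms are counted. The sketch below shows one simple way to do this, by counting distinct UMIs per plasmid barcode and splicing outcome. It assumes exact-match UMI collapsing with no error correction, and the record layout and names are hypothetical; the deposited scripts may differ.

```python
from collections import defaultdict


def count_unique_umis(reads):
    """Count distinct UMIs for each (barcode, isoform) pair.

    `reads` is an iterable of (barcode, isoform, umi) tuples, one per read.
    Assumption: reads sharing barcode, isoform and UMI are PCR duplicates.
    """
    seen = defaultdict(set)
    for barcode, isoform, umi in reads:
        seen[(barcode, isoform)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Hypothetical reads: the first two are duplicates of one original molecule.
example_reads = [
    ("BC01", "included", "ACGTAC"),
    ("BC01", "included", "ACGTAC"),
    ("BC01", "included", "TTGCAA"),
    ("BC01", "skipped", "GGATCC"),
]
molecule_counts = count_unique_umis(example_reads)
```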
Nanopore analysis
Targeted RT-PCR was performed using vector-specific primers. Barcoding was performed either using custom barcoded primers, or via barcode ligation with kit SQK-NBD114.24. Basecalling was performed using Guppy v6.0.1, with the relevant “SUP” (super-accuracy) model. Demultiplexing was performed using the demultiplex_nanopore_barcodes.py function from nano_tools v0.0.1. Alignment was performed using Minimap2 (v2.1) (14). Pileups were generated using the perform_enhanced_pileup.py function from nano_tools v0.0.1. Splicing analysis was performed by extracting splice junctions from reads using the extract_splice_junctions_from_bam.py function from nano_tools v0.0.1, followed by analysis with custom R scripts (available in 10.5281/zenodo.11576269).
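Splice junctions can be recovered from spliced alignments because the SAM CIGAR string marks skipped reference regions (introns) with the `N` operation. The sketch below shows the general idea in plain Python. It is not the nano_tools implementation, and the function name and coordinate convention (0-based, half-open) are choices made for this example.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference.
REFERENCE_CONSUMING = set("MDN=X")


def splice_junctions(ref_start, cigar):
    """Return (intron_start, intron_end) pairs for each N operation.

    `ref_start` is the 0-based leftmost reference position of the alignment;
    returned intervals are 0-based and half-open.
    """
    junctions = []
    position = ref_start
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "N":
            junctions.append((position, position + length))
        if op in REFERENCE_CONSUMING:
            position += length
    return junctions


# A read with 50 aligned bases, a 200-base intron, then 30 aligned bases.
example = splice_junctions(100, "50M200N30M")
```

Tallying these intervals across reads gives per-junction counts that can then be compared between conditions.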
Generation of AAVs
Vectors were generated using Gibson assembly and full nanopore sequencing (Plasmidsaurus) was used to validate the sequences; all vectors featured the human Synapsin promoter. rAAVs were produced by triple transfection of HEK-293T cells essentially as described in Challis et al. (15), with the following exceptions: 2-4 15 cm dishes were used; after the PEG precipitation, samples were resuspended in 4 ml, then 3 ml chloroform was added, mixed for 2 min by vortexing, and centrifuged at 3000 g for 20 min; and the aqueous layer was loaded on iodixanol gradients in a Type 70.1 rotor and centrifuged for 2 h at 52000 RPM. The rAAV sample was collected and buffer exchanged with 1x PBS 5% Sorbitol 0.1 M NaCl (0.25 M NaCl final). Addgene plasmid #103005 was used for AAV production (16). Titers were between 1.0E14 and 3.1E14 genome copies per ml.
Mice
All animal care and experimental procedures were performed in accordance with animal study proposal ASP23-003 approved by the National Institute of Child Health and Human Development Animal Care and Use Committee. TDP-43Fl/wt (Tardbptm1.1Pcw/J) mice obtained from Dr. Philip Wong at Johns Hopkins University (Jax Stock No. 017591) were crossed to homozygosity then crossed to the Chat-IRES-Cre::deltaNeo line (Chattm1(cre)Lowl/J; Jax Stock No. 031661), in which the neomycin resistance cassette was removed to avoid ectopic expression sometimes observed in the ChAT-IRES-Cre line. This produced male TDP-43Fl/wt;Chat-Cre+/+ breeders, which were crossed to female TDP-43Fl/Fl mice to generate both TDP-43Fl/Fl;Chat-Cre+/wt and TDP-43Fl/wt;Chat-Cre+/wt mice that were used in these experiments. The positive control mScarlet AAV was injected into mice with a Sun1-tag (TDP43 fl/fl;Sun1-GFP +/+), which were generated by crossing the previously described TDP43 fl/fl mice to CAG-Sun1/sfGFP mice (B6;129-Gt(ROSA)26Sortm5(CAG-Sun1/sfGFP)Nat/J; Stock No: 021039). The genetic background was C57Bl/6J for all animals. All animal ages are listed in Supplementary Table S2.
Intracerebroventricular Injections
A 10 μl Hamilton syringe (65460-06) with a 33G needle (65461-02) was loaded with up to 10 μl of undiluted virus and placed on a syringe pump (KD Scientific 78-0220). Postnatal day 0-2 pups were anesthetized on ice for approximately 1 minute. After anesthesia, pups were placed on a sterilized mobile surface and advanced such that the needle of the Hamilton syringe penetrated the left ventricle. 1 μl of virus was delivered at 1 μl/min into the left ventricle, approximately 1 mm lateral from the sagittal suture. The syringe was kept in place for approximately 30 seconds after the injection, before removal, to minimize backflow. Pups were placed on a heating pad to recover before being returned to their dam in the home cage. Injection details are in Supplementary Table S2.
Tissue Preparation and Sectioning
Mice (age 3-7 weeks) were anesthetized with I.P. injections of 2.5% avertin and transcardially perfused with 10 mL of 1X PBS followed by 10 mL of 4% paraformaldehyde. Spinal cords were dissected and placed in 4% paraformaldehyde overnight before being placed in a 30% sucrose in PBS cryopreservation solution. After 24 hrs in cryopreservation solution, lumbar spinal cords were embedded in O.C.T (Tissue-Tek) and frozen. Frozen blocks were sectioned into 16 μm-thick coronal slices onto positively charged slides using a Leica CM3050 S Research Cryostat. Slides were stored at -80ºC for up to 2 weeks.
Immunostaining of Tissue
Slides were removed from -80ºC and thawed to room temperature, then washed in 1X PBS before being placed in citrate buffer (pH 6.0) for antigen retrieval. The solution and slides were microwaved for 45 seconds (until light boil) and allowed to cool back to room temperature. Tissue was then permeabilized in 0.1% Triton-X100 in 1X PBS (PBSTx), then blocked in 5% normal donkey serum in 0.1% PBSTx. Primary antibodies were diluted in 0.5% normal donkey serum in 0.1% PBSTx and incubated overnight at 4°C. Slides were then washed in 0.1% PBSTx and incubated for 1 hr in secondary antibody (ThermoFisher) diluted in 0.1% PBSTx. After a final 1X PBS wash, slides were coverslipped with Prolong Diamond (ThermoFisher P36961) and dried overnight at room temperature in the dark before being stored at 4°C. Primary antibodies: Rat anti-TDP-43 (Biolegend #808301, 1:3000), Rabbit anti-RFP (Rockland #600-401-379, 1:100), Guinea Pig anti-VAChT (Synaptic Systems #139105, 1:500).
Imaging and Analysis
Up to 6 slides were loaded simultaneously onto the Olympus VS200 slide scanner for imaging. Slides were imaged at 20X with 5 z planes, 2 µm apart, with the following filter cubes: DAPI, FITC, TRITC, and Cy5. For motor neuron counts, maximum intensity projections were used. Briefly, ROIs were drawn around motor neurons based on VAChT and DAPI by an investigator blind to genotype. To determine RFP positive vs negative motor neurons, the brightness was adjusted so that no RFP was detectable in background spinal cord regions or in between motor neurons. Any motor neurons that still had detectable RFP were considered positive while those with background levels were considered negative.
Cytoplasmic aggregation
A plasmid encoding SNAP-TDP-43-12QN with an L207P mutation in the human TDP-43 sequence was generated; the sequence of the TDP-43 and 12xQN repeat was identical to a published study (18). For confocal imaging, TDP-REGv2 mScarlet reporter construct #7 was co-transfected with SNAP-TDP-43-12QN, at a 1:3 mass ratio of mScarlet reporter:SNAP/TDP plasmid. Cells were plated on Matrigel-coated Ibidi 8-well microscopy dishes, then transfected after 24 hours. Three hours prior to fixation, SNAP-Cell Oregon Green (NEB) prepared in DMSO (in accordance with the manufacturer recommendations) was diluted into growth media (1:2,000 v/v) and the original media was replaced with this media. 60 min prior to fixation, the SNAP-Cell Oregon Green media was replaced with normal growth media. HEK293T cells were fixed 28 hours after transfection, whereas SK-N-BE(2) cells were fixed 50 hours after transfection, to compensate for their lower protein expression levels; fixation was performed in 2% PFA (methanol-free) diluted in PBS for 20 min at room temperature. Cells were permeabilized with 0.1% Triton X100 diluted in PBS for 10 min at room temperature. Blocking was performed for one hour with 0.1% Tween 20 and 2% W/V bovine serum albumin (BSA) diluted in PBS. Primary antibody (TDP-43 antibody; Proteintech) was diluted 1:500 into PBS with 0.1% Tween 20 and cells were incubated overnight at 4°C. After three washes with PBS-Tween 20, secondary antibody (anti-Rabbit IgG conjugated to Alexa-fluor 647; Abcam) was diluted into PBS with 5% w/v BSA and 0.05% Tween 20, and incubation was performed for 60 min at room temperature. Finally, samples were washed and incubated with DAPI for nuclear staining. Imaging was performed on a Zeiss 880 Inverted Confocal using a 63x oil objective lens. To reduce bleed-through from the far-red channel into the red channel, a narrow wavelength cut-off was used for the native mScarlet signal of approximately 575-595 nm.
For imaging with the Incucyte, HEK293T cells were plated into a 96 well plate. The following day, they were transfected with 70 ng of SNAP-12QN-TDP-43, SNAP-TDP-43 (wild-type) or SNAP-only, 20 ng of TDP-REGv2:mScarlet reporter plasmid #7 and 10 ng of mGreenLantern plasmid (as a transfection control) (17, 18). Media was changed 24 hours after transfection. Imaging on an Incucyte S5 was performed 48 hours after transfection.
Cre recombinase with multiple cryptic exons
Vectors encoding Cre recombinase containing 1, 2 or 3 cryptic exons were created via Gibson assembly. Vectors were transfected into SK-N-BE(2) cells with/without doxycycline-inducible TDP-43 knockdown. 48 hours after transfection, RNA was extracted and targeted Nanopore sequencing was performed. Reads were analyzed with nano_tools (v0.0.1) and analysis plots were generated in R (code available at 10.5281/zenodo.11576269).
Testing TDP-REG specificity using shRNAs
shRNA sequences were designed by finding the top consensus sequences as predicted by two algorithms (19, 20). The selected shRNAs were cloned into a mir30E locus in a vector via Gibson assembly; each plasmid also encoded blasticidin resistance. SK-N-BE(2) cells were co-transfected with TDP-REGv2 mScarlet construct #10 (30 ng) and one shRNA plasmid (70 ng) using Lipofectamine 3000, following the manufacturer’s recommendations. 24 hours after transfection, media was replaced with fresh media containing 5 µg/ml blasticidin. Five days after transfection, cells were imaged using an Incucyte S3, using 4x magnification. Fluorescence levels were analyzed using R (code available at 10.5281/zenodo.11576269).
Prime editing
SpliceNouveau was used to design a cryptic exon within the pCMV-PEMax vector. The vector was transfected in SK-N-BE(2) cells with or without TDP-43 knockdown and the splicing of the construct was analyzed via RT-PCR. Primers are listed in Table S3.
The PrimeDesign web tool was used to design the pegRNA and nicking sgRNA (21).
The pegRNA used had sequence 5’-GTAAAAGCATGGATGGAGAGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATGGACTCACGCATCTCTCCATCCATGCCGCGGTTCTATCTAGTTACGCGTTAAACCAACTAGAATTTTTTT-3’
The sgRNA used had sequence 5’-GAAACACCGTGGGGATAAGAGTTCTTTCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTT-3’
pCMV-PEMax or a version containing a cryptic exon were transfected into SK-N-BE(2) cells with or without doxycycline-inducible TDP-43 knockdown. 600 ng of prime editing vector was used, in addition to 200 ng of pegRNA, 100 ng of nicking sgRNA and 100 ng of a plasmid expressing mScarlet and the blasticidin resistance gene. 24 hours after transfection, the media was changed and supplemented with 10 µg/ml blasticidin to select for transfected cells. After an additional 48 hours, the cells were subcultured, and after six days total the samples were harvested and gDNA was purified. gDNA was amplified via PCR and amplicons were analyzed by Nanopore sequencing. Pileups were generated using the perform_enhanced_pileup.py function from nano_tools v0.0.1. The fraction of reads containing the expected edit was calculated.
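The final step, calculating the fraction of reads that carry the expected edit, can be illustrated for the simplest case of a single edited position in a pileup. The function name and the per-base count layout are hypothetical and are not the output format of nano_tools; multi-base edits would need a per-read check instead.

```python
def edit_fraction(base_counts, edited_base):
    """Fraction of reads at one pileup position that carry the edited base.

    `base_counts` maps each observed base to its read count at that position.
    Returns None when the position has no coverage.
    """
    total = sum(base_counts.values())
    if total == 0:
        return None
    return base_counts.get(edited_base, 0) / total


# Hypothetical pileup column at the edit site: 75 unedited and 25 edited reads.
fraction_edited = edit_fraction({"A": 75, "G": 25}, "G")
```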
Luciferase analysis
A modified Gaussia luciferase amino acid sequence, in which the methionines are replaced to improve resistance to oxidation, was reverse-translated and optimized by SpliceNouveau, including internal cryptic exons. Vectors were transfected into SK-N-BE(2) cells with or without doxycycline-inducible TDP-43 knockdown. Luciferase activity was measured by extracting an aliquot of cell media and mixing with Pierce™ Gaussia Luciferase Glow Assay Kit (Thermo Fisher Scientific) following the manufacturer protocol. Splicing was analyzed via targeted Nanopore sequencing, using the approach described above.
Design of TDP-43/Raver1 expression vectors
The TDP-43/Raver1 protein sequence used for all experiments was:
MGPKKKRKVEDPGGPAAKRVKLDGGYPYDVPDYAGGMSEYIRVTEDENDEPIEIPSEDDGTVLLSTVTAQFPGACGLRYRNPVSQCMRGVRLVEGILHAPDAGWGNLVYVVNYPKDNKRKMDETDASSAVKVKRAVQKTSDLIVLGLPWKTTEQDLKEYFSTFGEVLMVQVKKDLKTGHSKGFGFVRFTEYETQVKVMSQRHMIDGRWCDCKLPNSKQSQDEPLRSRKVFVGRCTEDMTEDELREFFSQYGDVMDVFIPKPFRAFAFVTFADDQIAQSLCGEDLIIKGISVHISNAEPKHNSNLPPLLGPSGGDREPMGLGPPATQLTPPPAPVGLRGSNHRGLPKDSGPLPTPPGVSLLGEPPKDYRIPLNPYLNLHSLLPSSNLAGKETRGWGGSGRGRRPAEPPLPSPAVPGGGSGSNNGNKAFQMKSRLLSPIASNRLPPEPGLPDSYGFDYPTDVGPRRLFSHPREPTLGAHGPSRHKMSPPPSSFNEPRSGGGSGGPLSHF*
In the above sequence, the SV40 NLS is underlined, the c-Myc NLS is underlined and in italics, the HA-tag sequence is in bold, the TDP-43 sequence is in bold and underlined, and the C-terminal Raver1 sequence is in italics. The 2FL mutation changed the sequence HSKGFGF within RRM1 to HSKGLGL.
SpliceNouveau was used to design constructs with internal cryptic exons within the region encoding TDP-43 within the TDP-43/Raver1 sequence above. For initial screening of their cryptic exon properties, the designed sequences were cloned into a mammalian expression vector (pTwist-CMV) containing the 2FL mutation. These were then transiently transfected into SK-N-BE(2) cells with or without doxycycline-induced TDP-43 knockdown. RNA was extracted, reverse transcribed and RT-PCRs were performed to analyze inclusion of the synthetic TDP-43-encoding cryptic exons (primers listed in Table S3), which was aided by the use of a QIAxcel Advanced machine (QIAGEN).
Rescue of endogenous cryptic splicing with TDP-43/Raver1
The piggyBac system was used to make SK-N-BE(2) cell lines with constitutive EF1A promoters driving expression of constitutive (without a cryptic exon) or two cryptic exon-containing TDP-43/Raver1 constructs, or mScarlet; a PGK promoter drove expression of the blasticidin resistance gene (2). Note that these lines were made using the clonal line that featured the doxycycline-inducible TDP-43 shRNA, enabling doxycycline-inducible TDP-43 knockdown. Polyclonal lines were produced in triplicate. Each polyclonal line was then plated with or without doxycycline treatment (1000 ng/ml) for six days, then harvested for western blotting or RT-PCRs. Western blotting was performed as described above.
RT-PCRs for UNC13A, STMN2 and AARS1 were performed via reverse transcription with Superscript IV (Thermo Fisher Scientific) followed by PCR using either Q5 2x Master mix (NEB) or One-Taq Quickload 2x Master Mix (NEB). UNC13A and AARS1 PCRs used two primers, whereas STMN2 used two reverse primers because the STMN2 cryptic exon induces premature polyadenylation; primer sequences are listed in Table S3. PCR products were electrophoresed on a QIAxcel advanced system. Raw data was exported and analyzed in R using the QIAxcelR package (v0.1) (https://github.com/Delayed-Gitification/QIAxcelR; 10.5281/zenodo.11576269).
Growth competition assay
The piggyBac system was used to generate SK-N-BE(2) cells with stably-integrated, doxycycline-inducible expression of TDP-43/Raver1 fusion, either constitutive or with an internal cryptic exon, or mTag-BFP2 blue fluorescent protein (22). Cell lines were generated in triplicate. 100 ng of each vector, plus 400 ng of hyperactive piggyBac transposase, were used per transfection (per well of a 24-well plate). Note that in this case, the SK-N-BE(2) cells did not feature the doxycycline-inducible TDP-43 shRNA cassette.
Following selection of stable cells in 10 µg/ml blasticidin for 20 days, blasticidin was removed and the different cell lines were mixed, again in triplicate. Each mix was placed into three different wells, with 0, 30 or 1000 ng/ml of doxycycline. After three days, the cells were subpassaged. After a further seven days, the cells were lysed in 20 mM Tris-HCl pH 7.5, 0.5 mM EDTA, 1% Triton X-100 and 500 ng/μl proteinase K; samples were incubated at 55°C for 20 min, then 96°C for 5 min.
A multiplex PCR was performed using two pairs of primers: one pair which amplified the three TDP-43/Raver1 vectors, and another pair which amplified the BFP construct. PCR was performed with Q5 2x Master Mix (NEB), using 2.5 μl of lysate into a 25 μl PCR. Primers are listed in Table S3. The samples were then purified and barcoded Nanopore libraries were prepared using the Native Barcoding 24 kit with R10.4.1 chemistry. The library was sequenced with an R10.4.1 Flongle device and High Accuracy basecalling was used in real-time with Guppy. Reads were then aligned to a “reference genome” consisting of the four constructs with Minimap2 (v2.1) (14), and mapping statistics were calculated by analyzing the resulting bam files in R (scripts available at 10.5281/zenodo.11576269).
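Once reads are aligned to a reference made up of the four constructs, the relative abundance of each cell line in a mixed well can be estimated from the share of reads mapping to each construct. The sketch below shows that tally in Python (the study's own analysis was done in R); the construct names and the function name are hypothetical.

```python
from collections import Counter


def construct_fractions(reference_names):
    """Fraction of mapped reads assigned to each construct.

    `reference_names` holds one reference (construct) name per mapped read.
    """
    counts = Counter(reference_names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Hypothetical read assignments from one well.
assignments = ["BFP", "BFP", "cryptic_exon_construct", "constitutive_construct"]
fractions = construct_fractions(assignments)
```

Comparing these fractions across doxycycline doses indicates which lines are depleted from the mixture over time. Because two primer pairs are used, amplification efficiency may differ between the BFP and TDP-43/Raver1 amplicons, so changes across conditions are more informative than absolute fractions.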
iPSC cell culture and differentiation
For iPSC work, the WTC11 iPSC line (GM25256) harboring stable integration of doxycycline-inducible Tet-on neurogenin-2 (NGN2) and dCas9-BFP-KRAB cassettes at the AAVS1 and CLYBL safe-harbor loci, respectively, was used (23). iPSCs were modified to have a HaloTag for TARDBP, and to express TDP-43/Raver1 or mScarlet constructs, as described below.
iPSCs were maintained in E8 Flex Medium (Thermo) in Geltrex (Thermo)-coated plates and passaged with Versene (Thermo) or Accutase (Thermo) when 80% confluent. For induction of iPSCs to i3Neurons, iPSCs were passaged with Accutase (Thermo) and plated onto Geltrex-coated plates with induction media: DMEM/F12 with GlutaMAX (Thermo), 1x non-essential amino acids (NEAA, Thermo), 2 μg/ml doxycycline hyclate (Sigma), 2 μM XAV939 (Cayman Chemical), 10 μM SB431542 (Cayman Chemical), and 100 nM LDN-193189 (Cayman Chemical). Media was changed daily for three days. For RNA experiments, 12-well plates were coated with poly-D-lysine (10 μg/mL, Gibco) overnight, washed with sterile water, and subsequently coated with laminin overnight (10 μg/mL, Gibco). On the third day, 500k cells were plated per well of the 12-well plate in neuronal media supplemented with 1x RevitaCell (Thermo). Neuronal media consisted of BrainPhys (StemCell Technologies) with 1x N2Max supplement (R&D Systems), 1x N21Max supplement (R&D Systems), 10 ng/mL BDNF (Peprotech), 10 ng/mL GDNF (Peprotech), and 1 μg/mL Laminin (Thermo). 24 hours later, a full media change was performed to remove RevitaCell and to add 300 nM HaloPROTAC-E (University of Dundee) to conditions where TDP-43 knockdown was intended. Differentiated neurons were maintained in neuronal media, and twice-weekly half-media changes were performed. After 14 days, RNA was harvested.
HaloTag editing of TARDBP
iPSCs were electroporated with 10 μg HaloTag-TARDBP homology-directed repair template, 500 pmol Cas12 crRNA (GGAAAAGTAAAAGATGTCTGAAT, IDT) and 20 μg recombinant Cas12 (IDT) using the P3 Primary Cell 4D-Nucleofector kit (Lonza) and the 4D-Nucleofector X unit (Lonza) with the CA-137 program. After electroporation, iPSCs were plated in E8 Flex Medium (Thermo) supplemented with 1x RevitaCell (Thermo) and 1x Alt-R HDR Enhancer V2 (IDT) for 24 hours and then expanded in E8 Flex Medium. Cells were labeled with HaloTag-TMR ligand (Promega) and positive clones were selected for genotyping by PCR.
Generation of RAVER iPSC lines
One million cells were plated per well of a Geltrex-coated six-well plate in E8 Flex media containing 1x RevitaCell. Two hours later, media was changed to E8 Flex without RevitaCell, and iPSCs were transfected with 2.25 μg piggyBac plasmid for TDP-REGv2:TDP-43/Raver1 #6 or #9, or mScarlet control, along with 0.75 μg hyperactive piggyBac transposase (1), using Lipofectamine Stem reagent (Thermo Fisher Scientific). Blasticidin (Sigma) selection was started 48 hours post-transfection, using 6 μg/ml for 24 hours, then 8 μg/ml for 24 hours, and finally 10 μg/ml for two weeks.
UNC13A synapse quantifications
i3Neurons (500k) and rat astrocytes (50k) were co-cultured on PDL/laminin-coated 18 mm coverslips and fixed at D35 for 10 min in 4% PFA. Cells were permeabilized in 0.1% Triton for 10 min before incubation with primary antibodies in PBS at RT for 1 hr: 1:500 Munc13-1 (Synaptic Systems, 126-104) and 1:1000 synapsin (Synaptic Systems, 106-011). Cells were washed 3x in PBS before incubation with secondary antibodies in PBS at RT for 1 hr: mouse-polyclonal 488 (Alexa Fluor) and guinea-pig 647 (Alexa Fluor). Cells were washed 3x in PBS before mounting to glass slides using Mowiol (PolySciences) containing 1:1000 DAPI (ThermoFisher). Imaging was carried out on a Zeiss 980 Airyscan confocal microscope at 63x with a 3x crop. Images comprised 7 Z-stacks of 0.5 μm intervals. Images were analyzed using FIJI. ROIs were defined based on synapsin-positive puncta and used to measure the mean UNC13A intensity.
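The ROI-based measurement can be illustrated with a short Python sketch. The actual analysis was done in FIJI and the text does not state how puncta were segmented, so the fixed intensity threshold used here is purely an assumption. The function names and toy pixel values are hypothetical.

```python
def puncta_mask(synapsin, threshold):
    """Mark pixels whose synapsin intensity exceeds the threshold (assumed ROI rule)."""
    return [[px > threshold for px in row] for row in synapsin]

def mean_intensity_in_mask(image, mask):
    """Mean intensity of `image` over pixels where `mask` is True, or None if the mask is empty."""
    values = [
        px
        for img_row, mask_row in zip(image, mask)
        for px, keep in zip(img_row, mask_row)
        if keep
    ]
    if not values:
        return None
    return sum(values) / len(values)

# Toy 3x3 single-plane images (hypothetical intensities).
synapsin_channel = [
    [0, 10, 200],
    [0, 220, 5],
    [0, 0, 0],
]
unc13a_channel = [
    [1, 2, 30],
    [4, 50, 6],
    [7, 8, 9],
]

mask = puncta_mask(synapsin_channel, threshold=100)
print(mean_intensity_in_mask(unc13a_channel, mask))
```

Restricting the measurement to synapsin-positive pixels reports UNC13A signal at synapses rather than across the whole field of view.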
Testing correlation of SpliceNouveau optimization with vector performance
56 vectors were designed with four different levels of optimization (14 per optimization level), each using the same SpliceNouveau command, encoding mScarlet with a single intron and an alternative 3’ splice site design. We specified a strong initial donor splice site motif of ‘CAAGTAAG’, which is among the most common motifs at annotated human donor splice sites. The full command was: python3 SpliceNouveau.py --initial_cds ATGGCGAGAACAATGGTTGCTATGGTGTCCAAGGGTGAAGCAGTCATAAAGGAGTTTATGAGGTTCAAGGTGCACATGGAAGGGTCAATGAACGGACATGAGTTCGAAATTGAAGGGGAGGGCGAGGGCCGCCCCTATGAAGGGACACAAACTGCCAAGCTCAAGGTAACCAAGGGGGGACCCCTACCATTCTCATGGGACATTCTGTCCCCGCAATTCATGTATGGTTCTCGTGCATTCACAAAGCATCCTGCTGATATCCCAGACTACTACAAACAATCCTTTCCGGAGGGCTTTAAGTGGGAACGCGTCATGAATTTCGAGGACGGAGGCGCGGTGACGGTCACTCAAGATACCAGCCTAGAGGACGGCACGCTTATTTACAAAGTCAAGCTACGCGGAACGAACTTCCCTCCCGATGGGCCGGTCATGCAAAAGAAAACAATGGGGTGGGAGGCGTCGACCGAGCGCTTGTACCCCGAGGACGGAGTACTAAAGGGAGATATAAAGATGGCATTGCGCCTAAAAGACGGGGGACGATACCTGGCCGACTTCAAGACCACCTACAAGGCCAAGAAGCCCGTGCAGATGCCCGGCGCCTACAACGTGGACCGAAAGCTGGACATCACCAGCCACAACGAGGACTACACCGTGGTGGAGCAGTACGAGAGGAGCGAGGGCAGGCACAGCACCGGCGGCATGGACGAGCTGTACAAGGACTACAAGGACGATGATGACAAA --initial_intron1 GTaagNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTGTGTGTGTGTGTGTGTGTGAATGTGTGTGTGTGTGTGTGNcAG --ce_start 216 --ce_end 216 --five_utr CGGCCGCTTCTTGGTGCCAGCTTATCAtagcgctaccggtcgccacc --three_utr TAATAAACAAATGGTaagGAAGGGCACATCAATCTTTGCTTAATTGTCCTTTACTCTAAAGATGTATTTTATCATACTGAATGCTAAACTTGATATCTCCTTTTAGGTCATTGATGTCCTTCACCCCGGGAAGGCGACAGTGCCTAAGACAGAAATTCGGGAAAAACTAGCCAAAATGTACAAGACCACACCGGATGTCATCTTTGTATTTGGATTCAGAACTCAGTAAACTGGATCCGCAGGCCTCTGCTAGCTTGACTGACTGAGATACAGCGTACCTTCAGCTCACAGACATGATAAGATACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGGTGTGGGAGGTTTTTTAA --ignore_end 470 --aa generate_it --upstream_mut_chance 0.3 
--downstream_mut_chance 1 -a 30 --intron1_mut_chance 0.5 -n 3000 --cds_mut_start_trim 160 --cds_mut_end_trim 396 --overwrite --ce_score_weight 3 --early_stop 500 --target_const_donor 1 --target_const_acc 0.5 --intron1_mut_chance 0.5 --alt_3p --overwrite --alt_position in_exon --alt_3p_end_trim 471 --downstream_mut_n 1 --target_cryptic_acc 0.75 --min_alt_dist 40 --track_splice_scores
These were cloned into a backbone containing dual barcodes (in the 5’ and 3’ UTRs) featuring A, C or T bases in the forward strand (avoidance of G ensures no cryptic splice sites will be created within the barcode). A pool of ~500 plasmids was created, each with a unique barcode combination and a single vector design. These were transfected into SK-N-BE(2) cells with or without TDP-43 knockdown. Following RT-PCR, the RNA products from these cells were sequenced using an R10.4.1 MinION flowcell. Additionally, a sample of the plasmid pool was digested using PacI and sequenced in parallel. Following basecalling (Guppy, super-accuracy), the reads were aligned to a ‘genome’ containing all 56 plasmid designs, and barcode pairs were assigned to plasmids using the reads derived from the plasmid DNA. These barcode pairs were then used to assign the RNA-derived reads to plasmids. The mapped reads were analyzed using nano_tools and R (all code available in the R markdown; 10.5281/zenodo.11576269).
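The two-step barcode assignment described above can be sketched in Python as follows. This is not the authors' code, which is in the R markdown at the DOI given. It assumes barcodes have already been extracted from each read, and the majority-vote cutoff `min_fraction` is an assumed rule for discarding ambiguous barcode pairs. All names and toy data are hypothetical.

```python
from collections import Counter, defaultdict

def is_valid_barcode(barcode):
    """Barcodes use only A, C or T on the forward strand (no G)."""
    return len(barcode) > 0 and set(barcode) <= set("ACT")

def build_barcode_map(plasmid_reads, min_fraction=0.9):
    """Map each (5' barcode, 3' barcode) pair to a design using plasmid-derived reads.

    plasmid_reads: iterable of (barcode5, barcode3, design) tuples.
    Pairs whose top design has less than min_fraction of the votes are dropped.
    """
    votes = defaultdict(Counter)
    for bc5, bc3, design in plasmid_reads:
        if is_valid_barcode(bc5) and is_valid_barcode(bc3):
            votes[(bc5, bc3)][design] += 1
    mapping = {}
    for pair, counter in votes.items():
        design, n = counter.most_common(1)[0]
        if n / sum(counter.values()) >= min_fraction:
            mapping[pair] = design
    return mapping

def assign_rna_reads(rna_reads, mapping):
    """Count RNA-derived reads per design; unassignable reads are counted under None."""
    counts = Counter()
    for bc5, bc3 in rna_reads:
        counts[mapping.get((bc5, bc3))] += 1
    return counts

# Toy data (hypothetical barcodes and design names).
plasmid_reads = [
    ("ACT", "CCA", "design_01"), ("ACT", "CCA", "design_01"), ("ACT", "CCA", "design_01"),
    ("TTC", "ACA", "design_02"), ("TTC", "ACA", "design_03"),  # ambiguous pair, dropped
    ("CAT", "TAC", "design_02"), ("CAT", "TAC", "design_02"),
]
rna_reads = [("ACT", "CCA"), ("ACT", "CCA"), ("CAT", "TAC"), ("TTC", "ACA")]

barcode_map = build_barcode_map(plasmid_reads)
print(barcode_map)
print(assign_rna_reads(rna_reads, barcode_map))
```

Building the map from plasmid DNA first means RNA reads can be assigned by barcode alone, even when splicing has removed the sequence that distinguishes one design from another.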
Nanopore analysis of TDP-REGv2:TDP-Raver1 #9 in mouse spinal cord
Mouse spinal cords were removed and flash-frozen in liquid nitrogen. Each cord was lysed in 1 ml of RLT-plus buffer (Qiagen), then homogenized using a glass tissue grinder, followed by centrifugation through a QIAshredder column. The flow-through was processed using the RNeasy Plus kit (Qiagen), following the manufacturer’s protocol. RNA was reverse transcribed using Superscript IV and random hexamers, following the manufacturer’s protocol, and RT-PCRs were then performed against the exonic vector sequences flanking the TDP-43-encoding CE. PCR products were purified and sequenced on an R10.4.1 Flongle using the SQK-NBD114.24 kit, basecalled as described, and analyzed using Minimap2 and nano-tools v0.0.1.
References
1. K. Yusa, L. Zhou, M. A. Li, A. Bradley, N. L. Craig, A hyperactive piggyBac transposase for mammalian applications. Proc. Natl. Acad. Sci. U. S. A. 108, 1531–1536 (2011).
2. C. Kramme, A. M. Plesa, H. H. Wang, B. Wolf, M. P. Smela, X. Guo, R. E. Kohman, P. Chatterjee, G. M. Church, An integrated pipeline for mammalian genetic screening. Cell Rep Methods 1, 100082 (2021).
3. P. J. Chen, J. A. Hussmann, J. Yan, F. Knipping, P. Ravisankar, P.-F. Chen, C. Chen, J. W. Nelson, G. A. Newby, M. Sahin, M. J. Osborn, J. S. Weissman, B. Adamson, D. R. Liu, Enhanced prime editing systems by manipulating cellular determinants of editing outcomes. Cell 184, 5635-5652.e29 (2021).
4. J. W. Nelson, P. B. Randolph, S. P. Shen, K. A. Everette, P. J. Chen, A. V. Anzalone, M. An, G. A. Newby, J. C. Chen, A. Hsu, D. R. Liu, Engineered pegRNAs improve prime editing efficiency. Nat. Biotechnol. 40, 402–410 (2022).
5. M. Budini, V. Romano, Z. Quadri, E. Buratti, F. E. Baralle, TDP-43 loss of cellular function through aggregation requires additional structural determinants beyond its C-terminal Q/N prion-like domain. Hum. Mol. Genet. 24, 9–20 (2015).
6. A. E. Carpenter, T. R. Jones, M. R. Lamprecht, C. Clarke, I. H. Kang, O. Friman, D. A. Guertin, J. H. Chang, R. A. Lindquist, J. Moffat, P. Golland, D. M. Sabatini, CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 7, R100 (2006).
7. A.-L. Brown, O. G. Wilkins, M. J. Keuss, S. E. Hill, M. Zanovello, W. C. Lee, A. Bampton, F. C. Y. Lee, L. Masino, Y. A. Qi, S. Bryce-Smith, A. Gatt, M. Hallegger, D. Fagegaltier, H. Phatnani, NYGC ALS Consortium, J. Newcombe, E. K. Gustavsson, S. Seddighi, J. F. Reyes, S. L. Coon, D. Ramos, G. Schiavo, E. M. C. Fisher, T. Raj, M. Secrier, T. Lashley, J. Ule, E. Buratti, J. Humphrey, M. E. Ward, P. Fratta, TDP-43 loss and ALS-risk SNPs drive mis-splicing and depletion of UNC13A. Nature 603, 131–137 (2022).
8. A. Dobin, C. A. Davis, F. Schlesinger, J. Drenkow, C. Zaleski, S. Jha, P. Batut, M. Chaisson, T. R. Gingeras, STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
9. A. Frankish, M. Diekhans, A.-M. Ferreira, R. Johnson, I. Jungreis, J. Loveland, J. M. Mudge, C. Sisu, J. Wright, J. Armstrong, I. Barnes, A. Berry, A. Bignell, S. Carbonell Sala, J. Chrast, F. Cunningham, T. Di Domenico, S. Donaldson, I. T. Fiddes, C. García Girón, J. M. Gonzalez, T. Grego, M. Hardy, T. Hourlier, T. Hunt, O. G. Izuogu, J. Lagarde, F. J. Martin, L. Martínez, S. Mohanan, P. Muir, F. C. P. Navarro, A. Parker, B. Pei, F. Pozo, M. Ruffier, B. M. Schmitt, E. Stapleton, M.-M. Suner, I. Sycheva, B. Uszczynska-Ratajczak, J. Xu, A. Yates, D. Zerbino, Y. Zhang, B. Aken, J. S. Choudhary, M. Gerstein, R. Guigó, T. J. P. Hubbard, M. Kellis, B. Paten, A. Reymond, M. L. Tress, P. Flicek, GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
10. M. Prudencio, J. Humphrey, S. Pickles, A.-L. Brown, S. E. Hill, J. M. Kachergus, J. Shi, M. G. Heckman, M. R. Spiegel, C. Cook, Y. Song, M. Yue, L. M. Daughrity, Y. Carlomagno, K. Jansen-West, C. F. de Castro, M. DeTure, S. Koga, Y.-C. Wang, P. Sivakumar, C. Bodo, A. Candalija, K. Talbot, B. T. Selvaraj, K. Burr, S. Chandran, J. Newcombe, T. Lashley, I. Hubbard, D. Catalano, D. Kim, N. Propp, S. Fennessey, NYGC ALS Consortium, D. Fagegaltier, H. Phatnani, M. Secrier, E. M. Fisher, B. Oskarsson, M. van Blitterswijk, R. Rademakers, N. R. Graff-Radford, B. F. Boeve, D. S. Knopman, R. C. Petersen, K. A. Josephs, E. A. Thompson, T. Raj, M. Ward, D. W. Dickson, T. F. Gendron, P. Fratta, L. Petrucelli, Truncated stathmin-2 is a marker of TDP-43 pathology in frontotemporal dementia. J. Clin. Invest. 130, 6080–6092 (2020).
11. O. H. Tam, N. V. Rozhkov, R. Shaw, D. Kim, I. Hubbard, S. Fennessey, N. Propp, NYGC ALS Consortium, D. Fagegaltier, B. T. Harris, L. W. Ostrow, H. Phatnani, J. Ravits, J. Dubnau, M. Gale Hammell, Postmortem Cortex Samples Identify Distinct Molecular Subtypes of ALS: Retrotransposon Activation, Oxidative Stress, and Activated Glia. Cell Rep. 29, 1164-1177.e5 (2019).
12. A. R. Quinlan, I. M. Hall, BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
13. K. Jaganathan, S. Kyriazopoulou Panagiotopoulou, J. F. McRae, S. F. Darbandi, D. Knowles, Y. I. Li, J. A. Kosmicki, J. Arbelaez, W. Cui, G. B. Schwartz, E. D. Chow, E. Kanterakis, H. Gao, A. Kia, S. Batzoglou, S. J. Sanders, K. K.-H. Farh, Predicting Splicing from Primary Sequence with Deep Learning. Cell 176, 535-548.e24 (2019).
14. H. Li, Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
15. R. C. Challis, S. R. Kumar, K. Y. Chan, C. Challis, M. J. Jang, P. S. Rajendran, J. D. Tompkins, K. Shivkumar, B. E. Deverman, V. Gradinaru, Widespread and targeted gene expression by systemic AAV vectors: Production, purification, and administration. bioRxiv (2018), p. 246405.
16. K. Y. Chan, M. J. Jang, B. B. Yoo, A. Greenbaum, N. Ravi, W.-L. Wu, L. Sánchez-Guardado, C. Lois, S. K. Mazmanian, B. E. Deverman, V. Gradinaru, Engineered AAVs for efficient noninvasive gene delivery to the central and peripheral nervous systems. Nat. Neurosci. 20, 1172–1179 (2017).
17. D. S. Bindels, L. Haarbosch, L. van Weeren, M. Postma, K. E. Wiese, M. Mastop, S. Aumonier, G. Gotthard, A. Royant, M. A. Hink, T. W. J. Gadella Jr, mScarlet: a bright monomeric red fluorescent protein for cellular imaging. Nat. Methods 14, 53–56 (2017).
18. B. C. Campbell, E. M. Nabel, M. H. Murdock, C. Lao-Peregrin, P. Tsoulfas, M. G. Blackmore, F. S. Lee, C. Liston, H. Morishita, G. A. Petsko, mGreenLantern: a bright monomeric fluorescent protein with rapid expression and cell filling properties for neuronal imaging. Proc. Natl. Acad. Sci. U. S. A. 117, 30710–30721 (2020).
19. R. Pelossof, L. Fairchild, C.-H. Huang, C. Widmer, V. T. Sreedharan, N. Sinha, D.-Y. Lai, Y. Guan, P. K. Premsrirut, D. F. Tschaharganeh, T. Hoffmann, V. Thapar, Q. Xiang, R. J. Garippa, G. Rätsch, J. Zuber, S. W. Lowe, C. S. Leslie, C. Fellmann, Prediction of potent shRNAs with a sequential classification algorithm. Nat. Biotechnol. 35, 350–353 (2017).
20. C. Zhao, N. Xu, J. Tan, Q. Cheng, W. Xie, J. Xu, Z. Wei, J. Ye, L. Yu, W. Feng, ILGBMSH: an interpretable classification model for the shRNA target prediction with ensemble learning algorithm. Brief. Bioinform. 23 (2022).
21. J. Y. Hsu, J. Grünewald, R. Szalay, J. Shih, A. V. Anzalone, K. C. Lam, M. W. Shen, K. Petri, D. R. Liu, J. K. Joung, L. Pinello, PrimeDesign software for rapid and simplified design of prime editing guide RNAs. Nat. Commun. 12, 1034 (2021).
22. O. M. Subach, P. J. Cranfill, M. W. Davidson, V. V. Verkhusha, An enhanced monomeric blue fluorescent protein with the high chemical stability of the chromophore. PLoS One 6, e28674 (2011).
23. R. Tian, M. A. Gachechiladze, C. H. Ludwig, M. T. Laurie, J. Y. Hong, D. Nathaniel, A. V. Prabhu, M. S. Fernandopulle, R. Patel, M. Abshari, M. E. Ward, M. Kampmann, CRISPR interference-based platform for multimodal genetic screens in human iPSC-derived neurons. Neuron 104, 239-255.e12 (2019).