
Complete allele-specific silencing of the gain-of-function mutation of Huntington's disease

Cite this dataset

Lee, Jong-Min (2022). Complete allele-specific silencing of the gain-of-function mutation of Huntington's disease [Dataset]. Dryad.


The dominant gain-of-function mechanism of Huntington's disease (HD) suggests that selective inactivation of mutant HTT produces the greatest therapeutic benefit. Here, we developed a completely allele-specific CRISPR/Cas9 strategy to permanently silence mutant HTT through nonsense-mediated decay (NMD), capitalizing on an exonic protospacer adjacent motif (PAM)-altering SNP (PAS). Comprehensive sequence/haplotype analysis identified PAS-generated NGG PAM sites on exons of common HTT haplotypes in HD patients, revealing a single clinically meaningful PAS-based mutant-specific NMD-CRISPR/Cas9 strategy. The alternative allele of rs363099 eliminates the NGG PAM site on the most frequent normal HTT haplotype in HD, permitting mutant HTT-specific CRISPR/Cas9 therapeutics in ~20% of HD patients with European ancestry. Our rs363099-based CRISPR/Cas9 showed perfect allele specificity and good targeting efficiencies in cells derived from HD patients. Dramatically reduced mutant HTT mRNA and complete loss of mutant HTT protein indicate that our allele-specific CRISPR/Cas9 strategy completely inactivates mutant HTT through NMD. RNAseq analysis also supported a high level of on-target gene specificity, because no genes other than HTT were altered in clonal lines developed through our NMD-CRISPR/Cas9 strategy. Together, our data demonstrating a significant target population, selective inactivation of mutant HTT, good targeting efficiency, and lack of recurrent off-targeting establish the therapeutic value of the novel rs363099-based mutant HTT-specific NMD-CRISPR/Cas9 strategy in HD.


We transfected two HD iPSC lines carrying adult-onset CAG repeats (42 and 46; both carrying the hap.01 and hap.08 diplotype) with either the mutant-specific NMD-CRISPR/Cas9 constructs (PX551 vector for spCas9 and PX552 vector for our test gRNA; experimental group) or empty vector (PX551 vector for spCas9 and empty PX552 vector without gRNA; control group), and subsequently developed single-cell clones by limiting dilution. Twelve clonal lines were developed for each group (24 samples in total) and further validated by Sanger sequencing and MiSeq analysis of genomic DNA. Genome-wide RNAseq analysis was then performed by the Broad Institute. Sequence data were processed with the STAR aligner as part of the Broad Institute's standard RNAseq analysis pipeline. Gene expression levels were based on transcripts per million (TPM) values computed by TPMCalculator. Expression levels of 20,260 protein-coding genes (based on Ensembl) were normalized; subsequently, 3,420 genes were excluded because they had a zero TPM value in at least one sample. We therefore provide data for the 16,840 genes expressed in all 24 samples (HD.NMD-CRISPR.RNAseq.24.Sample.200601.txt).
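The exclusion step described above (dropping any gene with a zero TPM value in at least one sample) can be sketched as follows. This is a minimal illustration, not the pipeline actually used to produce the dataset; the function name and the toy gene names and values are hypothetical.

```python
from typing import Dict, List


def keep_genes_expressed_in_all_samples(
    tpm: Dict[str, List[float]]
) -> Dict[str, List[float]]:
    """Keep only genes whose TPM is above zero in every sample."""
    return {
        gene: values
        for gene, values in tpm.items()
        if all(value > 0 for value in values)
    }


# Toy values for illustration only; they are not taken from the dataset.
toy_tpm = {
    "GENE_A": [12.5, 10.1, 11.8],
    "GENE_B": [0.0, 3.2, 2.9],  # zero in one sample, so it is excluded
    "GENE_C": [7.4, 6.9, 8.0],
}

filtered = keep_genes_expressed_in_all_samples(toy_tpm)
print(sorted(filtered))  # ['GENE_A', 'GENE_C']
```

Applied to the counts reported above, the same rule removes 3,420 of the 20,260 protein-coding genes, leaving 20,260 - 3,420 = 16,840.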

Three files are associated with this dataset:

  1. HD.NMD-CRISPR.RNAseq.24.Sample.200601.txt : RNAseq expression data
  2. NMD-CRISPR.RNAseq.Meta.Data.csv : Sample metadata describing sample characteristics and covariates
  3. README.txt : Description of columns in the metadata file
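A possible way to load the expression table alongside the metadata and check that every sample has a metadata row is sketched below. The file layout is an assumption, not something documented on this page: the sketch assumes the expression file is tab-delimited with genes in rows, a header row of sample identifiers, and the gene identifier in the first column, and that the metadata CSV has one row per sample. The helper names and the `id_column` value are hypothetical; consult README.txt for the actual column names.

```python
import csv
import io


def read_expression(handle, delimiter="\t"):
    """Return (sample_ids, {gene: [values]}) from a delimited text table.

    Assumes the first row is a header and the first column holds gene IDs.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    sample_ids = header[1:]
    genes = {row[0]: [float(x) for x in row[1:]] for row in reader if row}
    return sample_ids, genes


def read_metadata(handle, id_column):
    """Return {sample_id: row_dict} from a CSV file with a header row."""
    return {row[id_column]: row for row in csv.DictReader(handle)}


# Toy in-memory stand-ins for the two files; the real column names may differ.
expr_text = "gene\tS1\tS2\nGENE_A\t1.5\t2.0\n"
meta_text = "sample,group\nS1,control\nS2,experimental\n"

sample_ids, genes = read_expression(io.StringIO(expr_text))
metadata = read_metadata(io.StringIO(meta_text), id_column="sample")

# Samples present in the expression table but absent from the metadata.
missing = [s for s in sample_ids if s not in metadata]
print(missing)  # []
```

With the real files, the `io.StringIO` objects would be replaced by handles from `open(path, newline="")`.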


National Institutes of Health