Analysis of rod-cone dystrophy genes reveals unique mutational patterns
Data files
Jan 11, 2023 version files 69.58 KB
- Data_RCD_genesassociation_Col.csv
- labels_Col.txt
- README_Data_RCD_genesassociation_Col.txt
Abstract
Background
Rod-cone dystrophy (RCD) is the most common inherited retinal disease and is characterised by the progressive degeneration of retinal photoreceptors. The classification of RCD genes is based exclusively on the prevalence of their mutations and does not consider the involvement of the same gene in different phenotypes. Therefore, we first investigated the occurrence of mutations in autosomal recessive RCD (arRCD) and non-arRCD conditions. We then identified arRCD-enriched mutational patterns in specific genes and coding exons.
Methods and results
The mutation patterns differed according to arRCD status (p=0.001). Specifically, compared with missense mutations, insertions/deletions (OR=1.2, p=0.007), nonsense (OR=1.2, p=0.014) and splice-site mutations (OR=1.6, p=0.038) increased the odds of arRCD by 20%–60% versus non-arRCD conditions. The gene-based analysis identified that EYS, IMPG2, RP1L1 and USH2A mutations were enriched in arRCD (p<0.05). The exon-based analysis revealed specific mutation patterns in exons of CRB1 and RP1L1, and in exons 12, 60 and 62 of USH2A, which code for the Lamin EGF and FTIII domains.
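The 20%–60% range follows directly from the reported odds ratios: an odds ratio of 1.2 corresponds to a 20% increase in odds relative to the reference category (missense), and 1.6 to a 60% increase. A minimal sketch of that arithmetic is shown here; the function name is illustrative and not part of the dataset.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio into a percent change in odds
    relative to the reference category (here, missense mutations)."""
    return round((odds_ratio - 1.0) * 100.0, 1)


# Odds ratios as reported in the abstract.
reported = {
    "insertion/deletion": 1.2,
    "nonsense": 1.2,
    "splice-site": 1.6,
}

for mutation_type, odds_ratio in reported.items():
    change = percent_change_in_odds(odds_ratio)
    print(f"{mutation_type}: OR={odds_ratio} -> {change}% higher odds of arRCD")
```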
Conclusion
The current analysis showed that many arRCD genes have unique mutational patterns.
Methods
Data extraction, inclusion, and exclusion criteria
The Retinal Information Network Database
The Retinal Information Network (RetNet) is a database that provides tables of genes and loci causing inherited retinal diseases (IRDs) (https://sph.uth.edu/retnet/). It was used to identify the arRCD genes. In total, sixty-three genes were found (ABCA4, AGBL5, AHR, ARHGEF18, ARL6, ARL2BP, BBS1, BBS2, BEST1, C2orf71, C8orf37, CERKL, CLCC1, CLRN1, CNGA1, CNGB1, CRB1, CYP4V2, DHDDS, DHX38, EMC1, EYS, FAM161A, GPR125, HGSNAT, IDH3B, IFT140, IFT172, IMPG2, KIAA1549, KIZ, LRAT, MAK, MERTK, MVK, NEK2, NEUROD1, NR2E3, NRL, PDE6A, PDE6B, PDE6G, POMGNT1, PRCD, PROM1, RBP3, REEP6, RGR, RHO, RLBP1, RP1, RP1L1, RPE65, SAG, SAMD11, SLC7A14, SPATA7, TRNT1, TTC8, TULP1, USH2A, ZNF408, ZNF513) (last accessed on June 10, 2021).
HGMD database
Our sample population consists of individuals affected with rod-cone dystrophy. The genotype of these individuals was determined through various genotyping and sequencing techniques, such as genotyping arrays, next-generation sequencing (targeted and whole exome) and Sanger sequencing. Genetic variations in the sixty-three arRCD genes were downloaded in .txt format, with information including cDNA position, protein position, class, associated phenotype, and corresponding reference (N=7,382). Mutations associated with 'retinal dystrophy', 'retinal degeneration' or 'retinal disease' were not included in the analysis, since these terms are broad and do not allow a correct diagnosis. This left a total of 6,627 mutations (http://www.hgmd.cf.ac.uk/ac/index.php, accessed: September 10, 2021). We then removed the Rhodopsin mutations that were reported to have a dominant effect. Furthermore, we removed the 'duplicate' mutations, that is, different DNA mutations that lead to the same amino acid exchange in a gene. After this filtering, 5,868 mutations remained. For every mutation, we added a type (missense, nonsense, insertion/deletion (InDel), or splice site) based on the HGMD annotation.
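The filtering steps described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used (the filtering was presumably done in Excel/SPSS): the field names `gene`, `cdna_change`, `protein_change`, `phenotype` and `dominant` are hypothetical, as are the toy records, which are made up and do not come from HGMD.

```python
# Broad phenotype terms excluded from the analysis (from the text above).
BROAD_TERMS = {"retinal dystrophy", "retinal degeneration", "retinal disease"}


def filter_mutations(records):
    """Apply the three exclusion steps to a list of dict records.

    Hypothetical fields: gene, cdna_change, protein_change (may be None),
    phenotype, dominant (bool flag for a reported dominant effect).
    """
    kept = []
    first_cdna = {}  # (gene, protein_change) -> first cDNA change seen
    for rec in records:
        # Step 1: drop mutations annotated only with a broad phenotype term.
        if rec["phenotype"].strip().lower() in BROAD_TERMS:
            continue
        # Step 2: drop Rhodopsin (RHO) mutations reported as dominant.
        if rec["gene"] == "RHO" and rec.get("dominant", False):
            continue
        # Step 3: drop 'duplicates', i.e. a different DNA change that gives
        # the same amino acid exchange in the same gene.
        protein = rec.get("protein_change")
        if protein:
            key = (rec["gene"], protein)
            first = first_cdna.setdefault(key, rec["cdna_change"])
            if first != rec["cdna_change"]:
                continue
        kept.append(rec)
    return kept


# Toy records (invented for illustration only).
toy = [
    {"gene": "USH2A", "cdna_change": "c.100A>G", "protein_change": "p.Lys34Glu",
     "phenotype": "Retinitis pigmentosa", "dominant": False},
    {"gene": "USH2A", "cdna_change": "c.100_101delinsGA", "protein_change": "p.Lys34Glu",
     "phenotype": "Retinitis pigmentosa", "dominant": False},
    {"gene": "EYS", "cdna_change": "c.200C>T", "protein_change": "p.Ala67Val",
     "phenotype": "Retinal dystrophy", "dominant": False},
    {"gene": "RHO", "cdna_change": "c.300G>A", "protein_change": "p.Met100Ile",
     "phenotype": "Retinitis pigmentosa", "dominant": True},
]

filtered = filter_mutations(toy)
print(len(filtered), "of", len(toy), "toy records kept")
```

The same mutation listed under two different phenotypes has the same cDNA change in both records, so this sketch keeps both; only a different DNA change producing the same amino acid exchange is treated as a duplicate.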
LOVD database
Genetic variations in arRCD genes were also downloaded from the LOVD database (N=1,104; https://www.lovd.nl/, accessed: September 20, 2021).
UniProt and Gene databases
All amino acid (a.a.) domains were retrieved from the UniProt database (https://www.uniprot.org/). For each gene, the longest mRNA isoform in the NCBI Gene database was selected (http://www.ncbi.nlm.nih.gov/gene).
Statistical analyses
The analyses were conducted using SPSS software version 20 (SPSS, Inc., IL, USA). All studied variables were expressed as frequencies. The plots were generated using Origin software (OriginPro, Version 8, OriginLab Corporation, Northampton, MA, USA).
Mutations distribution stratified according to autosomal recessive rod-cone dystrophy
The number of unique mutations in the HGMD database was compared according to arRCD phenotype (arRCD vs. non-arRCD). Data from the LOVD database were used in the analysis of the total mutations only. Individual mutations, and not their frequencies, were used in the distribution analysis; thus, even if more than one affected individual carried a mutation, it was counted once. However, genetic heterogeneity at the mutational level was taken into account: mutations causing more than one phenotype were counted once for each phenotype they were associated with. The chi-square (χ2) goodness-of-fit test was used to compare the number of mutations (overall, per gene, and per exon) according to arRCD.
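The counting rule and the test described above can be sketched in a few lines. The original analysis was run in SPSS; this sketch is only an illustration and makes two assumptions that are not stated in the text: the input is a list of hypothetical (mutation, phenotype group) pairs, and the expected split between the two groups is equal (50/50). With two groups the test has one degree of freedom, so the p-value can be obtained from the complementary error function without external libraries.

```python
import math
from collections import Counter


def count_by_group(reports):
    """Count unique mutations per phenotype group.

    `reports` holds (mutation_id, group) pairs, one per reported carrier.
    A mutation is counted once per group regardless of how many carriers
    it has, but once in each group it is associated with.
    """
    unique_pairs = set(reports)
    return Counter(group for _, group in unique_pairs)


def chi_square_gof_two_groups(observed_a, observed_b, expected_share_a=0.5):
    """Chi-square goodness-of-fit test for two categories (df = 1).

    expected_share_a is an assumption of this sketch (equal split by default).
    Returns (statistic, p_value).
    """
    total = observed_a + observed_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    statistic = ((observed_a - expected_a) ** 2 / expected_a
                 + (observed_b - expected_b) ** 2 / expected_b)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Toy input (invented): m1 is reported twice in arRCD and once in non-arRCD.
reports = [
    ("m1", "arRCD"), ("m1", "arRCD"), ("m1", "non-arRCD"),
    ("m2", "arRCD"),
]
counts = count_by_group(reports)
statistic, p_value = chi_square_gof_two_groups(counts["arRCD"], counts["non-arRCD"])
print(dict(counts), round(statistic, 3), round(p_value, 3))
```

For per-gene or per-exon comparisons, the same two functions would be applied to the subset of reports for that gene or exon.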
Usage notes
Microsoft Excel and SPSS