Unraveling the genetics of feline hypertrophic cardiomyopathy: A multiomics study of 138 cats
Data files
Jun 26, 2025 version files 51.56 GB
- hcm_cat_RNA_read_counts.csv (10.56 MB)
- hcm_cat_sra.vcf.gz (51.54 GB)
- hcm_cat_sra.vcf.gz.tbi (2.33 MB)
- README.md (2.76 KB)
Abstract
Hypertrophic cardiomyopathy (HCM) is the most common inherited cardiac disease in cats, often leading to congestive heart failure, arterial thromboembolism, and sudden cardiac death. The genetics of feline HCM are poorly understood, and the few genetic discoveries to date remain breed- or family-specific. We aimed to identify novel causative or disease-modifying variants in a large cohort of cats reflective of the general cat population. In a second cohort, we sought to characterize transcriptomic differences between HCM-affected cats and healthy controls. DNA was isolated from 138 domestic cats (109 HCM and 29 controls). In genome-wide analysis, no single variant or combination of variants of high, moderate, or modifying impact was identified as causing HCM or modifying its severity. Several rare high- and moderate-impact variants in genes associated with human HCM were detected in diseased cats. In the second cohort, left ventricular posterior wall (LVPW), interventricular septal (IVS), and left atrial (LA) tissues of 27 HCM-affected and 15 control cats were submitted for stranded mature RNA-sequencing at 50 million reads/sample. A total of 74, 115, and 45 differentially expressed genes (DEGs) were upregulated and 8, 53, and 48 DEGs were downregulated in LVPW, IVS, and LA tissue, respectively, in HCM-affected cats compared to controls. As in humans, the genetic etiology of feline HCM remains unknown in a high proportion of cases. Transcriptomics revealed molecular signatures that may help identify novel HCM biomarkers or drug targets in future investigations.
Dataset DOI: 10.5061/dryad.cjsxksnjh
Description of the data and file structure
Data available
1. A population-level VCF of polymorphic SNP and indel variants called among 138 domestic cats with and without hypertrophic cardiomyopathy (HCM). The VCF was generated by mapping paired-end WGS FASTQ reads to the Fca126 reference genome with bwa mem and calling variants following GATK4 best practices. Variant annotations were generated with Ensembl’s VEP based on Fca126 gene and exon boundaries. The VCF file contains meta-information lines, followed by a header line that names the fixed fields and the samples; the subsequent data lines describe variants at genomic positions. The fixed fields are chromosome (CHROM), position (POS), identifier (ID), reference base(s) (REF), alternate base(s) (ALT), quality (QUAL), filter status (FILTER), and additional information (INFO). Genotypes are encoded as allele values separated by ‘/’. The tabix index is also included.
NCBI BioSample names are used as sample identifiers in the vcf. Metadata, including age, sex, breed, and HCM phenotype for each individual sample is included in the supplemental material of the accompanying manuscript.
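The field layout described above can be illustrated with a minimal sketch in plain Python. It assumes the standard VCF column order (eight fixed fields, then FORMAT, then one column per sample). The header and data line below are made up for illustration and are not taken from hcm_cat_sra.vcf.gz; the function names are hypothetical.

```python
FIXED_FIELDS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line, sample_names):
    """Split one VCF data line into its fixed fields and per-sample genotypes."""
    cols = line.rstrip("\n").split("\t")
    record = dict(zip(FIXED_FIELDS, cols[:8]))
    record["POS"] = int(record["POS"])

    # The FORMAT column (index 8) names the colon-separated per-sample keys.
    gt_index = cols[8].split(":").index("GT")

    genotypes = {}
    for name, sample_col in zip(sample_names, cols[9:]):
        gt = sample_col.split(":")[gt_index]
        # '/' separates unphased alleles, '|' phased ones; '.' is a missing call.
        alleles = gt.replace("|", "/").split("/")
        genotypes[name] = [None if a == "." else int(a) for a in alleles]
    record["genotypes"] = genotypes
    return record


def alt_allele_count(record):
    """Count non-reference alleles across all samples with a called genotype."""
    return sum(
        1
        for alleles in record["genotypes"].values()
        for a in alleles
        if a is not None and a > 0
    )


# Made-up header and data line (hypothetical sample names and contig).
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE_A\tSAMPLE_B"
line = "chrX\t1000\t.\tA\tG\t50.0\tPASS\tAC=3\tGT:DP\t0/1:30\t1/1:28"

samples = header.split("\t")[9:]
record = parse_vcf_line(line, samples)
n_alt = alt_allele_count(record)
```

For the real 51.54 GB file, a dedicated tool such as bcftools is likely a better fit than line-by-line parsing; the sketch is only meant to make the column layout concrete.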
2. An RNASeq read count matrix generated from RNA extracted from the left ventricle posterior wall (LVPW), left atrium (LA), and/or interventricular septum (IVS) of cats with and without hypertrophic cardiomyopathy. The read count matrix was generated by mapping paired-end RNASeq reads to the Fca126 reference genome with STAR and counting reads with htseq. The column headers of the matrix are sample names and the row names are gene names. Read counts from every sample in the study are included.
Sample IDs are in a BioSample-tissue-phenotype format, where tissue is LVPW, LA, or IVS and phenotype is A (affected) or C (control). Metadata for each sample, including age, sex, breed, and HCM phenotype, is available in the supplemental material of the accompanying manuscript.
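The sample ID convention can be split into its three parts with a short sketch like the one below. Because the example ID listed for the read count file (SAMN45893915_IVS-A) uses an underscore after the BioSample accession, the pattern here accepts either '-' or '_' as a delimiter; that tolerance is an assumption, and `parse_sample_id` and `SampleId` are hypothetical names.

```python
import re
from typing import NamedTuple

PHENOTYPES = {"A": "affected", "C": "control"}


class SampleId(NamedTuple):
    biosample: str
    tissue: str
    phenotype: str


# Assumption: delimiters may be '-' or '_'; tissue is one of LVPW, LA, IVS.
_PATTERN = re.compile(r"^(SAMN\d+)[-_](LVPW|LA|IVS)[-_]([AC])$")


def parse_sample_id(sample_id):
    """Split a column name into BioSample accession, tissue, and phenotype."""
    match = _PATTERN.match(sample_id.strip())
    if match is None:
        raise ValueError(f"Unrecognized sample ID: {sample_id!r}")
    biosample, tissue, code = match.groups()
    return SampleId(biosample, tissue, PHENOTYPES[code])


parsed = parse_sample_id("SAMN45893915_IVS-A")
```

Column names that do not match the pattern raise a `ValueError` rather than being silently skipped, which makes unexpected naming variants visible early.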
Files and variables
File: hcm_cat_RNA_read_counts.csv
Description: RNASeq read counts from the LA, IVS and LVPW samples.
Rows: Represent different genes (e.g., LOC102899061, TRNAE-UUC, CCDC115).
Columns: Represent samples from different tissues or regions, associated with the sample IDs (e.g., SAMN45893915_IVS-A)
Additional Variables:
- Gene symbols are defined by the Fca126 GTF, available from NCBI.
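As a rough illustration of the matrix layout (genes in rows, samples in columns), the sketch below loads a count matrix with the standard library and sums counts per sample. It assumes the first column holds gene names and the first row holds sample IDs. It also skips rows whose name starts with "__", which is how htseq-count labels its summary rows; whether such rows are present in this file is not stated, so that filter is a precaution. The demo data and sample IDs are invented.

```python
import csv
import io


def load_counts(handle):
    """Read a gene-by-sample count matrix from an open CSV handle."""
    reader = csv.reader(handle)
    header = next(reader)
    samples = header[1:]  # assumption: first column is the gene name
    counts = {}
    for row in reader:
        if not row:
            continue
        gene = row[0]
        if gene.startswith("__"):  # htseq-count summary rows, if any
            continue
        counts[gene] = [int(float(value)) for value in row[1:]]
    return samples, counts


def library_sizes(samples, counts):
    """Sum the read counts of every gene for each sample."""
    totals = [0] * len(samples)
    for values in counts.values():
        for i, value in enumerate(values):
            totals[i] += value
    return dict(zip(samples, totals))


# Invented demo matrix standing in for hcm_cat_RNA_read_counts.csv.
demo = io.StringIO(
    "gene,SAMN00000001_IVS-A,SAMN00000002_LA-C\n"
    "CCDC115,10,4\n"
    "TRNAE-UUC,0,1\n"
    "__no_feature,100,200\n"
)
samples, counts = load_counts(demo)
sizes = library_sizes(samples, counts)
```

To try this on the real file, the `io.StringIO` object could be replaced with `open("hcm_cat_RNA_read_counts.csv", newline="")`.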
File: hcm_cat_sra.vcf.gz
Description: WGS population vcf file with BioSample names
File: hcm_cat_sra.vcf.gz.tbi
Description: tabix index file to assist searchability of the parent vcf with bcftools and other tools.
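Because the index allows region queries without reading the whole 51.54 GB file, one possible workflow is to assemble a `bcftools view` call from Python. The sketch below only builds the argument list, using the documented `--regions` and `--samples` options. The contig name `chrA1` and the accession `SAMN00000000` are placeholders, not values confirmed from the VCF, and `bcftools_region_command` is a hypothetical helper.

```python
import shlex


def bcftools_region_command(vcf_path, chrom, start, end, samples=None):
    """Build a bcftools command that extracts one region from an indexed VCF."""
    if start < 1 or end < start:
        raise ValueError("Expected 1-based coordinates with start <= end")
    cmd = ["bcftools", "view", "--regions", f"{chrom}:{start}-{end}"]
    if samples:
        # Restrict the output to a comma-separated list of sample names.
        cmd += ["--samples", ",".join(samples)]
    cmd.append(vcf_path)
    return cmd


# Placeholder contig and sample names; check the VCF header for the real ones.
cmd = bcftools_region_command(
    "hcm_cat_sra.vcf.gz", "chrA1", 1000, 2000, samples=["SAMN00000000"]
)
command_line = shlex.join(cmd)
print(command_line)
```

With bcftools installed and the .tbi file next to the VCF, the list can be passed to `subprocess.run(cmd, check=True, capture_output=True, text=True)` to capture the matching records.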
WGS data generation
A total of 1-2 mL of whole blood was collected from the cephalic, saphenous, or jugular vein into EDTA blood collection tubes. DNA was isolated either from whole blood or from buffy coats after whole blood centrifugation at 2000 rpm for 15 minutes. Genomic DNA isolation was performed using commercially available kits (Gentra Puregene Blood kit, QIAGEN, Hilden, Germany; ArchivePure, 5Prime) following the respective manufacturer’s protocol. High-quality unfragmented DNA was selected by a combination of 1% agarose gel visualization and spectrophotometric confirmation (a 260/280 ratio of ~1.8 and a concentration of > 50 ng/µL; NanoDrop One/One, Thermo Fisher, Waltham, MA, USA). Samples were stored at -20°C until ready for shipment to Theragen Bio Co., Ltd, Gyeonggi-do, Republic of Korea for WGS. Paired-end DNA libraries were generated with a TruSeq DNA Nano library prep kit. Samples were then pooled and sequenced at ~30x coverage on the Illumina NovaSeq 6000 platform. Paired-end read lengths were either 125 bp or 150 bp.
RNASeq data generation
Tissue samples from the LA, IVS, and LVPW were collected and either flash-frozen in liquid nitrogen or placed in RNAlater within 30 minutes post-mortem. All samples were subsequently stored at -80°C until ready for shipment. All available tissue samples were sent to Azenta, South Plainfield, NJ, USA for RNA-Seq. Total RNA was extracted from frozen tissue samples using the Qiagen RNeasy Plus Universal mini kit following the manufacturer’s instructions (Qiagen, Hilden, Germany). The polyA method was used to select for mature mRNA, and samples subsequently underwent quality control. RNA samples were quantified using a Qubit 3.0 Fluorometer (Life Technologies, Carlsbad, CA, USA). RNA integrity was evaluated using an Agilent TapeStation 4200 (Agilent Technologies, Palo Alto, CA, USA). Only samples with an RNA Integrity Number (RIN) of >7 were converted to cDNA and processed for stranded library prep. Paired-end 150 bp cDNA libraries were sequenced on the Illumina HiSeq platform at a read depth of ~50 million reads per sample.