Data for: Global frequency analyses of canine progressive rod-cone degeneration–progressive retinal atrophy and collie eye anomaly using commercial genetic testing data
Cite this dataset
Clark, Jessica A. et al. (2023). Data for: Global frequency analyses of canine progressive rod-cone degeneration–progressive retinal atrophy and collie eye anomaly using commercial genetic testing data [Dataset]. Dryad. https://doi.org/10.5061/dryad.6djh9w17c
Abstract
Hundreds of genetic variants associated with canine traits and disorders have been identified, with commercial tests offered. However, data on the geographic distributions and changes in allele and genotype frequencies over prolonged, continuous periods of time are lacking. This study utilized a large set of genotypes from dogs tested for the progressive rod-cone degeneration–progressive retinal atrophy (prcd-PRA) G>A missense PRCD variant (n = 86,667) and the collie eye anomaly (CEA)-associated NHEJ1 deletion (n = 33,834) provided by a commercial genetic testing company (OptiGen/Wisdom Panel, Mars Petcare Science & Diagnostics). These data were analyzed using the chi-square goodness-of-fit test, time-trend graphical analysis, and regression modeling in order to evaluate how test results changed over time. The results span fifteen years, representing 82 countries and 67 breeds/breed mixes. Both diseases exhibited significant differences in genotype frequencies (p = 2.7 × 10^−152 for prcd-PRA and p = 0.023 for CEA) with opposing graphical trends. Regression modeling showed time progression to significantly affect the odds of a dog being homozygous or heterozygous for either disease, as do variables including breed and breed popularity. This study shows that genetic testing informed breeding decisions to produce fewer affected dogs. However, the presence of dogs homozygous for the disease variant, especially for prcd-PRA, was still observed fourteen years after test availability, potentially due to crosses of unknown carriers. This suggests that genetic testing of dog populations should continue.
README: Genotype and sample data from Clark et al. "Global Frequency Analyses of Canine Progressive Rod-Cone Degeneration–Progressive Retinal Atrophy and Collie Eye Anomaly Using Commercial Genetic Testing Data"
https://doi.org/10.5061/dryad.6djh9w17c
Description of the data and file structure
Genotype data and metadata for domestic dog samples that owners submitted to the commercial genetic testing companies OptiGen, LLC and Mars Petcare between 2004 and 2019.
In total, 86,667 samples were tested for Progressive Rod-Cone Degeneration-Progressive Retinal Atrophy (prcd-PRA) and 33,834 samples were tested for Collie Eye Anomaly (CEA). Both diseases occur across multiple dog breeds and affect canine eyesight.
Data file: Clark2023_Dataset.xlsx
Data are provided as an .xlsx file containing two spreadsheets with similar layouts, one for CEA samples and one for prcd-PRA samples. Each row represents one anonymized sample, and each column represents a variable:
*Sample: number assigned to distinguish samples anonymously
*Year: year in which owner submitted canine DNA sample to company for analysis
*Breed: owner-reported sample dog breed, mix of breeds, or generalized "mixed breed"
*Genotype: number of copies of the disease-associated variant present in DNA sample
-Normal = Homozygous wild-type (0 copies of disease variant present in sample)
-Carrier = Heterozygous (1 copy of disease variant present in sample)
-Affected = Homozygous for disease-associated allele (2 copies of disease variant present in sample)
*OwnerCountry: owner-reported country of sample origin
*Continent: continent of sample origin; determined from OwnerCountry
*AKC: breed classification of sample according to the American Kennel Club (AKC); determined from Breed
*FCI: breed classification of sample according to the Fédération Cynologique Internationale (FCI); determined from Breed
*Clade: group of genetically related breeds to which the sample's breed belongs; determined from Breed
*Rank2005: (CEA dataset only) popularity ranking of sample breed according to AKC registration counts in the year 2005 (first year of data within the CEA dataset); determined from Breed
*Rank2004: (prcd-PRA dataset only) popularity ranking of sample breed according to AKC registration counts in the year 2004 (first year of data within the prcd-PRA dataset); determined from Breed
*Rank2019: popularity ranking of sample breed according to AKC registration counts in the year 2019 (last year of data within either the CEA or prcd-PRA dataset); determined from Breed
A value of "0" indicates missing information: the Breed value could not be matched to a known AKC, FCI, Clade, Rank2005, Rank2004, or Rank2019 value.
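The genotype coding and the "0" missing-value convention described above can be illustrated with a minimal Python sketch. It assumes the rows have already been read from Clark2023_Dataset.xlsx into plain Python values. The function names (`variant_allele_frequency`, `zero_to_none`) and the example labels are hypothetical and are not part of the dataset or the published analysis.

```python
from collections import Counter

# Copies of the disease-associated variant per Genotype label, as defined in the README.
VARIANT_COPIES = {"Normal": 0, "Carrier": 1, "Affected": 2}


def variant_allele_frequency(genotypes):
    """Return the disease-variant allele frequency for an iterable of Genotype labels.

    Each dog carries two alleles, so the denominator is twice the number of dogs.
    Labels other than Normal/Carrier/Affected are ignored.
    """
    counts = Counter(genotypes)
    n_dogs = sum(counts[label] for label in VARIANT_COPIES)
    if n_dogs == 0:
        raise ValueError("no recognized genotype labels")
    variant_alleles = sum(VARIANT_COPIES[label] * counts[label] for label in VARIANT_COPIES)
    return variant_alleles / (2 * n_dogs)


def zero_to_none(value):
    """Map the README's "0" missing-value code to None; return other values unchanged.

    Intended only for the AKC, FCI, Clade, and Rank* columns.
    """
    return None if value in (0, "0") else value


# Invented example labels, not taken from the dataset.
example = ["Normal", "Carrier", "Affected", "Normal"]
print(variant_allele_frequency(example))  # 3 variant alleles out of 8 -> 0.375
print(zero_to_none("0"), zero_to_none("Herding"))
```

Applying `zero_to_none` before any summary of the classification or rank columns keeps unmatched breeds from being counted as a real category or as rank zero.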
Methods
Data for this study comprised genotype status for prcd-PRA (n = 86,667 dogs) and CEA (n = 33,834 dogs) from dogs tested by a commercial genetic testing company (OptiGen, LLC, acquired by Mars Petcare in 2018) between 2004 and 2019. These data underwent statistical analysis (chi-square goodness-of-fit testing, logistic regression modeling, and multinomial regression modeling) to determine the significance of changes in allele and genotype frequencies over time, as well as the effects of dog breed, geography, progression of time, and breed popularity on the overall likelihood of a dog developing either disease or having carrier status.
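As a rough illustration of the chi-square goodness-of-fit step, the sketch below compares observed Normal/Carrier/Affected counts against a set of expected proportions. The choice of expected proportions, the function names, and the example counts are assumptions made for illustration; the null hypothesis actually used in the study is not described here. With three genotype categories and fully specified expected proportions there are 2 degrees of freedom, for which the chi-square upper-tail probability has the closed form exp(-x/2).

```python
import math


def chi_square_gof(observed, expected_props):
    """Return the chi-square goodness-of-fit statistic.

    observed: counts per category (e.g. Normal, Carrier, Affected).
    expected_props: expected proportion per category; must sum to 1.
    """
    if len(observed) != len(expected_props):
        raise ValueError("observed and expected_props must have the same length")
    if not math.isclose(sum(expected_props), 1.0):
        raise ValueError("expected proportions must sum to 1")
    total = sum(observed)
    expected = [p * total for p in expected_props]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Invented counts and proportions, not taken from the dataset.
observed = [60, 30, 10]            # Normal, Carrier, Affected
expected_props = [0.5, 0.3, 0.2]   # hypothetical baseline proportions
stat = chi_square_gof(observed, expected_props)
print(stat, p_value_df2(stat))
```

For other numbers of categories, a general chi-square survival function from a statistics library would be needed in place of `p_value_df2`.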
Funding
National Institutes of Health, Award: K01-OD027051