Dryad

Data from: Improving the resolution of canine genome-wide association studies using genotype imputation: a study of two breeds

Data files

May 17, 2021 version files 60.72 MB

Abstract

Genotype imputation using a reference panel that combines high-density array data with publicly available whole genome sequence consortium variant data is potentially a cost-effective way to increase the density of existing lower-density array datasets. In this study, three datasets (two Border Collie and one Italian Spinone) generated using a legacy array (Illumina CanineHD, 173,662 SNPs) were used to assess the feasibility and accuracy of this approach and to gather additional evidence for the efficacy of canine genotype imputation. The cosmopolitan reference panels used to impute genotypes comprised dogs of 158 breeds, mixed breed dogs, wolves, and Chinese indigenous dogs, as well as breed-specific individuals genotyped using the Axiom Canine HD array. The two Border Collie reference panels each comprised 808 individuals, including 79 Border Collies, and 426,326 or 426,332 SNPs; the Italian Spinone reference panel comprised 807 individuals, including 38 Italian Spinoni, and 476,313 SNPs. Imputation accuracy was high: the lowest accuracy was observed for one of the Border Collie datasets (mean R² = 0.94) and the highest for the Italian Spinone dataset (mean R² = 0.97). These findings demonstrate that imputation of a legacy array study set using a reference panel comprising both breed-specific array data and multi-breed variant data derived from whole genomes is effective and accurate. The process of canine genotype imputation, using the growing resource of publicly available canine genome variant datasets alongside breed-specific data, is described in detail to facilitate and encourage use of this technique in canine genetics.
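
The abstract reports accuracy as a mean R2 but does not say how that value was calculated. As an illustration only, the sketch below assumes one common definition: for each SNP, the squared Pearson correlation between known genotypes (coded 0, 1 or 2) and imputed allele dosages, averaged across SNPs. The function names `squared_correlation` and `mean_r2` are hypothetical and are not taken from the study or from any imputation software.

```python
from statistics import mean


def squared_correlation(true_genotypes, imputed_dosages):
    """Squared Pearson correlation for one SNP.

    true_genotypes: known allele counts per individual (0, 1 or 2).
    imputed_dosages: imputed expected allele counts per individual (0.0 to 2.0).
    Returns None when either vector has no variance (correlation undefined).
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 individuals")

    mean_true = mean(true_genotypes)
    mean_imputed = mean(imputed_dosages)

    # Sums of cross-products and squares about the means.
    sxy = sum((t - mean_true) * (d - mean_imputed)
              for t, d in zip(true_genotypes, imputed_dosages))
    sxx = sum((t - mean_true) ** 2 for t in true_genotypes)
    syy = sum((d - mean_imputed) ** 2 for d in imputed_dosages)

    if sxx == 0 or syy == 0:
        return None  # monomorphic SNP: assumption is to skip it, not score it
    return (sxy * sxy) / (sxx * syy)


def mean_r2(per_snp_pairs):
    """Average the per-SNP squared correlation over SNPs where it is defined."""
    values = [squared_correlation(t, d) for t, d in per_snp_pairs]
    defined = [v for v in values if v is not None]
    return mean(defined) if defined else None


if __name__ == "__main__":
    # Toy data for three SNPs in three dogs (invented values, not study data).
    snps = [
        ([0, 1, 2], [0.0, 1.0, 2.0]),   # perfectly imputed
        ([0, 1, 2], [0.0, 2.0, 1.0]),   # partly wrong
        ([0, 0, 0], [0.0, 0.1, 0.0]),   # monomorphic, skipped
    ]
    print(mean_r2(snps))
```

Imputation programs often report their own estimated accuracy statistic instead, which is computed without known genotypes, so values from this sketch are not necessarily comparable with the figures quoted above.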