Dryad

Data from: Image feature embedding with a deep learning framework improves genome-wide association studies on dog endophenotypes

Data files

May 14, 2026 version files 27.46 MB

Abstract

Domestic dogs exhibit remarkable morphological diversity, making quantitative characterization of their phenotypes challenging. Traditional phenotyping methods often rely on manual measurements, which are limited in their ability to capture complex visual traits. Deep learning provides a new opportunity to automatically extract informative and biologically meaningful features from images. In this study, we constructed a dataset of 13,254 dog images across multiple breeds and employed ResNet and ViT models to automatically extract 256-dimensional image embeddings. After dimensionality reduction using UMAP, we performed a genome-wide association study (GWAS) using the extracted features and breed-level genotype data. We identified 15 genes previously reported to be associated with dog traits such as hair length and body size, as well as novel candidate genes related to body development and hair growth, including EIF2S2, TRHR, and TCF25, which harbor variants with potential functional relevance. The recovery of known genetic associations validates the approach, which can also reveal new genotype-phenotype links. It provides a scalable framework for phenotype extraction that enables population genetic studies in domestic dogs and can facilitate breeding in other economically important species.
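
To illustrate how per-image features could be paired with breed-level genotype data, the following is a minimal sketch and not the authors' pipeline. It assumes each image has already been reduced to a single feature value (for example, one UMAP coordinate), that the breed-level phenotype is the mean of that value over a breed's images, and that genotypes are available as per-breed allele frequencies. All function names and the toy numbers are hypothetical. A plain Pearson correlation stands in for the association test; a real GWAS would typically use a regression or mixed model with correction for population structure.

```python
import math

def breed_means(values, breeds):
    """Average a per-image feature value within each breed (assumed aggregation)."""
    sums, counts = {}, {}
    for value, breed in zip(values, breeds):
        sums[breed] = sums.get(breed, 0.0) + value
        counts[breed] = counts.get(breed, 0) + 1
    return {b: sums[b] / counts[b] for b in sums}

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def scan_snps(allele_freqs, phenotype):
    """Rank SNPs by |correlation| between breed allele frequency and phenotype.

    allele_freqs: {snp_id: {breed: frequency}}
    phenotype:    {breed: value}
    """
    breeds = sorted(phenotype)
    y = [phenotype[b] for b in breeds]
    scores = {
        snp: pearson_r([freqs[b] for b in breeds], y)
        for snp, freqs in allele_freqs.items()
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Toy data, invented purely for illustration.
feature = [1.0, 3.0, 10.0, 6.0]
labels = ["breed_a", "breed_a", "breed_b", "breed_c"]
pheno = breed_means(feature, labels)
freqs = {
    "snp1": {"breed_a": 0.1, "breed_b": 0.9, "breed_c": 0.5},
    "snp2": {"breed_a": 0.5, "breed_b": 0.5, "breed_c": 0.5},
}
ranked = scan_snps(freqs, pheno)
```

Here `ranked` lists the SNPs from strongest to weakest absolute correlation with the breed-level feature; an SNP with identical frequency in every breed carries no signal and scores 0.0.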