Dryad

Fitness consequences of structural variation inferred from a House Finch pangenome

Data files

Nov 07, 2024 version: files total 64.89 GB


Abstract

Genomic structural variants (SVs) play a crucial role in adaptive evolution, yet their average fitness effects and characterization with pangenome tools are understudied in wild animal populations. We constructed a pangenome for House Finches, a model for studies of host-pathogen coevolution, using long-read sequence data from 16 individuals (32 de novo-assembled haplotypes) and one outgroup. We identified 643,207 SVs larger than 50 base pairs, most of which (60%) involved repetitive elements, and found reduced SV diversity in the eastern US as a result of the species' introduction there by humans. We estimated the distribution of fitness effects of genome-wide SVs using maximum likelihood approaches and found SVs in both coding and non-coding regions to be, on average, more deleterious than smaller indels or single nucleotide polymorphisms. Our reference-free pangenome facilitated the discovery of a 10-million-year-old, 11-megabase-long pericentric inversion that captures genes related to immunity and telomerase activity. The inversion showed signatures of balancing selection, and its genotype frequency increased steadily over the 25 years since House Finches were first exposed to the bacterial pathogen Mycoplasma gallisepticum. We also observed shorter telomeres in populations with more years of exposure to Mycoplasma. Our study illustrates the utility of applying pangenome methods to wild animal populations, helps estimate the fitness effects of genome-wide SVs, and advances our understanding of adaptive evolution through structural variation.