Data from: Evolutionary variation in gene conversion at the avian MHC is explained by fluctuating selection, gene copy numbers, and life history
Data files
Jun 23, 2024 version, files total 3.40 MB
- MHC_I_seq.fasta (2.04 MB)
- MHC_II_seq.fasta (1.36 MB)
- README.md (847 B)
Abstract
The Major Histocompatibility Complex (MHC) multigene family encodes key pathogen-recognition molecules of the vertebrate adaptive immune system. Hyper-polymorphism of MHC genes is generated de novo by point mutations, but new haplotypes may also arise by re-shuffling of existing variation through intra- and inter-locus gene conversion. Although the occurrence of gene conversion at the MHC has been known for decades, we still have limited understanding of its functional importance. Here, I took advantage of extensive genetic resources (~9000 sequences) to investigate broad-scale macroevolutionary patterns in gene conversion processes at the MHC across nearly 200 avian species. Gene conversion was found to constitute a universal mechanism in birds, as 83% of species showed footprints of gene conversion at either MHC class and 25% of all allelic variants were attributed to gene conversion. Gene conversion processes were stronger at MHC-II than MHC-I, but inter-specific variation at both MHC classes was explained by similar evolutionary scenarios, reflecting fluctuating selection towards different optima and drift. Gene conversion showed an uneven phylogenetic distribution across birds and was driven by gene copy number variation, supporting a significant role of inter-locus gene conversion processes in the evolution of the avian MHC. Finally, MHC gene conversion was stronger in species with fast life histories (high fecundity) and in long-distance migrants, likely reflecting variation in population sizes and host-pathogen coevolutionary dynamics. The results provide a robust comparative framework for understanding macroevolutionary variation in gene conversion at the avian MHC and reinforce the important contribution of this mechanism to functional MHC diversity.
Description of the data and file structure
The dataset contains two fasta files with avian Major Histocompatibility Complex (MHC) sequences:
- MHC_I_seq.fasta - MHC class I exon 3 sequences (n = 5373)
- MHC_II_seq.fasta - MHC class II exon 2 sequences (n = 3574)
The sequences were used to infer species-specific gene conversion signals at the avian MHC genes.
All sequences were retrieved from the NCBI GenBank database. The GenBank number and species are provided for each sequence.
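As a quick integrity check after download, the number of records in each file can be compared with the counts stated above. The following is a minimal sketch in Python using only the standard library. It assumes the files follow the usual FASTA convention (a header line starting with `>` followed by one or more sequence lines); the exact layout of the header text is not specified here, so headers are returned as unparsed strings. The function names `read_fasta` and `count_records` are illustrative and are not part of the dataset or its accompanying code.

```python
from pathlib import Path
from typing import Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file.

    Assumes standard FASTA: header lines start with '>', and a sequence
    may span several lines. The header is returned without the '>' and
    is not parsed further, because its exact layout is not documented.
    """
    header = None
    chunks = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:].strip()
                chunks = []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_records(path: str) -> int:
    """Return the number of FASTA records in the file."""
    return sum(1 for _ in read_fasta(path))


if __name__ == "__main__":
    # Counts as stated in the data description.
    stated = {"MHC_I_seq.fasta": 5373, "MHC_II_seq.fasta": 3574}
    for name, expected in stated.items():
        if Path(name).exists():
            n = count_records(name)
            print(f"{name}: {n} records (description states {expected})")
```

Run from the directory containing the two FASTA files, the script prints the observed record count next to the stated one for each file that is present.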
Code/Software
The Minias_et_al_code.txt file contains the R code used for the phylogenetically informed comparative analyses of gene conversion signals at the avian Major Histocompatibility Complex (MHC) genes.