Data for: MHC I of the great reed warbler promotes a flat peptide binding mode
Data files
Jul 07, 2025 version files (16.72 MB total)
- 387_Acar_MHCI_ex3.translated.fasta (39.13 KB)
- 52_Gallus_gallus_BF2_ex3.fasta (4.96 KB)
- 85_Gallus_gallus_BF2_ex2-4.fasta (27.28 KB)
- Data_analyses.R (5.26 KB)
- Individual_allele_counts.csv (2.55 KB)
- Individual_MHCI_alleles.zip (168.65 KB)
- MCMCglmm_models.RData (716.90 KB)
- README.md (5.26 KB)
- trees.nex (15.75 MB)
Abstract
The major histocompatibility complex (MHC) plays a key role in pathogen recognition as part of the adaptive immune system. MHC I gene copy numbers in birds of the order Passeriformes (songbirds) are substantially higher than in other birds. MHC I diversity and antigen presentation have been carefully characterized in the chicken Gallus gallus of the order Galliformes; chickens express few MHC I genes and often present antigens that bulge out of the peptide binding cleft. This observation raises the question of whether MHC I presents antigens in a similar way in species with many MHC genes. Here, we present the X-ray structure of MHC I from the great reed warbler Acrocephalus arundinaceus (Acar3), a long-distance migratory songbird. Structural analysis shows that MHC I binds the antigen in a flat conformation due to a sequentially well-conserved restriction point within the peptide binding groove, created by Arg97 and Arg155, which acts like a pair of tweezers. This more stringent antigen presentation by Acar MHC I molecules may partly explain the high MHC gene copy numbers seen in the great reed warbler.
Dataset DOI: 10.5061/dryad.cvdncjtg1
Description of the data and file structure
This dataset accompanies the manuscript “MHC I of the great reed warbler promotes a flat peptide binding mode” by Venskutonytė et al. submitted to the journal Immunology for review in June 2025. The dataset contains alignment files of avian MHC I exon 3 sequences that were used in analyses within the manuscript as well as an R script and associated data used to perform the phylogenetic comparative analyses described within the manuscript.
Usage notes
The following is a list of file types contained within this repository, with brief descriptions of how to work with them:
.R: R scripts that can be viewed, edited and run using the statistical software R (https://www.r-project.org/). R will run on Windows, MacOS, and a wide variety of UNIX platforms.
.RData: RData files can be opened in R and contain objects from an R session.
.csv: Plain text files that use a delimiter symbol (in this case semicolons) to separate values and new lines to separate rows. These files can be viewed and edited with any plain text editor (e.g., Linux less command, Nano, Vim, Text Editor). They can also be opened in Excel by specifying the delimiter symbol (in this case semicolons).
.fasta: Plain text sequence information in FASTA format (the alignments listed under ‘Files and variables’ are amino acid alignments). These files can be viewed and edited in any of the many free sequence editing programs (e.g., BioEdit, AliView, Jalview) but also in any plain text editor (e.g., Linux less command, Nano, Vim, Text Editor).
.nex: Nexus files store phylogenetic tree data in plain text format (https://evomics.org/resources/tree-formats/). These files can be viewed and edited with any plain text editor (e.g., Linux less command, Nano, Vim, Text Editor). They can also be read into software used to create visual representations of phylogenetic trees such as FigTree or iTOL.
.zip: A zipped folder (compressed to reduce storage space). It can be unzipped by double clicking on MacOS, or by right clicking and selecting ‘Extract All’ on a Windows computer.
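For readers who prefer to inspect the .fasta files programmatically rather than in a sequence editor, the following is a minimal Python sketch of a FASTA reader. It is not part of the dataset (the authors' analyses are in R), and the headers and sequences in the example are hypothetical placeholders, not taken from the deposited files.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary.

    Sequences that span several lines are joined. If a header occurs twice,
    the later record replaces the earlier one.
    """
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


# Hypothetical two-record example; real headers and sequences will differ.
example = ">allele_A\nGSHSL\nRYF\n>allele_B\nGSHSMRYF\n"
records = read_fasta(example.splitlines())

# In an alignment, every sequence should have the same length.
lengths = {len(seq) for seq in records.values()}
print(len(records), "sequences; distinct lengths:", lengths)
```

To apply it to one of the deposited files, pass an open file handle, for example `read_fasta(open("52_Gallus_gallus_BF2_ex3.fasta"))`.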
Files and variables
File: Data_analyses.R
Description: R script for all phylogenetic comparative analyses conducted in the study (uses “Individual_allele_counts.csv” & “trees.nex” as input).
File: 85_Gallus_gallus_BF2_ex2-4.fasta
Description: Amino acid alignment of 85 unique chicken Gallus gallus MHC I exon 2-4 sequences.
File: 387_Acar_MHCI_ex3.translated.fasta
Description: Amino acid alignment of 387 unique great reed warbler Acrocephalus arundinaceus MHC I exon 3 sequences.
File: Individual_allele_counts.csv
Description: Data frame of the total number of MHC I alleles and the number of MHC I alleles that contain Arg155 for all 81 individuals across 32 songbird species. Information from ‘Individual_MHCI_alleles.zip’ was used to compile this data frame.
Variables
- Species: species from which the MHC I was sequenced
- Sample: name of individual
- Nr_Alleles: number of MHC I alleles per individual (determined from examining alignments in 'Individual_MHCI_alleles.zip').
- Nr_R155_Alleles: number of MHC I alleles with an arginine present at position 155 (determined from examining alignments in 'Individual_MHCI_alleles.zip').
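As an illustration of how these variables relate to one another, the sketch below reads a semicolon-delimited table with the four columns listed above and computes, for each individual, the fraction of alleles carrying an arginine at position 155. It is written in Python for illustration only and is not the authors' R workflow. The rows in the example are made up, and the sketch assumes that the file has a header row with exactly these column names and that both counts are integers.

```python
import csv
import io


def r155_fractions(handle):
    """Return {Sample: Nr_R155_Alleles / Nr_Alleles} from a semicolon-delimited table."""
    reader = csv.DictReader(handle, delimiter=";")
    fractions = {}
    for row in reader:
        total = int(row["Nr_Alleles"])
        with_r155 = int(row["Nr_R155_Alleles"])
        # Guard against division by zero for an individual with no alleles recorded.
        fractions[row["Sample"]] = with_r155 / total if total else float("nan")
    return fractions


# Made-up rows for illustration; species names, sample names and counts are hypothetical.
example = io.StringIO(
    "Species;Sample;Nr_Alleles;Nr_R155_Alleles\n"
    "Species_one;ind_01;10;4\n"
    "Species_two;ind_02;8;8\n"
)
fractions = r155_fractions(example)
print(fractions)
```

With the deposited file, the handle would be opened as `open("Individual_allele_counts.csv", newline="")`.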
File: 52_Gallus_gallus_BF2_ex3.fasta
Description: Amino acid alignment of 52 unique chicken Gallus gallus MHC I exon 3 sequences.
File: Individual_MHCI_alleles.zip
Description: Compressed folder containing 81 fasta files of aligned MHC I partial exon 3 sequences corresponding to individuals from 32 songbird species. See ‘Individual_allele_counts.csv’ to match file names to individuals and species.
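The archive can also be inspected without unzipping it. The Python sketch below lists the FASTA members of a zip archive using the standard library. The file extensions it looks for and the member names in the example are assumptions, since the internal layout of the archive is not described here. The example builds a small archive in memory so that it runs on its own.

```python
import io
import zipfile


def list_fasta_members(zip_source):
    """Return the sorted names of FASTA files inside a zip archive.

    Entries under '__MACOSX/' (metadata that macOS may add when zipping)
    are skipped. The accepted extensions are an assumption.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith((".fasta", ".fa", ".fas"))
            and not name.startswith("__MACOSX/")
        )


# Build a tiny in-memory archive with hypothetical member names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("ind_01.fasta", ">a\nGSHS\n")
    archive.writestr("ind_02.fasta", ">b\nGSHS\n")
    archive.writestr("notes.txt", "not a sequence file\n")
buffer.seek(0)

members = list_fasta_members(buffer)
print(len(members), "FASTA files:", members)
```

Calling `list_fasta_members("Individual_MHCI_alleles.zip")` on the deposited archive should report 81 files if the extension assumption holds.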
File: MCMCglmm_models.RData
Description: RData object containing all output of analyses described in R script (Data_analyses.R).
File: trees.nex
Description: 10,000 trees from the posterior distribution of Jetz et al. (2012) for the 32 species in the study. These are used within the R script (Data_analyses.R) to make a maximum clade credibility tree.
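A quick sanity check before running the analysis is to confirm how many trees the Nexus file contains. The Python sketch below counts tree statements line by line, so the file does not need to be loaded into memory at once. It assumes that each tree statement starts on its own line in the form `tree NAME = ...`, which is common in Nexus tree files but is not verified against this particular file. The example trees are hypothetical.

```python
import re

# Matches the start of a Nexus tree statement such as "tree tree_1 = [&R] (...);"
TREE_STATEMENT = re.compile(r"^\s*tree\s+\S+\s*=", re.IGNORECASE)


def count_trees(lines):
    """Count lines that begin a Nexus tree statement."""
    return sum(1 for line in lines if TREE_STATEMENT.match(line))


# Hypothetical miniature Nexus file with two trees.
example = """#NEXUS
begin trees;
    tree tree_1 = [&R] ((A:1,B:1):1,C:2);
    tree tree_2 = [&R] ((A:1,C:1):1,B:2);
end;
"""
n_trees = count_trees(example.splitlines())
print(n_trees, "trees found")
```

For the deposited file, `count_trees(open("trees.nex"))` streams through the file one line at a time.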
Code/software
The script ‘Data_analyses.R’ can be run in R using the input files ‘Individual_allele_counts.csv’ and ‘trees.nex’ to recreate the phylogenetic comparative analyses described in the manuscript. This script was run in R version 4.4.0 (2024-04-24) -- “Puppy Cup”. The versions of the packages used are as follows: phangorn_2.12.1; MCMCglmm_2.36; ape_5.8; dplyr_1.1.4.
Access information
The data in ‘Individual_MHCI_alleles.zip’ was derived from O’Connor et al. 2018, ‘The evolution of immunity in relation to colonization and migration’, Nature Ecology and Evolution, 2: 841–849. The sequences have GenBank accession codes MF477947–MF478976.
The data in ‘trees.nex’ was derived from Jetz et al. 2012, ‘The global diversity of birds in space and time’, Nature 491: 444–448. It can be publicly accessed using the ‘Phylogeny Subsets’ tool at https://birdtree.org/