Genes of the major histocompatibility complex (MHC) exhibit heterozygote advantage in immune defence, which in turn can select for MHC-disassortative mate choice. However, many species lack this expected pattern of MHC-disassortative mating. A possible explanation lies in evolutionary processes following gene duplication: if two duplicated MHC genes become functionally diverged from each other, offspring will inherit diverse multilocus genotypes even under random mating. We used locus-specific primers for high-throughput sequencing of two expressed MHC Class II B genes in Leach's storm-petrels, Oceanodroma leucorhoa, and found that exon 2 alleles fall into two gene-specific monophyletic clades. We tested for disassortative vs. random mating at these two functionally diverged Class II B genes, using multiple metrics and different subsets of exon 2 sequence data. With good statistical power, we consistently found random assortment of mates at MHC. Despite random mating, birds had MHC genotypes with functionally diverged alleles, averaging 13 amino acid differences in pairwise comparisons of exon 2 alleles within individuals. To test whether this high MHC diversity in individuals is driven by evolutionary divergence of the two duplicated genes, we built a phylogenetic permutation model. The model showed that genotypic diversity was strongly impacted by sequence divergence between the most common allele of each gene, with a smaller additional impact of monophyly of the two genes. Divergence of allele sequences between genes may have reduced the benefits of actively seeking MHC-dissimilar mates, in which case the evolutionary history of duplicated genes is shaping the adaptive landscape of sexual selection.
MHC allele sequences
24 MHC Class II B alleles found at Ocle-DAB1 and Ocle-DAB2 in this sample of 188 adults and 22 nestlings. Each allele is trimmed to exon 2 and is described with a DNA sequence, an amino acid translation, and a GenBank accession number. Four alleles have an in-frame 3-bp deletion; this is indicated by '---' in the DNA sequence, but the gap is closed in the amino acid sequence.
allele_sequences.txt
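The relationship between a gapped DNA sequence and its gap-free amino acid translation can be sketched in Python as follows. This is a minimal illustration, not the procedure used to build the file: the function name `translate_closing_gaps` is hypothetical, and the sketch assumes each trimmed exon 2 sequence begins at the first position of a codon and uses the standard genetic code. Gap characters are removed before translation, so the resulting protein sequence contains no gap.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_closing_gaps(dna):
    """Translate a DNA string, dropping '-' gap characters first.

    Assumption (not stated in the source): the sequence starts in frame.
    Codons containing ambiguous bases are reported as 'X'.
    """
    seq = dna.upper().replace("-", "")
    if len(seq) % 3 != 0:
        raise ValueError("ungapped sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3)
    )
```

An allele carrying the 3-bp deletion therefore yields a translation one residue shorter than an allele without it.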
What genetic data are available for each bird
List of genetic data obtained for individuals in each of 94 nests. Columns are Year (year of nesting attempt: 2010 or 2013), Nest # (burrow number, unique within year but not between years), Adult Female Band # (leg band # of breeding female), Adult Male Band # (leg band # of breeding male), and then a series of Yes/No columns reporting whether genetic data are available for parents and for nestlings at MHC genes and at microsatellite loci.
data_list_by_nest.csv
Microsatellite genotypes
Genotypes of 188 adults and 34 nestlings at 15 microsatellite loci. File is in GenePop format. The file has 3 introductory rows: 1) metadata, 2) 15 locus names, 3) 'POP'. Next is a table of 222 rows x 16 tab-delimited columns. The first column of the table is the identity of the bird. Adults are identified by the last 5 digits of their Canadian Wildlife Service metal leg band. Nestlings were too young to band and thus are identified by their nest number and the suffix -Chk (for 'Chick'). The remaining 15 columns are microsatellite genotypes at 15 loci, with the loci ordered according to the names in the second row of the file. Genotypes are 6 digits: two 3-digit numbers, each corresponding to the size (in base pairs) of an allele. Missing genotypes are represented by a single zero.
microsatellite_genotypes.txt
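A file with the layout described above could be read with a short Python sketch like the one below. The function names `parse_genotype` and `parse_genepop` are hypothetical. The sketch assumes that locus names on the second row are separated by commas or whitespace, and it strips a trailing comma from the bird identifier in case one is present, as is common in GenePop files; neither detail is stated in the description.

```python
import re


def parse_genotype(field):
    """Return (allele1_bp, allele2_bp), or None for a missing genotype ('0')."""
    field = field.strip()
    if field == "0":
        return None
    if len(field) != 6 or not field.isdigit():
        raise ValueError(f"unexpected genotype field: {field!r}")
    return int(field[:3]), int(field[3:])


def parse_genepop(text):
    """Parse metadata row, locus-name row, 'POP' row, then one row per bird."""
    rows = [row for row in text.splitlines() if row.strip()]
    # Assumption: locus names are separated by commas and/or whitespace.
    loci = [name for name in re.split(r"[,\s]+", rows[1].strip()) if name]
    if rows[2].strip().upper() != "POP":
        raise ValueError("expected 'POP' on the third row")
    birds = {}
    for row in rows[3:]:
        fields = row.split("\t")
        # Assumption: tolerate a GenePop-style trailing comma after the ID.
        bird_id = fields[0].strip().rstrip(",").strip()
        genotypes = fields[1:]
        if len(genotypes) != len(loci):
            raise ValueError(f"{bird_id}: expected {len(loci)} genotypes")
        birds[bird_id] = {
            locus: parse_genotype(g) for locus, g in zip(loci, genotypes)
        }
    return loci, birds
```

Called on the contents of the file, `parse_genepop` would be expected to return 15 locus names and a dictionary with 222 birds, keyed by band number or by nest number with the -Chk suffix.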
MHC genotypes without Copy Number Variation
MHC genotypes of 188 adults + 22 chicks assuming no Copy Number Variation, in a table of 211 rows x 9 columns. Genotypes were determined from Illumina sequencing based on an assumption of no Copy Number Variation as described in Supporting Information lines 104-127. First row is header, and each remaining row corresponds to one bird. Columns are Year (year of nesting attempt: 2010 or 2013), Nest # (burrow number, unique within year but not between years), Age (Adult or Chick), Sex (determined by PCR), CWS Bird Band #, Ocle-DAB1*allele1, Ocle-DAB1*allele2, Ocle-DAB2*allele1, and Ocle-DAB2*allele2.
MHC_noCNV.csv
MHC genotypes when allowing Copy Number Variation
MHC genotypes of 188 adults + 22 chicks allowing Copy Number Variation, in a table of 211 rows x 12 columns. Genotypes were determined from Illumina sequencing based on an algorithm that permits Copy Number Variation as described in Supporting Information lines 129-145. First row is header, and each remaining row corresponds to one bird. Columns are Year (year of nesting attempt: 2010 or 2013), Nest # (burrow number, unique within year but not between years), Age (Adult or Chick), Sex (determined by PCR), CWS Bird Band #, Ocle-DAB1*allele1, Ocle-DAB1*allele2, Ocle-DAB1*allele3, Ocle-DAB2*allele1, Ocle-DAB2*allele2, Ocle-DAB2*allele3, and Ocle-DAB2*allele4.
MHC_withCNV.csv
Microsoft Excel macro for mate choice randomization tests
Randomization tests were used to test whether mean and variance in MHC similarity of actual mates were significantly different from random. These tests were run in Microsoft Excel using a macro to create 10,000 iterations of randomly pairing each of the 94 females with one male (without replacement), saving the average and variance of the MHC similarity between random mates for each iteration. This file is an example of one such Excel file -- in this case, the one that analyzes average p-distance between MHC alleles of mates using all 89 amino acids of exon 2, the results of which are shown in the graph in Figure S4a in Supplemental Information. Similar analyses were conducted for other metrics of MHC similarity and for microsatellite-based estimates of relatedness coefficients between mates. SEE ReadMe FILE FOR DETAILS OF FILE CONTENTS.
Excel_macro.xls
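For readers without Excel, the logic of the randomization test described above can be sketched in Python. This is not the macro itself. The function name `randomization_test` and the input layout are assumptions: `sim[f][m]` holds the MHC similarity between female `f` and male `m`, and actual mates share the same index. The sketch uses sample variance and reports a one-tailed proportion of random means at or below the observed mean; both choices are illustrative and may differ from the macro.

```python
import random
from statistics import mean, variance


def randomization_test(sim, iterations=10_000, seed=0):
    """Compare similarity of actual mates with that of randomly paired mates.

    sim[f][m] is the similarity between female f and male m; the actual
    mate of female i is assumed to be male i (hypothetical layout).
    """
    n = len(sim)
    rng = random.Random(seed)
    observed = [sim[i][i] for i in range(n)]
    obs_mean, obs_var = mean(observed), variance(observed)

    males = list(range(n))
    null_means, null_vars = [], []
    for _ in range(iterations):
        # Shuffling pairs each female with one male, without replacement.
        rng.shuffle(males)
        values = [sim[f][males[f]] for f in range(n)]
        null_means.append(mean(values))
        null_vars.append(variance(values))

    # Illustrative one-tailed summary: share of random means <= observed.
    frac_le = sum(m <= obs_mean for m in null_means) / iterations
    return {
        "obs_mean": obs_mean,
        "obs_var": obs_var,
        "null_means": null_means,
        "null_vars": null_vars,
        "frac_null_mean_le_obs": frac_le,
    }
```

With the 94 pairs in this data set, `sim` would be a 94 x 94 table, and the same function could be reused for other similarity metrics or for relatedness coefficients.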
Permutation Model - set up arrays
The permutation of the phylogeny was conducted with a set of scripts written and run in 4th Dimension (4D Inc., San Jose, CA). This first script takes the imported data in Individuals, Distances, and Alleles and puts them into arrays.
DD_BS_Setup_Reference_Arrays.txt
Permutation Model - simulation core
The permutation of the phylogeny was conducted with a set of scripts written and run in 4th Dimension (4D Inc., San Jose, CA). This second script is the simulation core. It creates N trial records, each composed of a randomized distribution of alleles in the phylogenetic framework.
DD_BS_DistributeAlleles.txt
Permutation Model - rare case simulation
The permutation of the phylogeny was conducted with a set of scripts written and run in 4th Dimension (4D Inc., San Jose, CA). This third script is the same as the simulation core, but modified to handle the rarity of very low or very high monophyly.
DD_BS_Distribute_Special.txt
Permutation Model - calculate within an individual
The permutation of the phylogeny was conducted with a set of scripts written and run in 4th Dimension (4D Inc., San Jose, CA). This fourth script looks up pairwise distances between alleles and then calculates the average distance among the 4 alleles within an individual.
DD_Rtn_Mean_Dist.txt
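The within-individual calculation can be illustrated with a short Python sketch. It is not a translation of the 4th Dimension script. The function name `mean_within_individual_distance` and the lookup structure are hypothetical, and the sketch assumes that all six pairwise comparisons among the 4 alleles are weighted equally and that two copies of the same allele have a distance of zero.

```python
from itertools import combinations


def mean_within_individual_distance(alleles, distances):
    """Average pairwise distance among the alleles carried by one individual.

    alleles   : list of allele names, e.g. 2 at each of the two genes.
    distances : dict mapping frozenset({a, b}) to the distance between
                distinct alleles a and b (hypothetical lookup structure).
    """
    pairs = list(combinations(alleles, 2))  # 6 pairs for 4 alleles
    total = 0.0
    for a, b in pairs:
        # Assumption: identical alleles (homozygotes) contribute zero.
        total += 0.0 if a == b else distances[frozenset((a, b))]
    return total / len(pairs)
```

Averaging the returned value over all birds in a trial would give the population-level mean of within-individual means.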
Permutation Model - calculate means across individuals
The permutation of the phylogeny was conducted with a set of scripts written and run in 4th Dimension (4D Inc., San Jose, CA). Once the trials are made, this fifth and final script calculates the mean of the mean distances between alleles within individuals in the population.
DD_BS_Calc_means_from_trial.txt