Archived data for: Balancing selection, genetic drift, and human mediated-introgression interplay to shape MHC (functional) diversity in Mediterranean brown trout

Cite this dataset

Talarico, Lorenzo et al. (2022). Archived data for: Balancing selection, genetic drift, and human mediated-introgression interplay to shape MHC (functional) diversity in Mediterranean brown trout [Dataset]. Dryad.


The extraordinary polymorphism of Major Histocompatibility Complex (MHC) genes is considered a paradigm of pathogen-mediated balancing selection, although empirical evidence is still scarce. Furthermore, the relative contribution of balancing selection to shape MHC population structure and diversity, compared to that of neutral forces, as well as its interaction with other evolutionary processes such as hybridization, remains largely unclear. To investigate these issues, we analysed adaptive (MHC-DAB gene) and neutral (11 microsatellite loci) variation in 156 brown trout (Salmo trutta complex) from six wild populations in central Italy exposed to introgression from domestic hatchery lineages (assessed with the LDH gene). MHC diversity and structuring correlated with those at microsatellites, indicating the substantial role of neutral forces. However, individuals carrying locally rare MHC alleles/supertypes (regardless of the zygosity status and degree of sequence dissimilarity of MHC) were in better body condition (a proxy of individual fitness/parasite load), hence supporting balancing selection under rare allele advantage, but not heterozygote advantage or divergent allele advantage. The association between specific MHC supertypes and body condition confirmed in part this finding. Across populations, MHC allelic richness increased with increasing admixture between native and domestic lineages, indicating introgression as a source of MHC variation. Furthermore, introgression across populations appeared more pronounced for MHC than microsatellites, possibly because initially-rare MHC variants are expected to introgress more readily under rare allele advantage. 
Providing evidence for the complex interplay among neutral evolutionary forces, balancing selection and human-mediated introgression in shaping the pattern of MHC (functional) variation, our findings contribute to a deeper understanding of the evolution of MHC genes in wild populations exposed to anthropogenic disturbance.

Usage notes

1. Filename: “Brown_trout_MHC_fastq_paired” (folder)

Paired-end FASTQ files (*.R1 and *.R2) from an Illumina MiSeq 2x300 bp run: 15 batches, each containing barcoded MHC-DAB amplicons of 6-19 Mediterranean brown trout individuals, as specified in “Brown_trout_amplicon_data-batches.txt”.

2. Filename: “Brown_trout_amplicon_data-batches.txt”

Individual dual-tag and batch membership for all 156 Mediterranean brown trout samples. Each of the 15 batches contains 6 to 19 barcoded amplicons.

3. Filename: “MHC-DAB_genotypes_AmpliSAS.xlsx”

The AmpliSAS output file containing individual MHC-DAB alleles and their coverage for 156 Mediterranean brown trout samples. The following information is provided for each individual: the overall number of reads (DEPTH_AMPLICON); the overall number of allele reads (DEPTH_ALLELES); the number of alleles (COUNT_ALLELE); the number of reads for each allele. The following information is provided for each of the 58 MHC-DAB alleles: the nucleotide sequence (SEQUENCE); the sequence length (LENGTH); the overall number of reads (DEPTH); the number of individuals carrying that allele (SAMPLES); the allele code (ALLELE); the average within-individual read frequency (MEAN_FREQ); the maximum (MAX_FREQ) and minimum (MIN_FREQ) frequency of reads observed within an individual.

4. Filename: “MHC&STR_genotypes.xlsx”

Individual genotypes of 11 microsatellite loci, the MHC-DAB, and MHC-ST (supertypes). Data are in GenAlEx 6.5 format. Note that MHC-DAB*c, *e, *g, *k and *l alleles are coded as 501, 502, 503, 504 and 505 respectively.
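
The recoding noted above can be reversed with a small lookup. The following is a minimal sketch, not part of the archived files: the function name `decode_allele` is hypothetical, and because the notes do not say how the remaining alleles are coded, any other value is passed through unchanged.

```python
# Coding stated in the usage notes: MHC-DAB*c, *e, *g, *k and *l
# are stored as 501, 502, 503, 504 and 505 respectively.
CODE_TO_ALLELE = {501: "c", 502: "e", 503: "g", 504: "k", 505: "l"}


def decode_allele(value):
    """Return the allele letter for a recoded value (hypothetical helper).

    Values that are not one of the five documented codes are returned
    unchanged, since their coding is not described in the notes.
    """
    try:
        return CODE_TO_ALLELE.get(int(value), value)
    except (TypeError, ValueError):
        # Non-numeric labels (e.g. an allele already given as a letter).
        return value


decoded = [decode_allele(v) for v in ["501", 505, "a", 12]]
```

The pass-through behaviour is a deliberate choice so that the helper can be applied to a whole genotype column without altering alleles outside the documented recoding.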

5. Filename: “phenotypic&genetic_data.csv”

Phenotypic and genetic data of 156 Mediterranean brown trout used for testing the association between MHC and body condition. The following individual data are provided: name (Sample); origin (Pop); Fulton’s body condition factor (BC); LDH-C1 genotype (LDH: native = 100/100, hybrid = 90/100, domestic = 90/90); mean heterozygosity computed over 11 microsatellite loci (H); genetic dissimilarity between MHC alleles computed as Kimura 2-parameter nucleotide distance (K2P_dist) and Poisson-corrected amino acid distance (Poi_dist); MHC-DAB genotype (DAB1 and DAB2); genotype based on MHC-supertypes (ST1 and ST2); within-population frequency of MHC-DAB allele 1 and 2 (freq_DAB1 and freq_DAB2); within-population frequency of supertype 1 and 2 (freq_ST1 and freq_ST2). Note that MHC-DAB*c, *e, *g, *k and *l alleles are coded as 501, 502, 503, 504 and 505 respectively.
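
As an illustration of how within-population allele frequencies such as freq_DAB1 and freq_DAB2 relate to the genotype columns, here is a minimal sketch using toy data. It assumes a CSV with the column names Pop, DAB1 and DAB2 listed above; the toy rows, the function name, and the choice to count every allele copy in the population (including the focal individual's) are assumptions, not a description of how the archived values were computed.

```python
import csv
import io
from collections import Counter, defaultdict


def within_pop_allele_freqs(rows, pop_col="Pop", allele_cols=("DAB1", "DAB2")):
    """Return {population: {allele: frequency}} from diploid genotype rows."""
    counts = defaultdict(Counter)
    for row in rows:
        for col in allele_cols:
            # Each individual contributes two allele copies to its population.
            counts[row[pop_col]][row[col]] += 1
    freqs = {}
    for pop, counter in counts.items():
        total = sum(counter.values())
        freqs[pop] = {allele: n / total for allele, n in counter.items()}
    return freqs


# Toy data only: sample names, populations and genotypes are invented.
toy_csv = """Sample,Pop,DAB1,DAB2
s1,A,a,b
s2,A,a,a
s3,B,b,501
s4,B,b,b
"""
rows = list(csv.DictReader(io.StringIO(toy_csv)))
freqs = within_pop_allele_freqs(rows)
```

To apply the same function to the archived file, the toy string would be replaced by an open handle on “phenotypic&genetic_data.csv”, assuming its header matches the column names given in the notes.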

6. Filename: “Randomize_supertypes.R”

R-script to iteratively assign MHC alleles to supertypes at random and calculate per-population differentiation (Nei’s GST, Hedrick’s GST and Jost’s D) following the randomization approach described in Lighten et al. (2017). The script also generates a plot of observed versus simulated population differentiation, along with Holm-Bonferroni corrected p-values. Note that you may run the randomization analysis with our published MHC data by using the “phenotypic&genetic_data.csv” as input file.
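
The general idea of the randomization can be sketched as follows. This is an illustrative Python outline on toy data, not a port of the archived R script: it assumes that the observed supertype labels are shuffled among alleles (so the number of alleles per supertype is preserved), it uses plain, not bias-corrected, estimators of Nei's GST and Jost's D, and it omits Hedrick's GST, the plot and the Holm-Bonferroni correction. All function names and the toy genotypes are hypothetical.

```python
import random
from collections import Counter


def supertype_freqs(genotypes, allele_to_st):
    """genotypes: {pop: [(allele1, allele2), ...]} -> {pop: {supertype: freq}}."""
    out = {}
    for pop, individuals in genotypes.items():
        counts = Counter(allele_to_st[a] for g in individuals for a in g)
        total = sum(counts.values())
        out[pop] = {st: n / total for st, n in counts.items()}
    return out


def differentiation(freqs):
    """Return (Nei's GST, Jost's D) from per-population frequencies."""
    n = len(freqs)  # number of populations, must be at least 2
    supertypes = {st for f in freqs.values() for st in f}
    # HS: mean within-population expected heterozygosity.
    hs = sum(1 - sum(p * p for p in f.values()) for f in freqs.values()) / n
    # HT: expected heterozygosity of the pooled (mean) frequencies.
    ht = 1 - sum(
        (sum(f.get(st, 0.0) for f in freqs.values()) / n) ** 2
        for st in supertypes
    )
    gst = (ht - hs) / ht if ht > 0 else 0.0
    jost_d = ((ht - hs) / (1 - hs)) * (n / (n - 1))
    return gst, jost_d


def shuffle_assignment(allele_to_st, rng):
    """Shuffle supertype labels among alleles (assumed randomization scheme)."""
    alleles = sorted(allele_to_st)
    labels = [allele_to_st[a] for a in alleles]
    rng.shuffle(labels)
    return dict(zip(alleles, labels))


def randomize(genotypes, allele_to_st, n_iter=200, seed=1):
    """Return observed Jost's D and a list of D values under random assignment."""
    rng = random.Random(seed)
    observed = differentiation(supertype_freqs(genotypes, allele_to_st))[1]
    simulated = []
    for _ in range(n_iter):
        shuffled = shuffle_assignment(allele_to_st, rng)
        simulated.append(differentiation(supertype_freqs(genotypes, shuffled))[1])
    return observed, simulated


# Toy data only: two populations fixed for different supertypes.
toy_genotypes = {"A": [("a", "b"), ("a", "a")], "B": [("c", "d"), ("d", "d")]}
toy_supertypes = {"a": "ST1", "b": "ST1", "c": "ST2", "d": "ST2"}
observed_d, simulated_d = randomize(toy_genotypes, toy_supertypes)
```

Comparing the observed value with the distribution of simulated values then indicates whether supertype-level differentiation departs from what random grouping of alleles would produce; the direction and correction of that test should follow the archived script and Lighten et al. (2017).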

7. Filename: “Appendices_R1_final.docx”

Manuscript’s supplementary information: 10 Appendices