Genome data for avian MHC

Cite this dataset

He, Ke (2021). Genome data for avian MHC [Dataset]. Dryad.


Long-read sequencing technology has the potential to greatly improve our understanding of the Major Histocompatibility Complex (MHC), a major component of the adaptive immune system that has been difficult to study because of its highly repetitive regions. In this study, we used genomes based on long-read sequencing to predict the number and location of MHC loci in a broad range of bird taxa. We also examined species that had a published genome but had not previously been studied at the MHC. In the long-read-based genomes of 28 species, we found extremely large variation in the number of MHC loci in passerines, particularly in manakins (up to 193 class II loci), which had not previously been studied at the MHC. Overall, passerines had greater numbers of both class I and class II loci than non-passerines. Our results provide the first direct evidence from passerine genomes of this high level of duplication. We also found different duplication patterns: in some species, MHC class I and class II genes were duplicated together, while in most species they were duplicated separately. Our study shows that long-read sequencing can dramatically increase our knowledge of MHC structure, although further improvements in chromosome-level assembly are needed to study the mechanisms producing these high levels of duplication.