Data from: Large-scale genotyping of highly polymorphic loci by next generation sequencing: how to overcome the challenges to reliably genotype individuals?
Ferrandiz-Rovira, Mariona et al. (2015), Data from: Large-scale genotyping of highly polymorphic loci by next generation sequencing: how to overcome the challenges to reliably genotype individuals?, Dryad, Dataset, https://doi.org/10.5061/dryad.rp7n9
Studying the different roles of adaptive genes is still a challenge in evolutionary ecology and requires reliable genotyping of large numbers of individuals. Next-generation sequencing (NGS) techniques enable such large-scale sequencing, but stringent data processing is required. Here, we develop an easy-to-use methodology to process amplicon-based NGS data, and we apply it to reliably genotype four major histocompatibility complex (MHC) loci belonging to MHC class I and II of Alpine marmots (Marmota marmota). Our post-processing methodology allowed us to increase the number of retained reads. The quality of genotype assignment was further assessed using three independent validation procedures. A total of 3069 high-quality MHC genotypes were obtained at four MHC loci for 863 Alpine marmots, with a genotype assignment error rate estimated at 0.21%. The proposed methodology could be applied to any genetic system and any organism, except when extensive copy-number variation occurs (that is, genes with a variable number of copies in the genotype of an individual). Our results highlight the potential of amplicon-based NGS techniques, combined with adequate post-processing, to obtain the large-scale, highly reliable genotypes needed to understand the evolution of highly polymorphic functional genes.