Data from: Identifying targets of selection in mosaic genomes with machine learning: applications in Anopheles gambiae for detecting sites within locally adapted chromosomal inversions
He, Qixin, University of Michigan-Ann Arbor
Knowles, L. Lacey, University of Michigan-Ann Arbor
Published Mar 15, 2016 on Dryad.
Cite this dataset
He, Qixin; Knowles, L. Lacey (2016). Data from: Identifying targets of selection in mosaic genomes with machine learning: applications in Anopheles gambiae for detecting sites within locally adapted chromosomal inversions [Dataset]. Dryad. https://doi.org/10.5061/dryad.h0649
Chromosomal inversions are important structural changes that may facilitate divergent selection when they capture co-adaptive loci in the face of gene flow. However, identifying selection targets within inversions can be challenging. The high degrees of differentiation between heterokaryotypes, as well as the differences in demographic histories of collinear regions compared with inverted ones, reduce the power of traditional outlier analyses for detecting selected loci. Here, we develop a new approach that uses discriminant functions informed by inversion-specific expectations to classify loci that are under selection (or drift). Analysis of RAD sequencing data we collected in a classic dipteran species with polymorphic inversion clines (Anopheles gambiae, a malaria vector species from sub-Saharan Africa) demonstrates the benefits of the approach compared with traditional outlier analyses. We focus specifically on two polymorphic inversions, the 2La and 2Rb arrangements that predominate in dry habitats and the 2L+(a) and 2R+(b) arrangements in wet habitats, which contrast with the minimal geographic structure of SNPs from collinear regions. With our approach, we identify two strongly selected regions within 2La associated with dry habitat. Moreover, we show that the prevalence of selection is greater in the arrangement 2L+(a) that is associated with wet habitat (unlike the presumed importance of selective divergence associated with the shift of the mosquitoes into dry habitats). We discuss the implications of these results with respect to studies of rapid adaptation in these malaria vectors, and in particular, the insights our newly developed approach offers for identifying not only potential targets of selection, but also the population that has undergone adaptive change.
Bootstrap ranges of parameters for the 2La and 2Rb regions, and msms simulation settings file.
All individuals VCF
VCF file of SNPs of all individuals.
Sampling location, species, karyotypes of all
fastsimcoal2 input files of different scenarios.
Region statistics of 2La and 2Rb and posterior probability of predicted scenarios.