Data from: Unconstrained evolution in short introns?—An analysis of genome-wide polymorphism and divergence data from Drosophila
Clemente, Florian; Vogl, Claus (2012), Data from: Unconstrained evolution in short introns?—An analysis of genome-wide polymorphism and divergence data from Drosophila, Dryad, Dataset, https://doi.org/10.5061/dryad.t201q
An unconstrained reference sequence facilitates the detection of selection. In Drosophila, sequence variation in short introns seems to be least influenced by selection and dominated by mutation and drift. Here, we test this with genome-wide sequences from an African population (Malawi) of D. melanogaster and data from the related outgroup species D. simulans, D. sechellia, D. erecta, and D. yakuba. The distribution of mutations deviates from equilibrium, and the content of A and T (AT) nucleotides shows an excess of variance among introns. We explain this by a complex mutational pattern: a shift in mutational bias towards AT, leading to a slight non-equilibrium in base composition, and context-dependent mutation rates, with GC sites mutating most frequently in AT-rich introns. By comparing the corresponding allele frequency spectra of AT-rich versus GC-rich introns, we can rule out the influence of directional selection or biased gene conversion (BGC) on the mutational pattern. Compared to neutral equilibrium expectations, polymorphism spectra show an excess of low-frequency and a paucity of intermediate-frequency variants, irrespective of the direction of mutation. Combining the information from different outgroups with the polymorphism data and using a generalized linear model, we find evidence for shared ancestral polymorphism between D. melanogaster and D. simulans/D. sechellia, arguing against a bottleneck in D. melanogaster. Generally, we find that short introns can be used as a neutral reference on a genome-wide level if the spatially and temporally varying mutational pattern is accounted for.
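As an illustration of what comparing a polymorphism spectrum to neutral equilibrium expectations can look like, the following minimal Python sketch uses the textbook result that, under the standard neutral model with constant population size, the expected number of sites with derived-allele count i in a sample of n sequences is proportional to 1/i. This is not the authors' analysis (which also involves outgroup data and a generalized linear model); the function names and the toy counts are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def neutral_sfs_proportions(n):
    """Expected share of segregating sites in each derived-allele count
    class i = 1..n-1 under the standard neutral model (weights 1/i)."""
    weights = [Fraction(1, i) for i in range(1, n)]
    total = sum(weights)
    return [float(w / total) for w in weights]


def sfs_deviation(observed_counts):
    """Ratio of observed to expected share per frequency class.

    observed_counts[i-1] is the number of sites whose derived allele
    appears i times in the sample, so the sample size is len + 1.
    Ratios above 1 indicate an excess, below 1 a deficit.
    """
    n = len(observed_counts) + 1
    total = sum(observed_counts)
    expected = neutral_sfs_proportions(n)
    return [(obs / total) / exp for obs, exp in zip(observed_counts, expected)]


# Toy, made-up unfolded spectrum for a sample of n = 5 sequences
# (not taken from the dataset).
toy_counts = [60, 20, 10, 10]
ratios = sfs_deviation(toy_counts)
for i, r in enumerate(ratios, start=1):
    print(f"derived count {i}: observed/expected = {r:.2f}")
```

In this toy spectrum the singleton class has a ratio above 1 and the intermediate classes have ratios below 1, which is the qualitative pattern the abstract describes as an excess of low-frequency and a paucity of intermediate-frequency variants.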