We considered genome-wide four-fold degenerate sites from an African Drosophila melanogaster population and compared them to short introns. To include divergence and to polarize the data, we used its close relatives Drosophila simulans, Drosophila sechellia, Drosophila erecta and Drosophila yakuba as outgroups. In D. melanogaster, the GC content at four-fold degenerate sites is higher than in short introns; compared to its relatives, more AT than GC is fixed. The former has been explained by codon usage bias (CUB) favouring GC; the latter by decreased intensity of directional selection or by increased mutation bias towards AT. With a biallelic equilibrium model, evidence for directional selection comes mostly from the GC-rich ancestral base composition. Together with a slight mutation bias, it leads to an asymmetry of the unpolarized allele frequency spectrum, from which directional selection is inferred. Using a quasi-equilibrium model and polarized spectra, however, only purifying and no directional selection is detected. Furthermore, polarized spectra are proportional to those of the presumably unselected short introns. As we have no evidence for a decrease in effective population size, relaxed CUB must be due to a reduction in the selection coefficient. Going beyond the biallelic model and considering all four bases, signs of directional selection are stronger. In contrast to short introns, complementary bases show strand specificity and allele frequency spectra depend on mutation directions. Hence, the traditional biallelic model to describe the evolution of four-fold degenerate sites should be replaced by more complex models assuming only quasi-equilibrium and accounting for all four bases.
Sequence alignment of fourfold degenerate sites of the 2L chromosome arm
We analyzed genome-wide fourfold degenerate sites from an African (Malawi) D. melanogaster population (Release 1.0), provided by the Drosophila Population Genomics Project (http://www.dpgp.org/; Langley et al., 2012). To obtain outgroup sequences, we downloaded aligned single genome-wide sequences of D. simulans, D. sechellia, D. erecta and D. yakuba (Begun et al., 2007; Clark et al., 2007) (Release 5) from http://genome.ucsc.edu/ and combined them with the D. melanogaster sequences for all autosomes. We wrote Python and R scripts to extract the data according to the annotation of the D. melanogaster reference genome (Release 5.31) from FlyBase and to perform the analyses. We compared the fourfold degenerate sites to short introns (bases 8 to 30 of introns < 66 bp) from the same dataset (doi:10.5061/dryad.t201q). A detailed description of the short introns is given in Clemente and Vogl (2012).
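The two site classes named in this description can be illustrated with a minimal Python sketch. It is not the authors' extraction script: the function names are hypothetical, the standard genetic code is assumed, the coding sequence is assumed to be in frame, and "bases 8 to 30" is read as 1-based, inclusive coordinates counted from the 5' end of the intron.

```python
# First two bases of the codon families whose third position is fourfold
# degenerate under the standard genetic code (assumption: standard code):
# Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN), Thr (ACN), Ala (GCN),
# Arg (CGN), Gly (GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_site_offsets(cds):
    """Return 0-based offsets of fourfold degenerate third codon positions.

    `cds` is assumed to be an in-frame coding sequence; trailing bases that
    do not form a complete codon are ignored.
    """
    cds = cds.upper()
    offsets = []
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        # Skip codons with ambiguous or missing bases at the third position.
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            offsets.append(start + 2)
    return offsets


def short_intron_core(intron, max_len=66, first=8, last=30):
    """Return bases first..last (1-based, inclusive) of a short intron.

    Introns of length >= max_len are not "short" and yield None. An intron
    shorter than `last` bases yields a correspondingly shorter slice.
    """
    if len(intron) >= max_len:
        return None
    return intron[first - 1:last]


if __name__ == "__main__":
    # Codons: ATG, GCT (Ala), CTG (Leu), TAA
    print(fourfold_site_offsets("ATGGCTCTGTAA"))
    print(short_intron_core("A" * 7 + "C" * 23 + "G" * 35))
```

Identifying fourfold sites by the first two codon bases avoids a full translation table, but it classifies each site against a single sequence only. In a multi-species alignment, a stricter filter would additionally require those first two bases to be identical in all aligned sequences.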
alignment_2L
Sequence alignment of fourfold degenerate sites of the 2R chromosome arm
alignment_2R
Sequence alignment of fourfold degenerate sites of the 3L chromosome arm
alignment_3L
Sequence alignment of fourfold degenerate sites of the 3R chromosome arm
alignment_3R