Meiotic recombination is crucial for chromosomal segregation, and facilitates the spread of beneficial mutations and the removal of deleterious ones. Recombination rates frequently vary along chromosomes, and Drosophila melanogaster exhibits a remarkable pattern: recombination rates gradually decrease towards centromeres and telomeres, with a dramatic impact on levels of variation in natural populations. Two close sister species, D. simulans and D. mauritiana, not only have higher recombination rates but also exhibit a much more homogeneous recombination rate that drops sharply only very close to centromeres and telomeres. Because certain sequence motifs are associated with recombination rate variation in D. melanogaster, we tested whether the difference in recombination landscape between D. melanogaster and D. simulans can be explained by the genomic distribution of recombination-rate-associated sequence motifs. We constructed the first high-resolution recombination map for D. simulans based on 189 haplotypes from a natural D. simulans population, and searched for short sequence motifs linked with higher-than-average recombination in both sister species. We identified five consensus motifs that are significantly associated with higher-than-average chromosome-wide recombination rates in at least one species and are present in both. Testing fine-resolution associations between motif density and recombination, we found strong positive associations genome-wide over a range of scales in D. melanogaster, while the results were equivocal in D. simulans. Despite the strong association in D. melanogaster, we did not find a decreasing density of these short-repeat motifs towards centromeres and telomeres. We conclude that the density of recombination-associated repeat motifs cannot explain the large-scale recombination landscape in D. melanogaster, nor the differences from D. simulans. The strong association seen for the sequence motifs in D. melanogaster likely reflects their influence on local differences in recombination rates along the genome.
1_recombination_maps_csv_ZIP
Recombination map files for D. simulans and D. melanogaster, at different smoothing resolutions. We provide moving median and LOESS smoothed maps. For Dsim, the moving median files are provided at the following resolutions: 5, 25, 101, 501, and 2501 kb. For Dmel, the moving median files are provided at 501 and 2501 kb only, because the initial map has a lower resolution. For Dsim, the LOESS files are provided with the LOESS smoothing parameter set at the following levels: 0.001, 0.005, 0.02, and 0.10. These levels are equivalent to moving median windows of 25, 101, 501, and 2501 kb. For Dmel, the LOESS smoothing parameter is set at 0.02 and 0.10. Finally, a raw unsmoothed recombination map is provided for Dsim in 1 kb windows, and for Dmel in 101 kb windows. Details of how this raw Dsim map was produced are in our paper, Howie et al. 2019. The R script used to produce differently smoothed recombination maps for both species is also provided as an R-Markdown file.
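To illustrate what the moving median smoothing does, here is a minimal Python sketch. It is not the R script distributed with these files. The function name, the example rates, and the rule for truncating the window at the ends of the map are all assumptions made for illustration; the odd window sizes listed above are consistent with a centered window, which is what the sketch uses.

```python
from statistics import median


def moving_median(rates, window):
    """Centered moving median over an odd number of adjacent map windows.

    Near either end of the map the window is truncated to the values that
    exist. This edge rule is an assumption; other rules are possible.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(rates)):
        lo = max(0, i - half)
        hi = min(len(rates), i + half + 1)
        smoothed.append(median(rates[lo:hi]))
    return smoothed


# Made-up rates for seven consecutive 1 kb windows, with a single spike.
raw = [1.0, 1.2, 9.0, 1.1, 0.9, 1.0, 1.3]
smoothed = moving_median(raw, 5)  # 5 windows of 1 kb, i.e. a 5 kb median
print(smoothed)
```

A median, unlike a mean, leaves the smoothed value almost unaffected by an isolated extreme window, which is why the spike in the example does not carry over into the smoothed map.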
2_recombination_maps_mimicrEE_ZIP
Recombination map files for D. simulans and D. melanogaster, at different smoothing resolutions. This set of files is the same as the CSV files, except that they are formatted for use with the MimicrEE simulation software. We also provide “Dsim_recombination_map_LOESS_100kb_1.txt”, in which LOESS smoothing equivalent to 100 kb windows is applied through an alternative R algorithm. The standard files provided are, again, the moving median and LOESS smoothed maps. For Dsim, moving median files are provided at resolutions of 5, 25, 101, 501, and 2501 kb. For Dmel, the moving median files are provided at 501 and 2501 kb only, because the initial map has a lower resolution. For Dsim, the LOESS files use smoothing parameters of 0.001, 0.005, 0.02, and 0.10. These levels are equivalent to moving median windows of 25, 101, 501, and 2501 kb. For Dmel, the LOESS smoothing parameter is set at 0.02 and 0.10. A raw recombination map is provided for Dsim in 1 kb windows and for Dmel in 101 kb windows. Details of how the raw Dsim map was produced can be found in our paper, Howie et al. 2019. The R script used to produce differently smoothed recombination maps is provided as an R-Markdown file.
3_dmel-fimo_ZIP.tsv
Motif density files for D. melanogaster, which are output from FIMO. These can be used to plot motif density along the genome or, in combination with the recombination maps, to test the relationship between recombination rates and motif density. The script is provided in the R Markdown file.
4-dsim-fimo_ZIP.tsv
Motif density files for D. simulans, which are output from FIMO. These can be used to plot motif density along the genome or, in combination with the recombination maps, to test the relationship between recombination rates and motif density. The script is provided in the R Markdown file.
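As a rough illustration of how FIMO output can be turned into a per-window motif density, the Python sketch below counts motif hits in fixed-size windows. It assumes the standard FIMO TSV column names (sequence_name, start, and so on) and 1-based start coordinates; the files in these archives may be laid out differently, so check the headers first. The function name and the example rows are hypothetical, and this is not the R Markdown script referred to above.

```python
import csv
import io
from collections import Counter


def motif_counts_per_window(tsv_text, window_bp):
    """Count motif hits per (sequence, window index) from FIMO-style TSV text.

    Assumes columns named 'sequence_name' and 'start', and that 'start' is
    1-based. Blank lines and '#' comment lines are skipped.
    """
    data_lines = (
        line
        for line in io.StringIO(tsv_text)
        if line.strip() and not line.startswith("#")
    )
    counts = Counter()
    for row in csv.DictReader(data_lines, delimiter="\t"):
        window_index = (int(row["start"]) - 1) // window_bp
        counts[(row["sequence_name"], window_index)] += 1
    return counts


# Hypothetical rows, not taken from the archived files.
header = (
    "motif_id\tmotif_alt_id\tsequence_name\tstart\tstop\tstrand\t"
    "score\tp-value\tq-value\tmatched_sequence\n"
)
example = header + (
    "M1\t\t2L\t150\t157\t+\t12.1\t1e-5\t0.01\tCACACACA\n"
    "M1\t\t2L\t900\t907\t-\t11.4\t2e-5\t0.02\tCACACACA\n"
    "M1\t\t2L\t1001\t1008\t+\t10.9\t3e-5\t0.02\tCACACACA\n"
    "M1\t\tX\t1000\t1007\t+\t12.0\t1e-5\t0.01\tCACACACA\n"
    "\n# FIMO (Find Individual Motif Occurrences)\n"
)
counts = motif_counts_per_window(example, 1000)
print(dict(counts))
```

Choosing the same window size as the recombination map in use keeps the two data sets aligned window by window.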
5_figures+mapfiles
R-Markdown file. This contains the in-house scripts used to process raw data files and run all downstream analyses reported in our paper. The script includes code to smooth recombination maps, generate motif density files, test correlations, and fit linear models to examine the relationship between motif density and recombination rate, at different smoothing scales, in Dmel and Dsim. Details can be found in our paper.
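For readers who want a feel for the correlation and linear model step before opening the R-Markdown file, here is a small Python sketch. It computes a Pearson correlation and an ordinary least-squares line for per-window motif density against recombination rate. The choice of Pearson correlation, the function names, and the numbers are assumptions for illustration only; the statistics actually used are described in the paper and the R script.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares fit y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up values: motif hits per window and recombination rate per window.
motif_density = [2, 4, 6, 8, 10]
recombination_rate = [0.8, 1.9, 3.2, 3.9, 5.2]

r = pearson_r(motif_density, recombination_rate)
slope, intercept = least_squares_line(motif_density, recombination_rate)
print(r, slope, intercept)
```

Repeating the same calculation on maps smoothed at different window sizes is one way to see how the association changes with scale.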
6_PrepareFasta_X
Phased haplotypes (that is, SNPs and their genomic locations) used to produce the Dsim recombination map, for the X chromosome. We crossed focal males from 189 isofemale lines, from Florida, to virgin “reference” females, and then individually sequenced the F1 progeny. From the raw BAM files, we produced phased haplotype files, in which reference alleles are removed and SNPs called. This was done in FreeBayes: segment size 1 kb, alpha 0.05, theta 0.04. Full details are provided in our paper. Here, we provide 189 phased haplotypes for the X chromosome.
PrepareFasta_X.tar.gz
7_PrepareFasta
Phased haplotypes (that is, SNPs and their genomic locations) used to produce the Dsim recombination map, for the main autosome arms 2L, 2R, 3L, and 3R. We crossed focal males from 189 isofemale lines, from Florida, to virgin “reference” females, and then individually sequenced the F1 progeny. From the raw BAM files, we produced phased haplotype files, in which reference alleles are removed and SNPs called. This was done in FreeBayes: segment size 1 kb, alpha 0.05, theta 0.04. Full details are provided in our paper. Here, we provide 189 phased haplotypes for the autosome arms 2L, 2R, 3L, and 3R.
PrepareFasta.tar.gz