Chromosome VCF files and 1Mb recombination rate estimations for: Fine-scale recombination rate variation and association with genomic features in a butterfly
Data files
Mar 31, 2023 version; files total 5.62 GB
-
1-filtered.vcf.gz
-
10-filtered.vcf.gz
-
11-filtered.vcf.gz
-
12-filtered.vcf.gz
-
13-filtered.vcf.gz
-
14-filtered.vcf.gz
-
15-filtered.vcf.gz
-
16-filtered.vcf.gz
-
17-filtered.vcf.gz
-
18-filtered.vcf.gz
-
19-filtered.vcf.gz
-
2-filtered.vcf.gz
-
20-filtered.vcf.gz
-
21-filtered.vcf.gz
-
22-filtered.vcf.gz
-
23-filtered.vcf.gz
-
24-filtered.vcf.gz
-
25-filtered.vcf.gz
-
26-filtered.vcf.gz
-
27-filtered.vcf.gz
-
28-filtered.vcf.gz
-
29-filtered.vcf.gz
-
3-filtered.vcf.gz
-
4-filtered.vcf.gz
-
5-filtered.vcf.gz
-
6-filtered.vcf.gz
-
7-filtered.vcf.gz
-
8-filtered.vcf.gz
-
9-filtered.vcf.gz
-
README.md
-
RR_chr_1.txt
-
RR_chr_10.txt
-
RR_chr_11.txt
-
RR_chr_12.txt
-
RR_chr_13.txt
-
RR_chr_14.txt
-
RR_chr_15.txt
-
RR_chr_16.txt
-
RR_chr_17.txt
-
RR_chr_18.txt
-
RR_chr_19.txt
-
RR_chr_2.txt
-
RR_chr_20.txt
-
RR_chr_21.txt
-
RR_chr_22.txt
-
RR_chr_23.txt
-
RR_chr_24.txt
-
RR_chr_25.txt
-
RR_chr_26.txt
-
RR_chr_27.txt
-
RR_chr_28.txt
-
RR_chr_29.txt
-
RR_chr_3.txt
-
RR_chr_4.txt
-
RR_chr_5.txt
-
RR_chr_6.txt
-
RR_chr_7.txt
-
RR_chr_8.txt
-
RR_chr_9.txt
Abstract
Genetic recombination is a key molecular mechanism that has profound implications for both micro- and macro-evolutionary processes. However, the determinants of recombination rate variation in holocentric organisms are poorly understood, in particular in Lepidoptera (moths and butterflies). The wood white butterfly (Leptidea sinapis) shows considerable intraspecific variation in chromosome numbers and is a suitable system for studying regional recombination rate variation and its potential molecular underpinnings. Here, we developed a large whole-genome resequencing data set from a population of wood whites to obtain high-resolution recombination maps using linkage disequilibrium information. The analyses revealed that larger chromosomes had a bimodal recombination landscape, potentially due to interference between simultaneous chiasmata. The recombination rate was significantly lower in subtelomeric regions, with exceptions associated with segregating chromosome rearrangements, showing that fissions and fusions can have considerable effects on the recombination landscape. There was no association between the inferred recombination rate and base composition, supporting a negligible influence of GC-biased gene conversion in butterflies. We found significant but variable associations between the recombination rate and the density of different classes of transposable elements (TEs), most notably a significant enrichment of SINEs in genomic regions with higher recombination rate. Finally, the analyses unveiled significant enrichment of genes involved in farnesyltranstransferase activity in recombination cold-spots, potentially indicating that expression of transferases can inhibit formation of chiasmata during meiotic division. Our results provide novel information about recombination rate variation in holocentric organisms and have particular implications for forthcoming research in population genetics, molecular/genome evolution and speciation.
Methods
DNA extraction
DNA was extracted following two different protocols. In both cases, the dissected tissue was digested overnight in Laird’s buffer and homogenized with 20 μl of proteinase K (20 mg/ml, >600 mAU/ml), followed by incubation with RNase A at 37°C for 30 minutes. DNA was extracted from thoraces using salt extraction; 300 of NaCl (5 M) was added, followed by centrifugation for 15 minutes at 13,000 revolutions per minute (rpm). Three washing steps were completed, each with one volume of 70% ethanol and centrifugation for five minutes at maximum speed. The remaining pellet was air-dried and then resuspended in 30 μl of MilliQ H2O. For the abdomens, a phenol-chloroform extraction protocol was used. Two cycles of phenol:chloroform:isoamyl alcohol (25:24:1) addition and centrifugation for five minutes at 13,000 rpm were completed, plus a third cleaning cycle using only chloroform. DNA was precipitated by adding 2x volumes isopropanol + 0.1x 3 M NaAc, incubating at -18°C overnight and centrifuging for 15 minutes at 13,000 rpm. The final pellet was resuspended in 30 μl of MilliQ H2O. DNA purity was assessed with NanoDrop, and concentration was measured with Qubit DNA Broad Range.
Sequencing
To capture the genetic variation in the population in Sweden, 84 individuals from different geographic regions and with the highest DNA quality were selected for analysis. Library preparation for all 84 samples with the TruSeq PCR-free kit, multiplexing, and sequencing on two NovaSeq 6000 S4 lanes with 2×150 bp reads were performed at the National Genomics Infrastructure (NGI), Stockholm.
Read trimming
Illumina sequencing adapters were removed by trimming the first fifteen base pairs (bp) from each end of the raw reads with CutAdapt 1.9.1 (Martin 2011). Reads were then filtered on Q-score (< 30) and on a minimum length of 30 bp. Read quality after cleaning was assessed with FastQC (Andrews 2010). Before filtering, an average of 4.3 million reads per sample were obtained, and 2.5% were filtered out.
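The trimming logic described above can be illustrated with a minimal Python sketch. It is not CutAdapt and does not reproduce its algorithm. The function names are hypothetical, Phred+33 quality encoding is assumed, and the Q-score filter is modelled here as removal of low-quality bases from the 3' end, which is an assumption because the text does not state how the threshold was applied.

```python
from typing import List, Optional, Tuple

CLIP = 15      # bases removed from each end (value stated in the text)
MIN_LEN = 30   # minimum read length kept (value stated in the text)
MIN_Q = 30     # Q-score threshold (value stated in the text)


def phred_scores(qual: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (Phred+33 encoding is assumed)."""
    return [ord(char) - offset for char in qual]


def trim_read(seq: str, qual: str) -> Optional[Tuple[str, str]]:
    """Clip both ends, trim a low-quality 3' tail, then apply the length filter.

    Assumption: the quality filter is a simple trailing trim; the real tool
    uses its own quality-trimming algorithm.
    """
    # Remove CLIP bases from the start and the end of the read.
    seq = seq[CLIP:len(seq) - CLIP]
    qual = qual[CLIP:len(qual) - CLIP]

    # Drop trailing bases whose quality is below the threshold.
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < MIN_Q:
        end -= 1
    seq, qual = seq[:end], qual[:end]

    # Discard reads that are too short after trimming.
    if len(seq) < MIN_LEN:
        return None
    return seq, qual
```

With 150 bp reads, a read of uniformly high quality keeps 120 bp after clipping, and any read that falls below 30 bp is dropped.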
Mapping and filtering
For each individual, paired-end reads were mapped to the reference genome with bwa v0.7.17 (Li and Durbin 2009). Samtools v1.10 (Li et al. 2009) was used to select reads with paired information. MarkDuplicatesSpark as implemented in GATK v4.1.4.1 (McKenna et al. 2010) was used to eliminate duplicated regions with the --remove-sequencing-duplicates option.
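Both filtering steps act on bits of the FLAG field in SAM/BAM records, which the following Python sketch illustrates on plain SAM text lines. It does not call Samtools or GATK. The function names are hypothetical, and reading "paired information" as the 0x1 bit (rather than the "properly paired" 0x2 bit) is an assumption, since the exact filter is not stated.

```python
from typing import Iterable, Iterator

PAIRED = 0x1       # SAM FLAG bit: template has multiple segments (paired read)
DUPLICATE = 0x400  # SAM FLAG bit: PCR or optical duplicate


def keep_alignment(flag: int) -> bool:
    """Return True for paired reads that are not marked as duplicates."""
    return bool(flag & PAIRED) and not (flag & DUPLICATE)


def filter_sam_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged and alignment lines that pass the filter."""
    for line in lines:
        if line.startswith("@"):
            # Header lines carry no FLAG and are always kept.
            yield line
            continue
        # FLAG is the second tab-separated column of an alignment line.
        flag = int(line.split("\t")[1])
        if keep_alignment(flag):
            yield line
```

For example, a FLAG of 99 (paired, first in pair) passes, whereas 1123 (the same read marked as a duplicate) does not.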
Recombination rate estimation
Recombination rate was estimated with pyrho and averaged over 1 Mb intervals to assess the association between regional recombination rate and several genomic features.
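As an illustration of the averaging step, the Python sketch below computes a mean rate per fixed-size window from per-interval estimates. It assumes the estimates are available as (start, end, rate) triples with half-open coordinates in bp, and it uses a length-weighted mean. Both are assumptions, because the text does not specify the input layout or the weighting, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def window_means(
    intervals: Iterable[Tuple[int, int, float]],
    window: int = 1_000_000,
) -> Dict[int, float]:
    """Length-weighted mean rate per window, keyed by window start position.

    Each interval is (start, end, rate) with half-open coordinates in bp.
    Intervals that span a window boundary are split between the windows.
    """
    weighted_sum = defaultdict(float)  # sum of rate * covered bases per window
    covered = defaultdict(int)         # number of covered bases per window

    for start, end, rate in intervals:
        pos = start
        while pos < end:
            idx = pos // window
            # The piece ends at the interval end or the window boundary.
            stop = min(end, (idx + 1) * window)
            weighted_sum[idx] += rate * (stop - pos)
            covered[idx] += stop - pos
            pos = stop

    return {idx * window: weighted_sum[idx] / covered[idx] for idx in sorted(covered)}
```

Windows with no covered bases are simply absent from the result rather than reported as zero.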
Usage notes
The files can be opened in any text editor, either with a GUI (such as BBEdit) or on the command line (such as nano). The .vcf.gz files are gzip-compressed and need to be decompressed first.
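The compressed VCF files can also be read programmatically without decompressing them first. The following Python sketch uses only the standard library to list the sample names and count the variant records in one file. The function name is hypothetical, and the sketch assumes a standard VCF layout in which sample columns follow the nine fixed columns of the `#CHROM` header line.

```python
import gzip
from typing import List, Tuple


def summarize_vcf(path: str) -> Tuple[List[str], int]:
    """Return (sample names, number of variant records) for a gzipped VCF."""
    samples: List[str] = []
    n_records = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                # Meta-information lines are skipped.
                continue
            if line.startswith("#CHROM"):
                # Sample names follow the nine fixed VCF columns.
                samples = line.rstrip("\n").split("\t")[9:]
                continue
            if line.strip():
                n_records += 1
    return samples, n_records
```

A call such as `summarize_vcf("1-filtered.vcf.gz")` would return the samples and record count for chromosome 1.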