The role of recombination dynamics in shaping signatures of direct and indirect selection across the Ficedula flycatcher genome
Data files
Jan 10, 2024 version files 49.15 GB
- all_sites.all_non_zscaff.all_filt.missing_rmvd.non_z_scaff.callable_sites.merged.bed (617.03 MB)
- coll.ld_recom_converted.w200k_s200k.chrompos.txt (270.32 KB)
- coll.ld_recom_converted.w200k_s200k.txt (269.70 KB)
- coll.snp_recom.pop_scaled.txt (322.20 MB)
- final_var_sites_vcfs.tar.gz (47.76 GB)
- flycatcher_sample_metadata.txt (7.43 KB)
- README.md (2.82 KB)
- taig.ld_recom_converted.w200k_s200k.chrompos.txt (267.66 KB)
- taig.ld_recom_converted.w200k_s200k.txt (267.03 KB)
- taig.snp_recom.pop_scaled.txt (453.32 MB)
Abstract
Recombination is a central evolutionary process that reshuffles combinations of alleles along chromosomes and, consequently, is expected to influence the efficacy of direct selection via Hill-Robertson interference. Additionally, the indirect effects of selection on neutral genetic diversity are expected to show a negative relationship with recombination rate, as background selection and genetic hitchhiking are stronger when recombination rate is low. However, owing to the limited availability of recombination rate estimates across divergent species, the impact of evolutionary changes in recombination rate on genomic signatures of selection remains largely unexplored. To address this question, we estimate recombination rate in two Ficedula flycatcher species, the taiga flycatcher (F. albicilla) and collared flycatcher (F. albicollis). We show that recombination rate is strongly correlated with signatures of indirect selection and that evolutionary changes in recombination rate between species have observable impacts on this relationship. Conversely, signatures of direct selection on coding sequences show little to no relationship with recombination rate, even when restricted to genes where recombination rate is conserved between species. Thus, using measures of indirect and direct selection that bridge micro- and macro-evolutionary timescales, we demonstrate that the role of recombination rate and its dynamics varies for different signatures of selection.
README: The role of recombination dynamics in shaping signatures of direct and indirect selection across the Ficedula flycatcher genome
https://doi.org/10.5061/dryad.q2bvq83nw
Description of the data and file structure
This repository contains variant call data for autosomal scaffolds for five species of Ficedula flycatcher, and inferred recombination rates for two of the species. The variant call data are in VCF format including only single nucleotide variants, mapped to scaffolds of the collared flycatcher reference genome (v. FicAlb1.5). The data have been generated using GATK HaplotypeCaller followed by GenotypeGVCFs. Variants have been filtered for a minimum genotype quality of 30, minimum depth of 5x and maximum depth of 200x. Repeats and collapsed duplications are masked.
VCF file name:
final_var_sites_vcfs.tar.gz
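The genotype thresholds described above can be illustrated with a minimal Python sketch. It assumes the thresholds are applied to the per-sample GQ and DP fields of each VCF record, which this README does not state explicitly. The function name and the example values are hypothetical, and the sketch is not the authors' filtering pipeline.

```python
def passes_genotype_filters(format_field, sample_field,
                            min_gq=30, min_dp=5, max_dp=200):
    """Return True if one VCF sample entry meets the GQ and DP thresholds.

    format_field: the FORMAT column of a VCF record, e.g. "GT:AD:DP:GQ:PL"
    sample_field: the matching per-sample column, e.g. "0/1:5,6:11:99:..."
    Assumption: thresholds apply per genotype (GQ) and per-sample depth (DP).
    """
    entry = dict(zip(format_field.split(":"), sample_field.split(":")))
    try:
        gq = int(entry["GQ"])
        dp = int(entry["DP"])
    except (KeyError, ValueError):
        # Missing fields or "." placeholders are treated as failing.
        return False
    return gq >= min_gq and min_dp <= dp <= max_dp


# Hypothetical sample entries, not taken from the data set.
print(passes_genotype_filters("GT:AD:DP:GQ:PL", "0/1:5,6:11:99:255,0,255"))
print(passes_genotype_filters("GT:AD:DP:GQ:PL", "0/0:3,0:3:9:0,9,90"))
```

Because the distributed VCFs are already filtered, a check like this is mainly useful for confirming the thresholds on a subset of records.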
In addition to the VCFs, callable regions of the genome are given in BED format. These regions were derived from an all-sites VCF (including monomorphic sites) that was filtered in the same way as described above for variable sites.
Callable regions file name:
all_sites.all_non_zscaff.all_filt.missing_rmvd.non_z_scaff.callable_sites.merged.bed
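As a rough illustration of how the callable-regions file might be used, the sketch below sums callable bases per scaffold. It assumes standard BED conventions (tab- or space-separated columns of name, 0-based start, and exclusive end) and that the intervals do not overlap, as the "merged" in the file name suggests. The scaffold names in the example are hypothetical.

```python
def callable_length_by_scaffold(bed_lines):
    """Sum interval lengths (end - start) per scaffold from BED-formatted lines."""
    totals = {}
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track ", "browser ")):
            continue
        scaffold, start, end = line.split()[:3]
        totals[scaffold] = totals.get(scaffold, 0) + int(end) - int(start)
    return totals


# Hypothetical intervals; for the real file, pass an open file handle instead.
example = [
    "scaffold_A\t0\t100",
    "scaffold_A\t150\t400",
    "scaffold_B\t10\t20",
]
print(callable_length_by_scaffold(example))
```

Per-scaffold totals like these can serve as denominators when computing per-site statistics such as nucleotide diversity from the variant-only VCFs.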
Recombination rates estimated with LDhelmet are given between adjacent SNP pairs and summarized in 200 kb windows for collared flycatcher and taiga flycatcher. SNP-pair rates are population-scaled and give the average value across 5 independent runs of LDhelmet, separately for each of two runs of statistical phasing. The window-based estimates are rescaled to cM/Mb using the collared flycatcher pedigree-based recombination map and have been filtered for extreme outliers. Window-based estimates are given both mapped to scaffolds and translated to chromosome positions.
SNP-pair recombination rate files:
taig.snp_recom.pop_scaled.txt
coll.snp_recom.pop_scaled.txt
Scaffold level recombination rate files:
taig.ld_recom_converted.w200k_s200k.txt
coll.ld_recom_converted.w200k_s200k.txt
Chromosome level recombination rate files:
taig.ld_recom_converted.w200k_s200k.chrompos.txt
coll.ld_recom_converted.w200k_s200k.chrompos.txt
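To make the relationship between SNP-pair rates and window-based estimates more concrete, the sketch below computes a length-weighted mean rate in non-overlapping 200 kb windows. This is a generic approach and an assumption, not the authors' documented procedure: the README does not describe the column layout of the rate files or how SNP-pair rates were summarized, and the rescaling to cM/Mb is not shown. The function name, input tuples, and example values are hypothetical.

```python
from collections import defaultdict

WINDOW = 200_000  # window size in bp; non-overlapping windows are assumed


def windowed_mean_rate(intervals, window=WINDOW):
    """Length-weighted mean rate per window.

    intervals: iterable of (start, end, rate), where rate is a per-bp rate
    for the interval between two adjacent SNPs (hypothetical layout).
    Returns {window_start: mean_rate}; intervals that span a window
    boundary are split between the windows they overlap.
    """
    weighted = defaultdict(float)  # sum of rate * covered length
    covered = defaultdict(int)     # covered length per window
    for start, end, rate in intervals:
        pos = start
        while pos < end:
            win_start = (pos // window) * window
            seg_end = min(end, win_start + window)
            length = seg_end - pos
            weighted[win_start] += rate * length
            covered[win_start] += length
            pos = seg_end
    return {w: weighted[w] / covered[w] for w in sorted(weighted)}


# Hypothetical SNP-pair intervals on a single scaffold.
pairs = [(0, 100_000, 1.0), (100_000, 300_000, 3.0)]
print(windowed_mean_rate(pairs))
```

Weighting by interval length keeps densely sampled stretches from dominating a window's mean; an unweighted mean over SNP pairs would be a different, equally hypothetical choice.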
Sample metadata are provided in flycatcher_sample_metadata.txt, giving the sample ID as written in the VCF file, species, sex, mapping percentage, and estimated sequencing coverage for each sample.
Sharing/Access information
The sequencing data from which the VCFs are derived are publicly available in the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) under the following accession numbers:
- Collared flycatcher: PRJEB22864
- Pied flycatcher and snowy-browed flycatcher: PRJEB7359
- Red-breasted flycatcher and taiga flycatcher: PRJEB43825
Code/Software
Scripts used for the analyses are available on GitHub (https://github.com/madeline-chase/flycatcher_recom).
Methods
Variant call data are provided for five species of Ficedula flycatcher from whole-genome re-sequencing data. Variant calling was performed with GATK v4. All-sites variant calling was performed, and the callable sites determined from it are provided as a BED file. VCF files are provided for variant sites only.
Recombination rate data for taiga flycatcher and collared flycatcher are provided, estimated from patterns of linkage disequilibrium using LDhelmet and converted to cM/Mb based on the collared flycatcher linkage map.