Detecting selected haplotype blocks in evolve and resequence experiments
Data files
Jan 22, 2021 version files: 77.47 GB
- haplovalidate.tar.gz
Abstract
Shifting from the analysis of single nucleotide polymorphisms (SNPs) to the reconstruction of selected haplotypes greatly facilitates the interpretation of Evolve and Resequence (E&R) experiments. Merging highly correlated hitchhiker SNPs into haplotype blocks reduces thousands of candidates to a few selected regions. Current methods of haplotype reconstruction from Pool-Seq data need a variety of data-specific parameters that are typically defined ad hoc, and they require haplotype sequences for validation. Here, we introduce haplovalidate, a tool that detects selected haplotypes in Pool-Seq time series data without the need for sequenced haplotypes. Haplovalidate makes data-driven choices of two key parameters for the clustering procedure: the minimum correlation between SNPs constituting a cluster and the window size. Applying haplovalidate to simulated and experimental E&R data reliably detects selected haplotype blocks with low false discovery rates. Importantly, our analyses identified a limitation of the haplotype block-based approach for describing the genomic architecture of adaptation. We detected a substantial fraction of haplotypes containing multiple selection targets. These blocks were considered as one region of selection and therefore led to underestimation of the number of selection targets. We demonstrate that the separate analysis of earlier time points can significantly increase the separation of selection targets into individual haplotype blocks. We conclude that the analysis of selected haplotype blocks has great potential for the characterisation of the adaptive architecture with E&R experiments.
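To make the roles of the two clustering parameters concrete, the following is a minimal Python sketch of grouping SNPs by the correlation of their allele frequency trajectories. It is not haplovalidate's implementation: the function name `cluster_correlated_snps`, the fixed defaults for `min_corr` and `window_size`, and the single-linkage grouping are all assumptions made for illustration, whereas haplovalidate chooses both parameters from the data.

```python
import numpy as np

def cluster_correlated_snps(positions, freqs, min_corr=0.8, window_size=1_000_000):
    """Group SNPs with correlated allele-frequency trajectories (illustrative only).

    positions:   1-D sequence of SNP coordinates on one chromosome.
    freqs:       2-D array, one row per SNP, one column per time point/replicate.
    min_corr:    hypothetical fixed threshold; haplovalidate derives it from the data.
    window_size: hypothetical maximum distance (bp) between two linked SNPs.
    Returns a list of clusters, each a list of SNP row indices.
    """
    positions = np.asarray(positions)
    freqs = np.asarray(freqs, dtype=float)
    n = len(positions)
    parent = list(range(n))  # union-find forest, one node per SNP

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pearson correlation between every pair of SNP trajectories (rows).
    corr = np.corrcoef(freqs)

    for i in range(n):
        for j in range(i + 1, n):
            close = abs(positions[j] - positions[i]) <= window_size
            if close and corr[i, j] >= min_corr:
                parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    # Report only groups of at least two SNPs as candidate blocks.
    return [members for members in clusters.values() if len(members) > 1]
```

In this sketch, raising `min_corr` splits blocks into smaller, more tightly correlated groups, while a larger `window_size` lets distant SNPs join the same block. That trade-off is why a data-driven choice of both values matters.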