Lineage-specific adaptation to climate involves flowering time in North American Arabidopsis lyrata
Data files
Jan 13, 2020 version files 78.51 MB
- Walden_Lyrata_Genelength.txt
- Walden_Lyrata_GOenrichment.xlsx
- Walden_Lyrata_GWASresults.bed
- Walden_Lyrata_Topoutliers.xlsx
Abstract
Adaptation to local climatic conditions is commonly found within species, but whether it involves the same intraspecific genomic variants is unknown. We studied this question in North American Arabidopsis lyrata, whose current distribution is shaped by post-glacial range expansion from two refugia, resulting in two distinct genetic clusters covering comparable climatic gradients. Using whole-genome pooled sequence data of 41 outcrossing populations, we identified loci associated with three niche-determining climatic variables in the two clusters and compared these outliers. Little evidence was found for parallelism in climate adaptation for single nucleotide polymorphisms (SNPs) and for genes with an accumulation of outlier SNPs. Significantly increased selection coefficients supported these outliers as candidates of climate adaptation. However, the fraction of gene ontology (GO) terms shared between clusters was higher than the fraction of shared outlier SNPs and outlier genes, suggesting that selection acts on similar pathways but not necessarily the same genes. Enriched GO terms involved responses to abiotic and biotic stress, circadian rhythm and development, with flower development and reproduction being among the most frequently detected. In line with GO enrichment, regulators of flowering time were detected as outlier genes. Our results suggest that while adaptation to environmental gradients on the genomic level is lineage-specific in A. lyrata, similar biological processes seem to be involved. Differential loss of standing genetic variation – likely driven by genetic drift – can in part account for the lack of parallel evolution on the genomic level.
Methods
Details of data collection and processing are given in the associated publication. In short, this dataset consists of four files:
(1) Walden_Lyrata_GWASresults.bed: Results of six GWAS analyses (the eastern and western genetic clusters of North American Arabidopsis lyrata subsp. lyrata, with three climatic variables each: minimum temperature in early spring, precipitation of the wettest quarter, and moisture availability). The table contains genomic coordinates (BED file format) and the Bayes factors obtained for each combination of climatic variable and genetic cluster using the software BayPass.
(2) Walden_Lyrata_GOenrichment.xlsx: An Excel sheet listing outlier gene ontology (GO) terms obtained using the SNP2GO and REVIGO software. The REVIGO output, including false discovery rates and p-values, is given.
(3) Walden_Lyrata_Topoutliers.xlsx: An Excel sheet listing outlier genes obtained following the method described by Yeaman et al. (2016, Science), including test statistics.
(4) Walden_Lyrata_Genelength.txt: A table containing all genes for which SNPs were detected in the dataset, including SNP content and gene length.
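As a rough illustration of how the GWAS results in file (1) might be explored, the sketch below reads a tab-delimited BED-like table and lists SNPs whose Bayes factor exceeds a threshold in both genetic clusters for one climatic variable. The column order and the column names (`east_tmin`, `west_tmin`, and so on), the absence of a header row, and the threshold value are all assumptions made for this example; check the actual file layout and the publication before relying on any of them. The toy rows are made up and are not taken from the dataset.

```python
import csv
import io

# ASSUMPTION: columns 1-3 are the standard BED fields (chrom, start, end),
# followed by six Bayes-factor columns. These names and their order are
# hypothetical and must be checked against the real file.
BF_COLUMNS = [
    "east_tmin", "east_precip", "east_moisture",
    "west_tmin", "west_precip", "west_moisture",
]


def read_bayes_factors(handle, bf_columns=BF_COLUMNS):
    """Yield one dict per SNP from a headerless, tab-delimited BED-like table."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        record = {"chrom": row[0], "start": int(row[1]), "end": int(row[2])}
        values = [float(v) for v in row[3:3 + len(bf_columns)]]
        record.update(zip(bf_columns, values))
        yield record


def shared_outliers(records, col_a, col_b, threshold):
    """Return coordinates of SNPs whose Bayes factor exceeds the threshold in both columns."""
    return [
        (r["chrom"], r["start"], r["end"])
        for r in records
        if r[col_a] > threshold and r[col_b] > threshold
    ]


# Made-up toy rows, for illustration only (not real results).
TOY_BED = (
    "scaffold_1\t100\t101\t25.0\t1.0\t0.5\t22.0\t0.2\t0.1\n"
    "scaffold_1\t200\t201\t30.0\t2.0\t0.1\t3.0\t0.4\t0.3\n"
    "scaffold_2\t50\t51\t2.0\t0.3\t0.2\t40.0\t0.1\t0.6\n"
)

records = list(read_bayes_factors(io.StringIO(TOY_BED)))
# The threshold of 20.0 is an arbitrary illustrative value.
print(shared_outliers(records, "east_tmin", "west_tmin", 20.0))
```

To run this against the real data, replace `io.StringIO(TOY_BED)` with an open file handle, for example `open("Walden_Lyrata_GWASresults.bed", newline="")`, after adjusting `BF_COLUMNS` to match the file.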