Forecasting climate change response in an alpine specialist songbird reveals the importance of considering novel climate
DeSaix, Matthew et al. (2022), Forecasting climate change response in an alpine specialist songbird reveals the importance of considering novel climate, Dryad, Dataset, https://doi.org/10.5061/dryad.stqjq2c4r
Species persistence in the face of climate change depends on both ecological and evolutionary factors. Here, we integrate ecological and whole-genome sequencing data to describe how populations of an alpine specialist, the Brown-capped Rosy-Finch (Leucosticte australis) may be impacted by climate change. We sampled 116 Brown-capped Rosy-Finches from 11 sampling locations across the breeding range. Using 429,442 genetic markers from whole-genome sequencing, we described population genetic structure and identified a subset of 436 genomic variants associated with environmental data. We modelled future climate change impacts on habitat suitability using ecological niche models (ENMs) and impacts on putative local adaptation using gradient forest models (a genetic-environment association analysis; GEA). We used the metric of niche margin index (NMI) to determine regions of forecasting uncertainty due to climate shifts to novel conditions. Population genetic structure was characterized by weak genetic differentiation, indicating potential ongoing gene flow among populations. Precipitation as snow had high importance for both habitat suitability and changes in genetic variation across the landscape. Comparing ENM and gradient forest models with future climate predicted suitable habitat contracting at high elevations and population allele frequencies across the breeding range needing to shift to keep pace with climate change. NMI revealed large portions of the breeding range shifting to novel climate conditions. Our study demonstrates that forecasting climate vulnerability from ecological and evolutionary factors reveals insights into population-level vulnerability to climate change that are obfuscated when either approach is considered independently. For the Brown-capped Rosy-Finch, our results suggest that persistence may depend on rapid adaptation to novel climate conditions in a contracted breeding range. Importantly, we demonstrate the need to characterize novel climate conditions that influence uncertainty in forecasting methods.
We sequenced feather and blood samples from 116 individuals spanning 11 sites across the Brown-capped Rosy-Finch breeding distribution (Table 1). Samples were collected during the breeding season of 2017 and 2018. Individuals from the Lost Man Lake and Independence Lake sites were combined as a single sampling unit for subsequent analyses based on their proximity (< 1 km) and the low sample sizes (5 and 1 individuals, respectively). Engineer Mountain and Horseshoe Basin sites were also in close proximity (< 5 km), but we retained them as separate sampling units due to the larger number of individuals per site (8 and 18 individuals, respectively).
We extracted DNA from blood samples using the standard protocol for Qiagen DNEasy Blood and Tissue Kits and we modified the protocol to maximize DNA yield from feathers. Whole genome sequencing libraries were prepared following modifications of Illumina’s Nextra Library Preparation protocol. Pooled libraries were sequenced on HiSeq 4000 lanes at Novogene Corporation Inc. All sequence data were quality filtered (GATK: McKenna et al., 2010; BCFtools: Li, 2011; Samtools: Li et al., 2009) and aligned (Burrows-Wheeler Aligner software; Li & Durbin, 2009) to a high-quality Brown-capped Rosy-Finch reference genome that was created by Dovetail Genomics through 10x de novo assembly and HiRise Scaffolding. The reference genome was created from liver samples of the Brown-capped Rosy-Finch (Denver Museum of Nature and Science samples DMNS52416 and DMNS52417). The reference genome was annotated with the most recent zebra finch annotations available (NCBI GCA_008822105.2) using the program Liftoff (Shumate & Salzberg, 2021). For the input into all subsequent analyses, we extracted high-quality single-nucleotide polymorphisms (SNPs; Supporting information).
We trimmed the sequence data to remove potential PCR artifacts using the program TrimGalore version 0.6.5 (https://github.com/FelixKrueger/TrimGalore), a wrapper for Cutadapt (Martin, 2011). PCR duplicates need to be removed to ensure high-quality sequence data in downstream processes, such as creating scaffolds in whole-genome sequencing. We used the Burrows-Wheeler Aligner software version 0.7.17 (Li & Durbin, 2009) to map reads to the reference genome. After mapping, the resulting SAM files were sorted, converted to BAM files, and then indexed using Samtools version 1.9 (Li et al., 2009). We marked read duplicates with MarkDuplicatesSpark from GATK version 220.127.116.11 (McKenna et al., 2010) and overlapping reads with the clipOverlap function from bamUtil (https://genome.sph.umich.edu/wiki/BamUtil:_clipOverlap). Depth of sequencing coverage and reference genome mapping statistics were calculated in samtools (Li, 2011). Mean and range of depth of coverage of individuals for each sampling site were summarized to determine if there was sequencing bias among sites.
To account for read errors from sequencing machines we performed base quality score recalibration. Base quality score recalibration involved identifying high confidence variants by taking the intersection of single-nucleotide polymorphism (SNP) data sets identified by two variant calling algorithms (GATK; McKenna et al., 2010; Samtools; Li & Durbin, 2009). We used the resulting set of SNPs as the reference data for recalibrating all BAM files with the BaseRecalibrator and ApplyBQSR functions in GATK (McKenna et al., 2010). Finally, the base score recalibrated BAM files were used to create VCF files with the HaplotypeCaller function of GATK (McKenna et al., 2010; Poplin et al., 2017). In order to facilitate faster computation time, BAM files were processed in approximately 1.5 MB intervals and the resulting VCF files were combined with the GatherVCFs function in GATK. This provided a single VCF file of a master set of genomic sites to be filtered appropriately for the subsequent analyses. All indexing of VCF files was performed with the index function in BCFtools version 1.9 (Li, 2011).
For the input into these analyses, we extracted high-quality single nucleotide polymorphisms (SNPs) from the master VCF file by filtering with the view function in BCFtools (Li, 2011). Specifically, we filtered for biallelic sites (-m 2 -M 2), minor allele frequency of at least 0.1 (--min-af 0.1, --max-af 0.9), sequencing quality score of at least 30 (‘QUAL > 30’) and were missing from less than 10% of the individuals sampled ('F_MISSING < 0.1'). To remove paralogs, we filtered variants by minimum and maximum depth thresholds (-i 'AVG(FORMAT/DP)>5.4 & AVG(FORMAT/DP)<12'). The minimum threshold was set as the mean depth of the sequencing data multiplied by the minimum number of individuals without missing data detailed in the previous step, and the maximum being twice the mean depth of the sequencing data multiplied by the total number of individuals. To remove linked SNPs, variants were pruned by linkage disequilibrium (r > 0.2) using Plink2.0 (Purcell et al., 2007). We filtered the data set to only include non-sex-linked autosomal variation by removing variants with pairwise FST > 0.01 for male and female birds.
See README file for details.
National Geographic, Award: WW-202R-17
National Science Foundation, Award: NSF-1942313