Understanding landscape processes driving patterns of population genetic differentiation and diversity has been a long-standing focus of ecology and evolutionary biology. Gene flow may be reduced by historical, ecological or geographic factors, resulting in patterns of isolation by distance (IBD) or isolation by environment (IBE). Although IBE has been found in many natural systems, most studies investigating patterns of IBD and IBE in nature have used anonymous neutral genetic markers, precluding inference of selection mechanisms or identification of genes potentially under selection. Using landscape genomics, the simultaneous study of genomic and ecological landscapes, we investigated the processes driving population genetic patterns of White-breasted Nuthatches (Sitta carolinensis) in sky islands (montane forest habitat islands) of the Madrean Archipelago. Using more than 4000 single nucleotide polymorphisms and multiple tests to investigate the relationship between genetic differentiation and geographic or ecological distance, we identified IBE, and a lack of IBD, among sky island populations of S. carolinensis. Using three tests to identify selection, we found 79 loci putatively under selection; of these, seven matched CDS regions in the Zebra Finch. The loci under selection were highly associated with climate extremes (maximum temperature of warmest month and minimum precipitation of driest month). These results provide evidence for IBE – disentangled from IBD – in sky island vertebrates and identify potential adaptive genetic variation.
Sitta_FASTQ
FASTQ sequences for all individuals used in the study (part 1/3).
Sitta_FASTQ2
FASTQ sequences for all individuals used in the study (part 2/3).
Sitta_FASTQ3
FASTQ sequences for all individuals used in the study (part 3/3).
bayenv2 files
BayEnv2 input and output files. sitta4.env and sitta7.env are environment files for maximum temperature of the warmest month and minimum precipitation of the driest month for each population. bf_environ files are the output of the program. Also included are the SNP frequency matrix, a file linking SNPs to their locus names, and the covariance matrix calculated in BayEnv2.
bayenv2.zip
bayescan files
Input file and output from BayeScan analyses. Also included is a file associating SNPs with their locus names.
bayescan.zip
BEDASSLE files
Input and output BEDASSLE files. Genetic data for each population are in the allele.counts and sample.sizes files. Environmental data matrices are located in the .csv files. Output is in R object files. Included are two replicates of output using the PCA-based environmental data (Sitta1_BB_MCMC and Sitta2_BB_MCMC files) and two replicates using temperature and precipitation data as environmental data (Sitta_66_2var files).
BEDASSLE.zip
FASTA_BLAST
FASTA file of consensus sequences containing all SNPs used in the study and BLAST+ results of these loci.
LFMM files
Input used in LFMM analyses including genetic (sitta.lfmm) and environmental (sitta.lfmm.env) data for each individual.
LFMM.zip
Model_Points
Training and testing points used for ecological niche modeling.
Sitta_Data
Characteristics of sky islands, including area of forest cover, distance to the niche centroid, distance to Mogollon Rim and Sierra Madre Occidental, and distance to nearest mainland.
Pairwise comparisons
Pairwise comparisons of geographic (distance.km) and environmental (distance.e) distances between populations and genetic differentiation (FST) between populations. Genetic differentiation is based on STACKS output using eight different settings. m1, m5, m10, and m15 indicate statistics obtained when varying the minimum read depth required for a locus to be included in the dataset, with a minimum of two individuals per population. limited and 90% indicate settings in which more restrictive sampling was required for SNPs to be included (as described in the manuscript). select and no_select indicate genetic differentiation when only loci under selection are included or when loci under selection are excluded, respectively.
distance_all.csv
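A minimal sketch of how a pairwise table of this kind might be explored, assuming one row per population pair. The column name fst is a hypothetical stand-in (the actual file holds FST values for eight settings, and its exact headers are not stated here), and the rows below are made-up placeholders, not values from the study. The sketch computes the correlation of FST with environmental distance after controlling for geographic distance. Because pairwise distances are not independent, the coefficient alone says nothing about significance; that requires a permutation approach such as a partial Mantel test, which is not shown.

```python
import csv
import io
import math

# Placeholder rows for illustration only; NOT data from the study.
# "fst" is an assumed column name; distance.km and distance.e follow the description.
SAMPLE = """distance.km,distance.e,fst
10,0.5,0.02
50,2.0,0.06
120,1.0,0.03
200,3.5,0.10
310,2.5,0.07
"""


def load_columns(handle, names):
    """Read the named numeric columns from a CSV file handle into lists."""
    columns = {name: [] for name in names}
    for row in csv.DictReader(handle):
        for name in names:
            columns[name].append(float(row[name]))
    return columns


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


cols = load_columns(io.StringIO(SAMPLE), ["distance.km", "distance.e", "fst"])
r_env_given_geo = partial_corr(cols["fst"], cols["distance.e"], cols["distance.km"])
r_geo_given_env = partial_corr(cols["fst"], cols["distance.km"], cols["distance.e"])
print(f"FST ~ environment | geography: {r_env_given_geo:.3f}")
print(f"FST ~ geography | environment: {r_geo_given_env:.3f}")
```

To run this against the real file, replace io.StringIO(SAMPLE) with an open handle to distance_all.csv and substitute the actual FST column header for the setting of interest.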
STACKS_output
STACKS output. Included are the catalog.tags file, a population map, a whitelist of loci used in the main analyses, and output of the populations module for eight settings (as described under Pairwise comparisons). Each directory includes an FST summary, summary statistics per locus and overall, and all SNP data (in STRUCTURE format).