Data from: Isolation by environment in white-breasted nuthatches (Sitta carolinensis) of the Madrean Archipelago sky islands: a landscape genomics approach
Manthey, Joseph D., University of Kansas
Moyle, Robert G., University of Kansas
Published Jun 02, 2015 on Dryad.
Cite this dataset
Manthey, Joseph D.; Moyle, Robert G. (2015). Data from: Isolation by environment in white-breasted nuthatches (Sitta carolinensis) of the Madrean Archipelago sky islands: a landscape genomics approach [Dataset]. Dryad. https://doi.org/10.5061/dryad.8hm4p
Understanding landscape processes driving patterns of population genetic differentiation and diversity has been a long-standing focus of ecology and evolutionary biology. Gene flow may be reduced by historical, ecological or geographic factors, resulting in patterns of isolation by distance (IBD) or isolation by environment (IBE). Although IBE has been found in many natural systems, most studies investigating patterns of IBD and IBE in nature have used anonymous neutral genetic markers, precluding inference of selection mechanisms or identification of genes potentially under selection. Using landscape genomics, the simultaneous study of genomic and ecological landscapes, we investigated the processes driving population genetic patterns of White-breasted Nuthatches (Sitta carolinensis) in sky islands (montane forest habitat islands) of the Madrean Archipelago. Using more than 4000 single nucleotide polymorphisms and multiple tests to investigate the relationship between genetic differentiation and geographic or ecological distance, we identified IBE, and a lack of IBD, among sky island populations of S. carolinensis. Using three tests to identify selection, we found 79 loci putatively under selection; of these, seven matched CDS regions in the Zebra Finch. The loci under selection were highly associated with climate extremes (maximum temperature of warmest month and minimum precipitation of driest month). These results provide evidence for IBE – disentangled from IBD – in sky island vertebrates and identify potential adaptive genetic variation.
FASTQ sequences for all individuals used in the study (part 1/3).
FASTQ sequences for all individuals used in the study (part 2/3).
FASTQ sequences for all individuals used in the study (part 3/3).
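For readers who want a quick look at the sequence files before running a full pipeline, the following is a minimal sketch of a FASTQ reader. It assumes the standard four-line record layout (header, sequence, separator, quality) and uncompressed text input; the function name `read_fastq` and the toy reads are illustrative and are not part of the dataset.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if header == "":          # end of file
            return
        if header.strip() == "":  # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header.strip())
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch in: " + header.strip())
        yield header[1:].strip(), sequence, quality


# Toy input (made up for illustration, not taken from the deposited files).
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+\n!!\n")
records = list(read_fastq(example))
print(len(records), "records parsed")
```

For a real file, the same function can be applied to an opened file handle, for example one obtained from `open(path)` or, for gzip-compressed files, `gzip.open(path, "rt")`.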
BayEnv2 input and output files. sitta4.env and sitta7.env are the environment files holding maximum temperature of the warmest month and minimum precipitation of the driest month for each population. The bf_environ files are the output of the program. Also included are the SNP frequency matrix, a file linking SNPs to their locus names, and the covariance matrix calculated in BayEnv2.
Input file and output from BayeScan analyses. Also included is a file associating SNPs with their locus names.
Input and output BEDASSLE files. Genetic data for each population are in the allele.counts and sample.sizes files. Environmental data matrices are located in the .csv files. Output is in R object files. Included are two replicates of output using the PCA-based environmental data (Sitta1_BB_MCMC and Sitta2_BB_MCMC files) and two replicates using temperature and precipitation data as environmental data (Sitta_66_2var files).
FASTA file of consensus sequences containing all SNPs used in the study and BLAST+ results of these loci.
Input used in LFMM analyses including genetic (sitta.lfmm) and environmental (sitta.lfmm.env) data for each individual.
Training and testing points used for ecological niche modeling.
Characteristics of sky islands, including area of forest cover, distance to the niche centroid, distance to Mogollon Rim and Sierra Madre Occidental, and distance to nearest mainland.
Pairwise comparisons of geographic (distance.km) and environmental (distance.e) distances between populations, together with genetic differentiation (FST) between populations. Genetic differentiation is based on STACKS output using eight different settings. m1, m5, m10, and m15 denote results for different minimum read depths required for a locus to be included in the dataset, with a minimum of two individuals per population. limited and 90% denote datasets with more restrictive sampling requirements for SNPs to be included (as described in the manuscript). select and no_select denote genetic differentiation calculated from only the loci under selection or from all loci except those under selection, respectively.
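One common way to relate pairwise distances to pairwise FST is a Mantel permutation test. The sketch below is a generic illustration under stated assumptions and is not necessarily one of the tests used in the manuscript: it assumes the pairwise values have already been arranged into square symmetric matrices, and the function names and toy numbers are made up for the example.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def upper_triangle(matrix: Matrix) -> List[float]:
    """Return the values above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


def mantel(a: Matrix, b: Matrix, permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """One-sided Mantel test: returns (observed r, permutation p-value)."""
    n = len(a)
    b_values = upper_triangle(b)
    observed = pearson(upper_triangle(a), b_values)
    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_large = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of `a` together to keep it a valid distance matrix.
        shuffled = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(shuffled), b_values) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


# Toy matrices for four hypothetical populations (values are invented).
distance_e = [[0.0, 1.0, 2.0, 4.0],
              [1.0, 0.0, 1.5, 3.0],
              [2.0, 1.5, 0.0, 2.5],
              [4.0, 3.0, 2.5, 0.0]]
fst = [[0.00, 0.02, 0.05, 0.09],
       [0.02, 0.00, 0.03, 0.07],
       [0.05, 0.03, 0.00, 0.06],
       [0.09, 0.07, 0.06, 0.00]]
r, p = mantel(distance_e, fst)
print("Mantel r =", round(r, 3), "p =", round(p, 3))
```

With only a handful of populations the number of distinct permutations is small, so the p-value from such a test has limited resolution.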
STACKS output. Included are the catalog.tags file, a population map, a whitelist of loci used in the main analyses, and output of the populations module for eight settings (as described for the pairwise comparisons data file). Each directory includes an FST summary, summary statistics per locus and overall, and all SNP data (in STRUCTURE format).