Identifying the genetic basis of phenotypic variation and its relationship with the environment is key to understanding how local adaptations evolve. Such patterns are especially interesting among populations distributed across habitat gradients, where genetic structure can be driven by isolation by distance (IBD) and/or isolation by environment (IBE). Here, we used variation in ~1,600 high-quality SNPs derived from paired-end sequencing of double-digest restriction site-associated DNA (ddRAD-Seq) to test hypotheses related to IBD and IBE in the Yucatan jay (Cyanocorax yucatanicus), a tropical bird endemic to the Yucatán Peninsula. This peninsula is characterized by a precipitation and vegetation gradient (from dry to evergreen tropical forests) that is associated with morphological variation in this species. We found a moderate level of nucleotide diversity (π = 0.008) and little evidence for genetic differentiation among vegetation types. Analyses of neutral and putatively adaptive SNPs (identified by complementary genome-scan approaches) indicate that IBD is the most reliable explanation for the frequency distribution of the former, while IBE has to be invoked to explain that of the latter. These results suggest that selective factors acting along a vegetation gradient can promote local adaptation in the presence of gene flow in a vagile, nonmigratory and geographically restricted species. The putative candidate SNPs identified here are located within or linked to a variety of genes that represent ideal targets for future genomic surveys.
Cy_FASTQ
FASTQ sequences of the 38 individuals used, each with its corresponding sequencing barcode. See the file "Cy_GENERAL-DATA-ID-GEO-CATALOG.txt" for general information on the samples.
Cy_samplesFASTQ.tar.gz
Cy_GENERAL-DATA-ID-GEO-CATALOG
General information on the 38 samples: catalog voucher from the National Ornithology Collection (UNAM, Mexico), sample ID (sequencing barcode), geographic position (longitude and latitude), the corresponding operative geographic unit (OGU), vegetation type and number of the sampling point (see Figure 1 in the article).
Cy_Bioclim-Morpho
Bioclimatic and morphometric data for the 38 samples. The table lists the sample ID (sequencing barcode), the corresponding operative geographic unit (OGU), six morphometric measurements (WIL, TAL, TRL, CUL, BIW, BID) and all 19 bioclimatic variables obtained from the WorldClim database.
Cy_Gsnap-alignment
SAM files of the 38 individuals for two alignments to a pseudo-reference genome (PRG), one to the American Crow (Cy_AmericanCrow) and one to the Zebra Finch (Cy_ZebraFinch). Alignments were performed with Gsnap, and the files are those obtained after all filters (clone filter, optical duplicates and mapping quality).
Cy_Gsnap_alignment.tar.gz
Cy_STACKS_output
STACKS output from the correction module (rx_stacks) of the ref_map (two PRGs: PRG_AmericanCrow and PRG_ZebraFinch) and de_novo (runs 1 and 2 kept separate: Denovo_R1 and Denovo_R2) pipelines, together with the population maps used (Cy_2POP.txt and Cy_3POP.txt). Included are the catalog.tags files, the de novo assemblies and the output of the populations module. Each directory includes summary statistics per locus and overall, and all SNP data in STRUCTURE, FASTA, Genepop, PLINK and VCF formats.
Cy_LFMM_input
Input used in the LFMM analyses, including genetic (Cy.lfmm) and environmental (Cy.lfmm.env) data for each individual. The order of the samples is given in the file "ORDER-SAMPLES-plink-LFMM.txt".
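As a rough illustration of how the genetic and environmental inputs relate, the sketch below checks that two LFMM-style tables line up with a sample-order list. It assumes the usual LFMM layout (one row per individual, one column per SNP, genotypes coded 0/1/2 with 9 or -9 for missing data, and an environment table with one row per individual). The function names and the toy values are hypothetical and are not taken from the deposited files.

```python
def parse_matrix(text, cast=float):
    """Parse whitespace-delimited text into a list of rows, skipping blank lines."""
    return [[cast(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def check_lfmm_pair(geno_text, env_text, sample_order):
    """Check that genotype and environment rows match the sample order.

    Assumes one row per individual in both tables (the usual LFMM layout).
    Returns (n_individuals, n_snps, n_env_variables).
    """
    geno = parse_matrix(geno_text, cast=int)
    env = parse_matrix(env_text, cast=float)
    if not (len(geno) == len(env) == len(sample_order)):
        raise ValueError("row counts differ between genotypes, environment and sample order")
    if len({len(row) for row in geno}) != 1:
        raise ValueError("genotype rows have unequal numbers of SNPs")
    allowed = {0, 1, 2, 9, -9}  # 9 and -9 are the conventional missing-data codes
    if any(g not in allowed for row in geno for g in row):
        raise ValueError("unexpected genotype code")
    return len(geno), len(geno[0]), len(env[0])


# Toy data (hypothetical): 3 individuals, 4 SNPs, 2 environmental variables.
toy_geno = "0 1 2 9\n1 1 0 0\n2 0 0 1\n"
toy_env = "25.1 1100\n26.3 900\n24.8 1300\n"
toy_order = ["ind_A", "ind_B", "ind_C"]
shape = check_lfmm_pair(toy_geno, toy_env, toy_order)
print(shape)
```

With the real files, the text of Cy.lfmm and Cy.lfmm.env would be read from disk and the sample list taken from the order file; the exact layout of that order file is not described here, so it is left to the reader to parse.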
Cy_Baypass_input
Input used in the BayPass analyses, including genetic (Cy.BayPass2allele.count.txt) and environmental (Cy.BayPass-var-X.txt) data. The environmental data are in separate files, one for each of the final nine climatic variables used (e.g., Cy.BayPass-var-bio4.txt). The order of the samples is given in the file "ORDER-SAMPLES-BayPass.txt".
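For readers unfamiliar with allele-count input, the following sketch converts BayPass-style counts into allele frequencies. It assumes the standard BayPass genotype layout, in which each row is a SNP and columns come in pairs per population (count of allele 1, then count of allele 2). The function name and the toy counts are hypothetical, and whether the deposited file follows this layout exactly should be confirmed against the file itself.

```python
def allele_frequencies(count_text):
    """Convert paired allele counts to allele-1 frequencies.

    Each non-blank line is one SNP; columns are read in pairs
    (allele 1 count, allele 2 count) for each population.
    A population with no observed alleles at a SNP gets None.
    """
    freqs = []
    for line in count_text.splitlines():
        if not line.strip():
            continue
        counts = [int(tok) for tok in line.split()]
        if len(counts) % 2:
            raise ValueError("expected an even number of columns (two per population)")
        row = []
        for a1, a2 in zip(counts[0::2], counts[1::2]):
            total = a1 + a2
            row.append(a1 / total if total else None)
        freqs.append(row)
    return freqs


# Toy counts (hypothetical): 2 SNPs in 3 populations.
toy_counts = "10 0 5 5 2 8\n4 4 0 0 6 2\n"
toy_freqs = allele_frequencies(toy_counts)
print(toy_freqs)
```

Returning None for empty cells keeps missing data distinct from a true frequency of zero, which matters when frequencies are later compared against environmental covariates.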
Cy_Bedassle
Input and output BEDASSLE files. Genetic data for each population are in the allele.counts and sample.sizes files. Environmental data matrices are located in the .csv files. Output is in R object files, with statistic values in the PARAM and INTERVALS .txt files. Files are provided for the candidate SNPs and the non-candidate SNPs, for allele 1 and allele 2 (A1 and A2), and for their replicate runs (Rep_A1 and Rep_A2).
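Analyses of IBD and IBE of this kind compare genetic differentiation against pairwise geographic and environmental distances between sampling units. The sketch below shows one common way to build such matrices: great-circle (haversine) distance for geography and absolute difference for a single environmental variable. This is an assumption for illustration only; how the deposited .csv matrices were actually computed is not stated here, and the site coordinates and values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_matrices(sites):
    """Build symmetric geographic and environmental distance matrices.

    sites: list of (longitude, latitude, environmental_value) tuples.
    Returns (geo, env), each a list of lists with zeros on the diagonal.
    """
    n = len(sites)
    geo = [[0.0] * n for _ in range(n)]
    env = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            lon_i, lat_i, env_i = sites[i]
            lon_j, lat_j, env_j = sites[j]
            geo[i][j] = geo[j][i] = haversine_km(lon_i, lat_i, lon_j, lat_j)
            env[i][j] = env[j][i] = abs(env_i - env_j)
    return geo, env


# Toy sites (hypothetical): longitude, latitude, annual precipitation in mm.
toy_sites = [(0.0, 0.0, 800.0), (1.0, 0.0, 1200.0), (0.0, 1.0, 1000.0)]
geo, env = distance_matrices(toy_sites)
print(round(geo[0][1], 2), env[0][1])
```

In practice the coordinates would come from the general-data catalog file and the environmental values from the bioclimatic table, with rows ordered to match the allele.counts and sample.sizes files.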
Cy_Fasta_BLAST
FASTA file of consensus sequences containing all SNPs (CLOCUS_) used in the study and BLAST (MATCHCLOCUS and Q_Query=) results of these loci.