Predictors of genomic diversity within North American squamates
Data files
Mar 20, 2023 version files 501.26 MB
- GBS_Outfiles_PopGenome.zip
- Genbank_Mitochondrial_Alignments_Feb2023.zip
- R_Python_Scripts_Jan2023.zip
- README
Abstract
Comparisons of intraspecific genetic diversity across species can reveal the roles of geography, ecology, and life history in shaping biodiversity. The wide availability of mitochondrial DNA (mtDNA) sequences in open-access databases makes this marker practical for conducting analyses across several species in a common framework, but patterns may not be representative of overall species diversity. Here, we gathered new and existing mtDNA sequences and genome-wide nuclear data (genotyping-by-sequencing; GBS) for 30 North American squamate species sampled in the Southeastern and Southwestern United States. We estimated mtDNA nucleotide diversity for two mtDNA genes, COI (22 species alignments; average 16 sequences) and cytb (22 species; average 58 sequences), as well as nuclear heterozygosity and nucleotide diversity from GBS data for 118 individuals (30 species; four individuals and 6,820–44,309 loci per species). We showed that nuclear genomic diversity estimates were highly consistent across individuals for some species, while other species showed large differences depending on the locality sampled. Range size was positively correlated with both cytb diversity (Phylogenetically Independent Contrasts: R² = 0.31, p = 0.007) and GBS diversity (R² = 0.21, p = 0.006), while other predictors differed across the top models for each dataset. Mitochondrial and nuclear diversity estimates were not correlated within species, although sampling differences in the data available made these datasets difficult to compare. Further study of mtDNA and nuclear diversity sampled across species’ ranges is needed to evaluate the roles of geography and life history in structuring diversity across a variety of taxonomic groups.
Methods
The datasets consist of mitochondrial, genotyping-by-sequencing (GBS), and trait data for 30 squamate species. New genetic data were collected for 12 species and combined for analysis with previously published data for 18 species. Two mitochondrial genes (COI and cytb) were sequenced via Sanger sequencing and assembled in Geneious; published GenBank sequences were downloaded and aligned for each species. Genetic diversity metrics were generated using the 'pegas' R package. GBS data were collected for 12 species and analyzed along with previously published data for 18 species. GBS data were assembled in ipyrad v. 0.7.28, and genetic diversity metrics were estimated using the R packages 'PopGenome' and 'adegenet'. Trait data were obtained from previously published work. Additional details are provided in the associated manuscript.
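To make the nucleotide diversity metric concrete, the sketch below computes it for a single alignment as the mean proportion of differing sites over all pairs of sequences. It is an illustration only, not the scripts in the archive: the function names, the toy sequences, and the choice to skip sites with gaps or ambiguous bases on a per-pair basis are all assumptions made for this example.

```python
from itertools import combinations

# Unambiguous bases; anything else (gaps, N, IUPAC codes) is skipped per pair.
VALID_BASES = set("ACGT")


def pairwise_differences(seq_a, seq_b):
    """Return (number of differing sites, number of comparable sites)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    sites = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            sites += 1
            if a != b:
                diffs += 1
    return diffs, sites


def nucleotide_diversity(alignment):
    """Mean per-site proportion of differences over all sequence pairs."""
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    proportions = []
    for seq_a, seq_b in combinations(alignment, 2):
        diffs, sites = pairwise_differences(seq_a, seq_b)
        if sites > 0:  # ignore pairs with no comparable sites
            proportions.append(diffs / sites)
    if not proportions:
        raise ValueError("no comparable sites in any pair")
    return sum(proportions) / len(proportions)


# Hypothetical toy alignment (not taken from the dataset).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGTAC-TAC",
]

if __name__ == "__main__":
    print(f"nucleotide diversity: {nucleotide_diversity(toy_alignment):.4f}")
```

For real alignments, the packages named above remain the reference; this sketch only shows what the per-species summary statistic measures.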
Usage notes
R/RStudio