Genomic divergence and introgression between cryptic species of a widespread North American songbird
Data files
Sep 29, 2023 version files 3.25 GB
- adapter_files_trimming.zip (29.15 KB)
- Askelson_etal_bioinformatics_v2.txt (11.78 KB)
- README.md (1.02 KB)
- wbnu_plate1to5_411_variants_wholegenome.vcf.gz (3.25 GB)
Abstract
Analysis of genomic variation among related populations can sometimes reveal distinct species that were previously undescribed due to similar morphological appearances, and close examination of such cases can provide much insight regarding speciation. Genomic data can also reveal the role of reticulated evolution in differentiation and speciation. White-breasted nuthatches (Sitta carolinensis) are widely distributed North American songbirds that are currently classified as a single species but have been suspected to represent a case of cryptic speciation. Previous genetic analyses suggested four divergent groups, but it was unclear whether these represented multiple reproductively isolated species. Using extensive genomic sampling of over 350 white-breasted nuthatches from across North America and a new chromosome-level reference genome, we asked whether white-breasted nuthatches comprise multiple species and whether introgression has occurred between divergent populations. Genomic variation at over 300,000 loci revealed four highly differentiated populations (Pacific, n = 45; Eastern, n = 23; Rocky Mountains North, n = 138; and Rocky Mountains South, n = 150) with adjacent geographic ranges. We observed a moderate degree of admixture between Rocky Mountain populations but only a small number of hybrids between the Rockies and the Eastern population. The rarity of hybrids, together with high levels of differentiation between populations, supports some level of reproductive isolation between populations. Between populations, we show evidence for introgression from a divergent ghost lineage of white-breasted nuthatches into the Rocky Mountains South population, which is otherwise closely related to Rocky Mountains North. We conclude that white-breasted nuthatches are best considered at least three species and that ghost lineage introgression has contributed to differentiation between the two Rocky Mountain populations.
White-breasted nuthatches provide a dramatic case of morphological similarity despite high genomic differentiation, and the varying levels of reproductive isolation among the four groups provide an example of the speciation continuum.
README: Genomic divergence and introgression in a cryptic species complex: Scripts and VCF file
This data archive contains:
- Askelson_etal_bioinformatics_v2.txt: an example bioinformatic script showing how the data were processed.
- adapter_files_trimming.zip: a zip file of the .fa files that were used to trim adapters from reads in Trimmomatic.
- wbnu_plate1to5_411_variants_wholegenome.vcf.gz: a compressed VCF file with SNP information for each of the samples used in the study. This VCF file is the initial output of SNP calling, without any filtering.
Description of the Data and file structure
The VCF file contains SNP (single nucleotide polymorphism) data for each sample. It is unfiltered, and its genotypes are unphased. A genotype of "./." indicates missing data.
Our script shows how the data were processed, with comments in the file describing each step. The programs were run on Digital Research Alliance of Canada computing resources.
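Because the VCF is unfiltered, a useful first check is how often each sample has a missing ("./.") genotype. The following is a minimal Python sketch of that check and is not part of the archived processing script. The function names `open_vcf` and `missing_rate_per_sample` are hypothetical, and the sketch assumes that GT is the first entry of each sample's colon-separated genotype field, as the VCF specification requires when GT is present. It reads the file line by line, so the 3.25 GB archive never needs to be loaded into memory.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF file as text (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def missing_rate_per_sample(lines):
    """Return {sample_name: fraction of sites with a missing genotype}.

    `lines` is any iterable of VCF text lines, e.g. an open file handle.
    Assumes GT is the first colon-separated entry of each sample column.
    """
    samples = []
    missing = []
    n_sites = 0
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            missing = [0] * len(samples)
            continue
        if not samples:
            continue  # ignore anything before the header line
        n_sites += 1
        for i, entry in enumerate(fields[9:]):
            genotype = entry.split(":", 1)[0]
            # Unphased calls use "/", but accept "|" as well.
            alleles = genotype.replace("|", "/").split("/")
            if "." in alleles:
                missing[i] += 1
    if n_sites == 0:
        return {name: 0.0 for name in samples}
    return {name: count / n_sites for name, count in zip(samples, missing)}


# Toy input for illustration only; these are not records from the archive.
toy_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\n",
    "chr1\t10\t.\tA\tG\t50\t.\t.\tGT:DP\t0/1:12\t./.:0\n",
    "chr1\t20\t.\tC\tT\t50\t.\t.\tGT:DP\t./.:0\t./.:0\n",
]
print(missing_rate_per_sample(toy_vcf))  # {'s1': 0.5, 's2': 1.0}
```

To run the same check on the archived file, pass an open handle instead of the toy list, for example `with open_vcf("wbnu_plate1to5_411_variants_wholegenome.vcf.gz") as fh: rates = missing_rate_per_sample(fh)`.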
Methods
Genetic samples were collected from the field and museum specimens. DNA was extracted from samples and used to prepare GBS (genotyping-by-sequencing; Elshire et al. 2011) libraries for sequencing. Sample information can be found in the supplementary materials of our manuscript. We have also included an example bioinformatic script for data processing along with the initial VCF file of SNPs prior to any filtering.