Dryad

Selection on a small genomic region underpins differentiation in multiple color traits between two warbler species

Cite this dataset

Wang, Silu et al. (2021). Selection on a small genomic region underpins differentiation in multiple color traits between two warbler species [Dataset]. Dryad. https://doi.org/10.5061/dryad.bnzs7h470

Abstract

Speciation is one of the most important processes in biology, yet the study of the genomic changes underlying this process is in its infancy. North American warbler species Setophaga townsendi and S. occidentalis hybridize in a stable hybrid zone, following a period of geographic separation. Genomic differentiation accumulated during geographic isolation can be homogenized by introgression at secondary contact, while genetic regions that cause low hybrid fitness can be shielded from such introgression. Here we examined the genomic underpinning of speciation by investigating: (1) the genetic basis of divergent pigmentation traits between species, (2) variation in differentiation across the genome, and (3) the evidence for selection maintaining differentiation in the pigmentation genes. Using tens of thousands of single nucleotide polymorphisms (SNPs) genotyped in hundreds of individuals within and near the hybrid zone, genome-wide association mapping revealed a single SNP associated with cheek, crown, breast coloration, and flank streaking, reflecting pleiotropy (one gene affecting multiple traits) or close physical linkage of different genes affecting different traits. This SNP is within an intron of the RALY gene, hence we refer to it as the RALY SNP. We then examined between-species genomic differentiation, using both genotyping-by-sequencing and whole genome sequencing. We found that the RALY SNP is within one of the highest peaks of differentiation, which contains three genes known to influence pigmentation: ASIP, EIF2S2, and RALY (the ASIP-RALY gene block). Heterozygotes at this gene block are likely of reduced fitness, as the geographic cline of the RALY SNP has been narrow over two decades. Together, these results reflect at least one barrier to gene flow within this narrow (~200kb) genomic region that modulates plumage difference between species. 
Despite extensive gene flow between species across the genome, this study provides evidence that selection on a phenotype-associated genomic region maintains a stable species boundary. 

Usage notes

To make Figures 1 and 3, please use the following two files.
-RALY.genopheno.EV.csv            
This data table has the following columns used for figure 1:
genomic.EV1 (genomic eigenvector 1);
genomic.EV2 (genomic eigenvector 2); 
W (longitude); 
N (latitude).
This data table has the following columns used for figure 3:
genomic.EV1 (genomic eigenvector 1);
genomic.EV2 (genomic eigenvector 2); 
crown.b.agecr (crown coloration, age-corrected);
cheek.b.agecr (cheek coloration, age-corrected);
breast.y.b.agecr (breast coloration, age-corrected);
flank.int.agecr (flank streaking, age-corrected).

-Individual.background.genotype.phenotype.csv        
This data table has the following column used for figure 1: 
plumage.hybridindex (mean plumage hybrid index of the eight plumage landmark scores). 
In addition, this table has a "plateID" column whose values correspond to the individuals in the .vcf file.
For sampling location of each individual, please use the coordinates specified in the N and W columns. 
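
The column lists above can be read with a few lines of Python. This is a minimal sketch, not the authors' plotting code: it assumes the file is a comma-separated table with a header row and that missing values appear as empty or non-numeric strings such as "NA". The helper name read_numeric_columns and the demo rows are hypothetical placeholders, not values from the dataset.

```python
import csv
import io

# Columns listed above for Figure 3 in RALY.genopheno.EV.csv.
FIG3_COLUMNS = [
    "genomic.EV1", "genomic.EV2", "crown.b.agecr",
    "cheek.b.agecr", "breast.y.b.agecr", "flank.int.agecr",
]


def read_numeric_columns(handle, columns):
    """Return one dict per row, keeping only `columns`, parsed as floats.

    Rows with a missing or non-numeric value (assumed to be coded as an
    empty string or "NA") in any requested column are skipped.
    """
    reader = csv.DictReader(handle)
    absent = [name for name in columns if name not in (reader.fieldnames or [])]
    if absent:
        raise KeyError(f"columns not found in header: {absent}")
    rows = []
    for record in reader:
        try:
            rows.append({name: float(record[name]) for name in columns})
        except (ValueError, TypeError):
            continue  # incomplete row
    return rows


# Placeholder rows for illustration only (not real data).
demo = io.StringIO(
    ",".join(FIG3_COLUMNS) + "\n"
    "0.10,-0.02,0.5,0.4,0.3,0.2\n"
    "0.12,NA,0.6,0.5,0.4,0.3\n"
)
rows = read_numeric_columns(demo, FIG3_COLUMNS)
print(len(rows), rows[0]["genomic.EV1"])
```

To use it on the real table, pass an open file instead of the demo buffer, for example open("RALY.genopheno.EV.csv", newline="").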

Figure 2
The following csv files contain the GWAS scans of 7 phenotypes for figure 2. 
RIGHT.cheek.b_association.full.jun.2020.csv
back.a_association.full.jun.2020.csv
bib.b.association.full.jun.2020.csv
breast.y.b_association.full.jun.2020.csv
breast.y.ext_association.full.jun.2020.csv
crown.b_association.full.jun.2020.csv
flank.int_association.full.jun.2020.csv

Each of the files above holds the genotype association results for one phenotype. The following columns are used to make Figure 2:
1, CHROM (chromosome);
2, Position (base position along the chromosome);
3, A1 (allele 1 at this position);
4, A2 (allele 2 at this position);
5, Pc1df (p-value after correcting for genomic control).
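
The columns above are enough to prepare a Manhattan-style scan. The sketch below converts Pc1df to -log10(p) and reports the strongest association. It assumes a comma-separated file with a header row and that missing p-values are non-numeric; load_gwas and top_hit are hypothetical helper names, and the demo rows are placeholders rather than real results.

```python
import csv
import io
import math


def load_gwas(handle):
    """Read CHROM, Position and Pc1df, adding -log10(p) for each SNP."""
    records = []
    for row in csv.DictReader(handle):
        try:
            p_value = float(row["Pc1df"])
            position = int(float(row["Position"]))
        except (ValueError, TypeError):
            continue  # missing or non-numeric entry
        if not 0.0 < p_value <= 1.0:
            continue  # not a usable p-value
        records.append({
            "chrom": row["CHROM"],
            "pos": position,
            "p": p_value,
            "neg_log10_p": -math.log10(p_value),
        })
    return records


def top_hit(records):
    """Return the SNP with the smallest genomic-control-corrected p-value."""
    return max(records, key=lambda r: r["neg_log10_p"])


# Placeholder rows for illustration only (not real results).
demo = io.StringIO(
    "CHROM,Position,A1,A2,Pc1df\n"
    "chrA,100,A,G,0.5\n"
    "chrA,200,C,T,1e-8\n"
    "chrB,50,G,A,NA\n"
)
gwas = load_gwas(demo)
best = top_hit(gwas)
print(best["chrom"], best["pos"], round(best["neg_log10_p"], 2))
```

Plotting neg_log10_p against position, grouped by chrom, gives the usual per-phenotype scan.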

Figure 4
-WGSslidingwindow.csv            
In this data table, the columns used to generate Figure 4 are: 
CHROM (chromosome);
midPos (mid point position of the sliding window);
Fst (Weir & Cockerham fixation index)

-HEWA.TOWA.SNP.FST.csv
In this data table, the columns used to generate Figure 4 are: 
CHROM (chromosome);
POS (base position of the SNP);
WEIR_AND_COCKERHAM_FST (SNP-specific Fixation Index). 
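
Differentiation peaks can be located by ranking either table by Fst. The sketch below is one simple way to do that and is not taken from the authors' scripts. It assumes comma-separated files with header rows and that undefined Fst values appear as "nan", "-nan" or other non-numeric strings; top_fst is a hypothetical helper, and the demo rows are placeholders.

```python
import csv
import io
import math


def top_fst(handle, pos_col="midPos", fst_col="Fst", n=5):
    """Return the n rows with the highest Fst as (CHROM, position, Fst).

    Defaults match WGSslidingwindow.csv; for HEWA.TOWA.SNP.FST.csv pass
    pos_col="POS" and fst_col="WEIR_AND_COCKERHAM_FST".
    """
    rows = []
    for row in csv.DictReader(handle):
        try:
            fst = float(row[fst_col])
            position = float(row[pos_col])
        except (ValueError, TypeError):
            continue  # missing or non-numeric entry
        if math.isnan(fst):
            continue  # undefined Fst
        rows.append((row["CHROM"], position, fst))
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


# Placeholder rows for illustration only (not real estimates).
demo = io.StringIO(
    "CHROM,midPos,Fst\n"
    "chrA,5000,0.02\n"
    "chrA,15000,0.85\n"
    "chrB,5000,-nan\n"
    "chrB,15000,0.10\n"
)
peaks = top_fst(demo, n=2)
print(peaks)
```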

Figure 5
-boot.raly.w2.100000.aug.201...       
-boot.plum.w2.100000.csv
-boot.nu.pc1.w2.100000.july....
Each of the data tables above contains 100,000 bootstrap estimates of the change in squared cline width (RALY SNP cline, plumage cline, genomic cline) between the two sampling periods.
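
One common way to summarize bootstrap estimates like these is a percentile interval, together with the share of replicates on each side of zero. The sketch below shows that summary under stated assumptions: the usage notes do not say which summary the authors used or how the bootstrap files are laid out, so the functions take a plain list of numbers. percentile_interval and fraction_positive are hypothetical names, and the demo values are synthetic.

```python
import math


def percentile_interval(values, level=0.95):
    """Percentile interval using linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no bootstrap values supplied")
    alpha = (1.0 - level) / 2.0

    def quantile(prob):
        index = prob * (len(ordered) - 1)
        low, high = math.floor(index), math.ceil(index)
        weight = index - low
        return ordered[low] * (1.0 - weight) + ordered[high] * weight

    return quantile(alpha), quantile(1.0 - alpha)


def fraction_positive(values):
    """Share of replicates with a positive change in squared cline width."""
    return sum(1 for v in values if v > 0) / len(values)


# Synthetic values for illustration only: the integers -50 .. 50.
demo_changes = list(range(-50, 51))
lower, upper = percentile_interval(demo_changes)
share_wider = fraction_positive(demo_changes)
print(round(lower, 2), round(upper, 2), round(share_wider, 3))
```

An interval that spans zero is consistent with no detectable change in cline width between the two sampling periods.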

-TOWA.HEWA.hybridzone.plate1...
VCF file for all analyses. Please use the plateID column of the Individual.background.genotype.phenotype.csv file to match the background information of each individual.
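
The matching step can be sketched as follows. In a VCF file the sample names are the columns after the ninth field of the "#CHROM" header line. The sketch assumes those names are identical to the plateID values in the background table. vcf_sample_ids and match_samples are hypothetical helpers, and the sample IDs and values in the demo are placeholders.

```python
import csv
import io


def vcf_sample_ids(lines):
    """Return the sample names from the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # Nine fixed columns (CHROM .. FORMAT) precede the samples.
            return line.rstrip("\r\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def match_samples(sample_ids, background_handle, key="plateID"):
    """Pair each VCF sample with its background row; list unmatched samples."""
    by_id = {row[key]: row for row in csv.DictReader(background_handle)}
    matched = {s: by_id[s] for s in sample_ids if s in by_id}
    missing = [s for s in sample_ids if s not in by_id]
    return matched, missing


# Placeholder content for illustration only (hypothetical IDs and values).
demo_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
    "\tplate1_A01\tplate1_A02\tplate1_A03\n",
]
demo_background = io.StringIO(
    "plateID,plumage.hybridindex\n"
    "plate1_A01,0.0\n"
    "plate1_A02,1.0\n"
)
samples = vcf_sample_ids(demo_vcf)
matched, missing = match_samples(samples, demo_background)
print(sorted(matched), missing)
```

Checking the missing list before any analysis catches ID mismatches early, for example a prefix added by the genotyping pipeline.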

Funding

Natural Sciences and Engineering Research Council, Award: 331015731