Data from: Sexually-selected differences in warbler plumage are related to a putative inversion on the Z chromosome
Data files
Aug 28, 2024 version files 5.44 GB
- AllPoolseq_notZ_ad.maf5.Q30.biAll.numChr.vcf.gz (2.34 GB)
- AllPoolseq_Zad.numChr.vcf (721.83 MB)
- East100_Autosomes_VCF.GATKfilteredGenotypes.ConsolChr.biAll.Thin1kb.vcf (2.13 GB)
- East100_Zchr_VCF.GATKfilteredGenotypesNumChr.biAll.vcf.gz (243.86 MB)
- R_code_for_poolFstat_analyses.txt (3.14 KB)
- README.md (5.34 KB)
- Sample_id__for_East100.xlsx (12.35 KB)
- Supporting_Data_Table_S1_WINY_20kb_Fst_Autosome_Genes.xlsx (79.54 KB)
- Supporting_Data_Table_S2_WINY_20kb_Fst_Z_Genes.xlsx (51.82 KB)
- Table_S1.docx (17.66 KB)
Abstract
Large structural variants in the genome, such as inversions, may play an important role in producing population structure and local adaptation to the environment through suppression of recombination. However, relatively few studies have linked inversions to phenotypic traits that are sexually selected and may play a role in reproductive isolation. Here we found that geographic differences in the sexually-selected plumage of a warbler, the common yellowthroat (Geothlypis trichas), are largely due to differences on the Z (sex) chromosome (males are ZZ), which contains at least one putative inversion spanning 40% (31/77 Mb) of its length. The inversions on the Z chromosome vary dramatically east and west of the Appalachian Mountains, which provides evidence of cryptic population structure within the range of the most widespread eastern subspecies (G. t. trichas). In an eastern (New York) and a western (Wisconsin) population of this subspecies, females prefer different male ornaments; larger black facial masks are preferred in Wisconsin and larger yellow breasts are preferred in New York. The putative inversion also contains genes related to vision, which could influence mating preferences. Thus, structural variants on the Z chromosome are associated with geographic differences in male ornaments and female choice, which may provide a mechanism for maintaining different patterns of sexual selection in spite of gene flow between populations of the same subspecies.
README: Data from: Sexually-selected differences in warbler plumage are related to a putative inversion on the Z chromosome
https://doi.org/10.5061/dryad.g1jwstqxd
Description of the data and file structure
Files and variables
Files on Dryad
- Table_S1.docx: A copy of this MS Word file is also on the publisher’s website. The file contains a summary of sample sizes and type of sequence data for each yellowthroat population.
- Supporting_Data_Table_S1_WINY_20kb_Fst_Autosome_Genes.xlsx (807 rows, 11 columns): Outlier genes (top 1% of Fst in 20 kb windows) between the NY (New York) and WI (Wisconsin) populations on the autosomes from pool-seq samples (based on analysis of the pool-seq VCF files; see the pool-seq VCF entry below). The pooled sequences are at BioProject PRJNA734331 and the reference genome is GenBank Assembly GCA_009764595.1. There were 763 annotated genes and 43 with unknown function. Empty cells under Gene Symbol and Pigment indicate that there is no known gene symbol or that the gene is not known to be involved in pigmentation, respectively. Only melanin- and carotenoid-related genes are indicated in the Pigment column; these are shaded gray and yellow, respectively. GeoTri Scaffold refers to the original genome assembly produced by the G10K-Vertebrate Genomes Project (bGeoTri1.pri.cur20191008.fasta.gz), available at: www.genomeark.org/genomeark-all/Geothlypis_trichas.html. GeoTri refers to the gene annotation produced by Sly et al., 2022 (https://doi.org/10.1073/pnas.2120482119). The annotations are in Dataset S01, available at: https://www.pnas.org/doi/10.1073/pnas.2120482119#supplementary-materials.
- Supporting_Data_Table_S2_WINY_20kb_Fst_Z_Genes.xlsx (273 rows, 11 columns): Outlier genes (top 1% of Fst in 20 kb windows) between the NY and WI populations on the Z chromosome from pool-seq samples (based on analysis of the pool-seq VCF files; see the pool-seq VCF entry below). The inversion is highlighted in yellow (approx. 33 to 64.4 Mb). Empty cells under Gene Symbol, Geneinfo, UniProt name and Carotenoid or Melanin Pigment indicate that there is no information for that variable and gene. Only melanin- and carotenoid-related genes are indicated in the Carotenoid or Melanin Pigment column; these are shaded gray for melanin genes and yellow for carotenoid genes. Dashes in the GO - Biological Processes column indicate no known GO term for that region. The pooled sequences are at BioProject PRJNA734331 and the reference genome is GenBank Assembly GCA_009764595.1. Of the 271 genes, 257 had annotations (GeneInfo or GO terms). Note that the Z chromosome is "Super_Scaffold_6" in the original genome produced by the G10K-Vertebrate Genomes Project (bGeoTri1.pri.cur20191008.fasta.gz), available at: www.genomeark.org/genomeark-all/Geothlypis_trichas.html.
- VCF (variant call format) files for the pool-seq samples, separated into autosomes only (AllPoolseq_notZ_ad.maf5.Q30.biAll.numChr.vcf.gz; 2.28 GB) and the Z chromosome (AllPoolseq_Zad.numChr.vcf; 688 MB). The samples in these files are pooled samples from AZ (Arizona), FL (Florida), GCYE (gray-crowned yellowthroat in Belize), NY (New York) and WI (Wisconsin). See Table S1 for more details of locations and the number of individuals pooled. The pooled sequences are at BioProject PRJNA734331 and the reference genome is GenBank Assembly GCA_009764595.1. Note that the chromosome number in the Z chromosome file is 6, which comes from “Super Scaffold 6” (the Z chromosome) in the original VGP genome (see above). R code for analyzing these files with poolfstat is in “R_code_for_poolFstat_analyses.txt”. More information about the format of VCF files is at https://samtools.github.io/hts-specs/VCFv4.3.pdf
- VCF (variant call format) files for the 100 individual whole genome samples (Bird Genoscape Project, www.birdgenoscape.org) from the east, separated into autosomes only (East100_Autosomes_VCF.GATKfilteredGenotypes.ConsolChr.biAll.Thin1kb.vcf; 2.28 GB) and the Z chromosome (East100_Zchr_VCF.GATKfilteredGenotypesNumChr.biAll.vcf.gz; 238 MB). The samples in these files are listed in the file “Sample_id__for_East100.xlsx”. More information about the format of VCF files is at https://samtools.github.io/hts-specs/VCFv4.3.pdf
- Sample_id__for_East100.xlsx (101 rows, 5 columns): List of the 100 individual whole genome samples from the Bird Genoscape Project (www.birdgenoscape.org). These common yellowthroat samples were used to make the East100 VCF files described above. Columns are: sample id, state or province of sample, city of sample, latitude of sample and longitude of sample.
- R_code_for_poolFstat_analyses.txt: Text file with notes on using poolfstat in R to get Fst values for plotting in Fig. 2 and other analyses.
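
Because the VCF files above are a mix of gzip-compressed (.vcf.gz) and plain-text (.vcf) files, a quick check of the sample columns and chromosome labels can be useful before running any analysis. The following is a minimal Python sketch, not part of the deposited code: the helper names `open_vcf` and `vcf_samples_and_chroms` and the `max_records` parameter are illustrative assumptions, and it relies only on the general VCF layout (a `#CHROM` header line followed by tab-separated records).

```python
import gzip
from pathlib import Path


def open_vcf(path):
    """Open a VCF as text, handling .gz compression transparently."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def vcf_samples_and_chroms(path, max_records=1000):
    """Return (sample names, chromosome labels) from the start of a VCF.

    Sample names come from the #CHROM header line (columns 10 onward).
    Chromosome labels are collected, in order of first appearance, from
    at most max_records data records so that large files are not read
    in full.
    """
    samples = []
    chroms = []
    seen = 0
    with open_vcf(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("##"):
                continue  # skip blank lines and meta-information lines
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                samples = fields[9:]  # first nine columns are fixed fields
                continue
            if fields[0] not in chroms:
                chroms.append(fields[0])
            seen += 1
            if seen >= max_records:
                break
    return samples, chroms
```

For example, `vcf_samples_and_chroms("AllPoolseq_Zad.numChr.vcf")` would return the pool column names and the chromosome labels found in the first 1000 records; according to the description above, the Z chromosome file should use the label 6. Only the beginning of the file is scanned, so labels that first appear later in a multi-chromosome file are not reported unless `max_records` is increased.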
Access information
Other publicly accessible locations of the data:
- NCBI BioProject PRJNA734331 (contains bam files of the pool-seq sequences)