Data from: Selection on a single locus drives plumage differentiation in the Rufous-collared Sparrow (Zonotrichia capensis)
Data files
Apr 24, 2025 version files 93.72 GB
- 9641_3270_75521_HWYTKBGX5_PL1_ATCACG_R1.fastq.gz (912.07 MB)
- 9641_3270_75521_HWYTKBGX5_PL1_ATCACG_R2.fastq.gz (931.07 MB)
- 9641_3270_75522_HWYTKBGX5_PL2_CGATGT_R1.fastq.gz (1.54 GB)
- 9641_3270_75522_HWYTKBGX5_PL2_CGATGT_R2.fastq.gz (1.57 GB)
- 9641_3270_75523_HWYTKBGX5_PL3_TTAGGC_R1.fastq.gz (1.62 GB)
- 9641_3270_75523_HWYTKBGX5_PL3_TTAGGC_R2.fastq.gz (1.64 GB)
- 9641_3270_75524_HWYTKBGX5_PL4_TGACCA_R1.fastq.gz (2.45 GB)
- 9641_3270_75524_HWYTKBGX5_PL4_TGACCA_R2.fastq.gz (2.50 GB)
- 9641_3270_75525_HWYTKBGX5_PL5_ACAGTG_R1.fastq.gz (1.83 GB)
- 9641_3270_75525_HWYTKBGX5_PL5_ACAGTG_R2.fastq.gz (1.87 GB)
- 9641_3270_75526_HWYTKBGX5_PL6_GCCAAT_R1.fastq.gz (1.86 GB)
- 9641_3270_75526_HWYTKBGX5_PL6_GCCAAT_R2.fastq.gz (1.90 GB)
- 9641_3270_75527_HWYTKBGX5_PL7_CAGATC_R1.fastq.gz (1.83 GB)
- 9641_3270_75527_HWYTKBGX5_PL7_CAGATC_R2.fastq.gz (1.87 GB)
- 9641_3270_75528_HWYTKBGX5_PL8_ACTTGA_R1.fastq.gz (3.29 GB)
- 9641_3270_75528_HWYTKBGX5_PL8_ACTTGA_R2.fastq.gz (3.36 GB)
- 9641_3270_75529_HWYTKBGX5_PL9_GATCAG_R1.fastq.gz (3.18 GB)
- 9641_3270_75529_HWYTKBGX5_PL9_GATCAG_R2.fastq.gz (3.25 GB)
- 9641_3270_75530_HWYTKBGX5_PL10_TAGCTT_R1.fastq.gz (1.42 GB)
- 9641_3270_75530_HWYTKBGX5_PL10_TAGCTT_R2.fastq.gz (1.45 GB)
- 9641_3270_75531_HWYTKBGX5_PL11_GGCTAC_R1.fastq.gz (2.57 GB)
- 9641_3270_75531_HWYTKBGX5_PL11_GGCTAC_R2.fastq.gz (2.62 GB)
- 9641_3270_75532_HWYTKBGX5_PL12_CTTGTA_R1.fastq.gz (1.14 GB)
- 9641_3270_75532_HWYTKBGX5_PL12_CTTGTA_R2.fastq.gz (1.17 GB)
- 9641_3270_75533_HWYTKBGX5_PL13_AGTCAA_R1.fastq.gz (2.48 GB)
- 9641_3270_75533_HWYTKBGX5_PL13_AGTCAA_R2.fastq.gz (2.54 GB)
- 9641_3270_75534_HWYTKBGX5_PL14_AGTTCC_R1.fastq.gz (2.99 GB)
- 9641_3270_75534_HWYTKBGX5_PL14_AGTTCC_R2.fastq.gz (3.06 GB)
- 9641_3270_75535_HWYTKBGX5_PL15_ATGTCA_R1.fastq.gz (2.54 GB)
- 9641_3270_75535_HWYTKBGX5_PL15_ATGTCA_R2.fastq.gz (2.60 GB)
- 9641_3270_75536_HWYTKBGX5_PL16_GTCCGC_R1.fastq.gz (1.37 GB)
- 9641_3270_75536_HWYTKBGX5_PL16_GTCCGC_R2.fastq.gz (1.40 GB)
- 9641_3270_75537_HWYTKBGX5_PL17_GTGAAA_R1.fastq.gz (1.78 GB)
- 9641_3270_75537_HWYTKBGX5_PL17_GTGAAA_R2.fastq.gz (1.81 GB)
- 9641_3270_75538_HWYTKBGX5_PL18_GTGGCC_R1.fastq.gz (2.07 GB)
- 9641_3270_75538_HWYTKBGX5_PL18_GTGGCC_R2.fastq.gz (2.11 GB)
- adapters.txt (248 B)
- all_sites_whole_genome.vcf.gz (12.87 GB)
- mtDNA.fas (554.92 KB)
- read_me_dryad.txt (1.69 KB)
- README.md (7.70 KB)
- scaffold42_ALL_SITES.vcf (1.14 GB)
- scaffold42_SNPs.vcf (23.80 MB)
- SNPs_whole_genome.vcf (5.14 GB)
- thin25kb.vcf (19.78 MB)
Abstract
The Rufous-collared Sparrow (Zonotrichia capensis) shows phenotypic variation throughout its distribution. In particular, the Patagonian subspecies Z. c. australis is strikingly distinct from all other subspecies, lacking the black crown stripes that characterize the species, with a uniformly grey head and overall paler plumage. We sequenced whole genomes of 18 individuals (nine Z. c. australis and nine from other subspecies from northern Argentina) to explore the genomic basis of these color differences and to investigate how they may have evolved. We detected a single ~465-kb divergence peak on chromosome 5 that contrasted with a background of low genomic differentiation and contains the ST5 gene. ST5 regulates RAB9A, which is required for melanosome biogenesis and melanocyte pigmentation in mammals, making it a strong candidate gene for the melanic plumage polymorphism within Z. capensis. This genomic island of differentiation may have emerged because of selection acting on allopatric populations or against gene flow on populations in physical and genetic contact. Mitochondrial DNA indicated that Z. c. australis diverged from other subspecies ~400,000 years ago, suggesting a putative role of Pleistocene glaciations. Phenotypic differences are consistent with Gloger’s rule, which predicts lighter colored individuals in colder and drier climates like that of Patagonia.
https://doi.org/10.5061/dryad.dz08kps7s
Description of the data and file structure
Here you will find raw genomic data and processed data used for analyses from the manuscript titled Selection on a single locus drives plumage differentiation in the Rufous-collared Sparrow (Zonotrichia capensis).
Contents
1) Raw genomic reads obtained for 18 individuals (PL1 to PL18) on an Illumina NextSeq 500 lane at the Cornell Institute for Biotechnology core facility. All individuals were pooled and sequenced with 2x150 paired-end reads, so there are two files (fastq.gz) per individual. File names follow this scheme: 9641_3270_75521_HWYTKBGX5_PL1_ATCACG_R1.fastq.gz, where PL1 is the sample (PL1 to PL18), ATCACG is the adapter index used for that sample, and R1 is read one (each sample has two reads, R1 and R2). The remaining fields in the name are details of the sequencing run and are not relevant to the analyses.
2) "adapters.txt" : adapters list to be removed with AdapterRemoval
3) Processed data used for analyses:
a) SNPs_whole_genome.vcf : file containing 10,798,248 SNPs after all filtering. This set of SNPs was used for PCA and the location of divergence peaks across the entire genome. See main text and supporting information for details on how we obtained this set of SNPs from raw sequencing data.
b) thin25kb.vcf : thinned dataset of 41,778 SNPs (one per 25,000 bp) used for STRUCTURE and K-means analyses.
c) all_sites_whole_genome.vcf.gz: compressed file containing ALL SITES (SNPs and invariant) used for genome-wide nucleotide diversity calculations (after filtering)
d) scaffold42_ALL_SITES.vcf : file containing ALL SITES (SNPs and invariant) for scaffold 42 after filtering, used to investigate the emergence of the genomic island of differentiation containing the ST5 gene.
e) scaffold42_SNPs.vcf : file containing 49,646 SNPs from scaffold 42 after filtering
f) mtDNA.fas: fasta file containing the alignment of mtDNA used for phylogenetic analyses. For more details on sequence length and how these sequences were obtained, see the main text and Supporting information.
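The file naming scheme described in item 1 can be parsed programmatically, for example to pair R1 and R2 files by sample. The following is a minimal sketch, not part of the deposited data or the authors' pipeline. The function name `parse_fastq_name` and the `FastqName` type are hypothetical, and the pattern assumes six-base indexes and samples PL1 to PL18, as in the file list.

```python
import re
from typing import NamedTuple


class FastqName(NamedTuple):
    """Fields of interest in a raw-read file name (hypothetical helper type)."""
    sample: str  # e.g. "PL1"
    index: str   # adapter index, e.g. "ATCACG"
    read: str    # "R1" or "R2"


# Assumption: the sample, index and read fields are the last three
# underscore-separated fields before the ".fastq.gz" extension.
_PATTERN = re.compile(r"_(PL(?:1[0-8]|[1-9]))_([ACGT]{6})_(R[12])\.fastq\.gz$")


def parse_fastq_name(filename: str) -> FastqName:
    """Extract sample, adapter index and read number from a file name."""
    match = _PATTERN.search(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return FastqName(*match.groups())


if __name__ == "__main__":
    print(parse_fastq_name("9641_3270_75521_HWYTKBGX5_PL1_ATCACG_R1.fastq.gz"))
```

The leading run-specific fields are ignored, which matches the note above that they are not relevant to the analyses.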
Files and variables
File: 9641_3270_75522_HWYTKBGX5_PL2_CGATGT_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75522_HWYTKBGX5_PL2_CGATGT_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75521_HWYTKBGX5_PL1_ATCACG_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75538_HWYTKBGX5_PL18_GTGGCC_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75538_HWYTKBGX5_PL18_GTGGCC_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75537_HWYTKBGX5_PL17_GTGAAA_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75537_HWYTKBGX5_PL17_GTGAAA_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75536_HWYTKBGX5_PL16_GTCCGC_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75536_HWYTKBGX5_PL16_GTCCGC_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75535_HWYTKBGX5_PL15_ATGTCA_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75535_HWYTKBGX5_PL15_ATGTCA_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75534_HWYTKBGX5_PL14_AGTTCC_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75534_HWYTKBGX5_PL14_AGTTCC_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75533_HWYTKBGX5_PL13_AGTCAA_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75533_HWYTKBGX5_PL13_AGTCAA_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75532_HWYTKBGX5_PL12_CTTGTA_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75532_HWYTKBGX5_PL12_CTTGTA_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75531_HWYTKBGX5_PL11_GGCTAC_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75531_HWYTKBGX5_PL11_GGCTAC_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75530_HWYTKBGX5_PL10_TAGCTT_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75530_HWYTKBGX5_PL10_TAGCTT_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75529_HWYTKBGX5_PL9_GATCAG_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75529_HWYTKBGX5_PL9_GATCAG_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75528_HWYTKBGX5_PL8_ACTTGA_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75528_HWYTKBGX5_PL8_ACTTGA_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75527_HWYTKBGX5_PL7_CAGATC_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75527_HWYTKBGX5_PL7_CAGATC_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75526_HWYTKBGX5_PL6_GCCAAT_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75526_HWYTKBGX5_PL6_GCCAAT_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75525_HWYTKBGX5_PL5_ACAGTG_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75525_HWYTKBGX5_PL5_ACAGTG_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75524_HWYTKBGX5_PL4_TGACCA_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75524_HWYTKBGX5_PL4_TGACCA_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75523_HWYTKBGX5_PL3_TTAGGC_R2.fastq.gz
Description: raw sequencing data
File: 9641_3270_75523_HWYTKBGX5_PL3_TTAGGC_R1.fastq.gz
Description: raw sequencing data
File: 9641_3270_75521_HWYTKBGX5_PL1_ATCACG_R2.fastq.gz
Description: raw sequencing data
File: adapters.txt
Description: adapters list to be removed from raw sequencing data with AdapterRemoval
Variables
- AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG:
- AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT:
File: scaffold42_ALL_SITES.vcf
Description: file containing ALL SITES (SNPs and invariant) for scaffold 42 after filtering, used to investigate the emergence of the genomic island of differentiation containing the ST5 gene.
File: scaffold42_SNPs.vcf
Description: file containing 49,646 SNPs from scaffold 42 after filtering
File: thin25kb.vcf
Description: thinned dataset of 41,778 SNPs (one per 25,000 bp) used for STRUCTURE and K-means analyses.
File: SNPs_whole_genome.vcf
Description: file containing 10,798,248 SNPs after all filtering. This set of SNPs was used for PCA and the location of divergence peaks across the entire genome. See main text and supporting information for details on how we obtained this set of SNPs from raw sequencing data.
File: mtDNA.fas
Description: fasta file containing the alignment of mtDNA used for phylogenetic analyses. For more details on sequence length and how these sequences were obtained see main text and Supporting information.
File: all_sites_whole_genome.vcf.gz
Description: compressed file containing ALL SITES (SNPs and invariant) used for genome-wide nucleotide diversity calculations (after filtering)
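After downloading, a quick sanity check is to count the records in a VCF file and compare the result with the SNP counts given in the descriptions above (for example 49,646 for scaffold42_SNPs.vcf). The following is a minimal sketch using only the Python standard library. The function name `count_vcf_records` is hypothetical, and it assumes one site per non-header line, which is how the VCF format lays out records.

```python
import gzip


def count_vcf_records(path: str) -> int:
    """Count data lines (non-empty lines not starting with '#') in a VCF file.

    Plain-text and gzip-compressed (.gz) files are both handled.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    count = 0
    with opener(path, "rt") as handle:
        for line in handle:
            # Header and metadata lines in VCF start with '#'.
            if line.strip() and not line.startswith("#"):
                count += 1
    return count


if __name__ == "__main__":
    import sys

    for vcf_path in sys.argv[1:]:
        print(vcf_path, count_vcf_records(vcf_path))
```

For the ALL SITES files the count includes invariant sites as well as SNPs, so it is not comparable with the SNP totals. The whole-genome files are large, and a single pass over them may take a while.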
