Genetic parallelism underlying repeated bill divergence in Island Scrub-Jays (Aphelocoma insularis) increases at higher genetic levels of organization
Data files
Nov 25, 2025 version files 187.54 MB
- chapter_2_individuals_plate_key_final.csv (3.60 KB)
- issj_CASJ_radiator_filtered.vcf (49.04 MB)
- issj_ZEFI_ordered_imputed_Beagle_neutral.vcf (132.33 MB)
- issj_ZEFI_ordered_imputed_Beagle_snp_pos.txt (2.08 MB)
- ISSJ.ZEFIcorr.csv (4.08 MB)
- README.md (3.66 KB)
Abstract
Whether the same genes underlie parallel adaptive trait evolution remains an open question in biology. The degree of parallelism is expected to increase at higher hierarchical levels due to the hierarchical nature of the genetic basis of traits (i.e., single nucleotide polymorphisms (SNPs) to genes to pathways to phenotype), which genomic approaches can help elucidate. Previous research shows a large degree of variation in the extent to which phenotypic parallelism shares the same genetic mechanisms in nature. Here, we analyzed the degree of genetic parallelism underlying repeated divergence in bill morphology of Island Scrub-Jays (Aphelocoma insularis), across three naturally replicated pine-oak ecotones on Santa Cruz Island, California, USA. We analyzed 66,503 SNPs generated using restriction site-associated DNA sequencing (RADseq) in 161 Island Scrub-Jays to identify candidate SNPs associated with environmental variation and divergence in bill morphology. We then examined signatures of parallelism in genomic regions containing candidate SNPs and the associated pathways. We found little evidence for parallelism at the SNP or gene level, but substantial parallelism at the pathway level. Our results support the view that the degree of genetic parallelism underlying convergent evolution depends on the genetic level of organization being analyzed.
https://doi.org/10.5061/dryad.hx3ffbgpf
Description of the data and file structure
This repository contains most of the files needed to run the code; the raw fasta files used by the bioinformatics scripts are not included. Data were generated using BestRAD sequencing.
Files and variables
File: chapter_2_individuals_plate_key_final.csv
Description: Sample metadata indicating the genomic library in which each individual was sequenced.
Variables
- file_name: raw file name from the sequencer
- tissue_number: Colorado State University tissue identifier
- library: genomic library in which the individual was sequenced. Each library was a plate of 93 individuals and 3 duplicates.
File: ISSJ.ZEFIcorr.csv
Description: Zebra Finch genome coordinates corresponding to the California Scrub-Jay-aligned Island Scrub-Jay reads. Generated using SatsumaSynteny.
Variables
- ZEFI_CHROM: Zebra Finch chromosome number
- ZEFIPOS: Zebra Finch locus position
- CHROM: California Scrub-Jay scaffold
- POS: California Scrub-Jay position
- SNP_ID: SNP identifier for California Scrub-Jay-aligned Island Scrub-Jay loci
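A minimal Python sketch of how the coordinate table might be used to translate a California Scrub-Jay scaffold position into its Zebra Finch equivalent. It assumes the CSV has a header row whose names match the variables listed above and that both position columns hold integers; neither assumption has been checked against the file. The function name `load_coordinate_map` and the toy rows are hypothetical.

```python
import csv
import io


def load_coordinate_map(handle):
    """Map (CASJ scaffold, CASJ position) -> (Zebra Finch chromosome, position).

    Assumes a header row with the columns ZEFI_CHROM, ZEFIPOS, CHROM, POS.
    """
    mapping = {}
    for row in csv.DictReader(handle):
        key = (row["CHROM"], int(row["POS"]))
        mapping[key] = (row["ZEFI_CHROM"], int(row["ZEFIPOS"]))
    return mapping


# Toy rows with made-up values, only to illustrate the assumed layout.
example = io.StringIO(
    "ZEFI_CHROM,ZEFIPOS,CHROM,POS,SNP_ID\n"
    "1,1500,scaffold_10,200,snp_a\n"
    "2,98000,scaffold_11,4500,snp_b\n"
)
coords = load_coordinate_map(example)
print(coords[("scaffold_10", 200)])
```

For the real file, the same function could be called on `open("ISSJ.ZEFIcorr.csv", newline="")`. Rows with missing positions would need to be handled before the `int` conversion.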
File: issj_ZEFI_ordered_imputed_Beagle_snp_pos.txt
Description: Zebra Finch genome coordinates for each SNP ID. Needed to map SNPs back to their Zebra Finch chromosome.
Variables
- Column 1: "chromosome", Zebra Finch chromosome number
- Column 2: "ZEFIPOS", Zebra Finch locus position
- Column 3: "snp_id", Zebra Finch SNP ID
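The three-column layout above can be read into a lookup table along these lines. This is a sketch under stated assumptions: the file is taken to be whitespace- or tab-delimited, possibly with a quoted header row, and positions are taken to be integers. The function names `load_snp_positions` and `snps_per_chromosome` and the toy rows are hypothetical.

```python
import io
from collections import defaultdict


def load_snp_positions(handle):
    """Return {snp_id: (chromosome, position)} from a three-column text file.

    Assumes whitespace-delimited columns in the order
    chromosome, ZEFIPOS, snp_id; lines that do not fit are skipped.
    """
    positions = {}
    for line in handle:
        fields = [field.strip('"') for field in line.split()]
        if len(fields) != 3:
            continue  # blank or malformed line
        chromosome, pos, snp_id = fields
        if not pos.isdigit():
            continue  # header row (or a non-integer position)
        positions[snp_id] = (chromosome, int(pos))
    return positions


def snps_per_chromosome(positions):
    """Count how many SNPs fall on each Zebra Finch chromosome."""
    counts = defaultdict(int)
    for chromosome, _pos in positions.values():
        counts[chromosome] += 1
    return dict(counts)


# Toy rows with made-up values, only to illustrate the assumed layout.
example = io.StringIO(
    '"chromosome" "ZEFIPOS" "snp_id"\n'
    "1 1500 snp_a\n"
    "1 2750 snp_b\n"
    "2 98000 snp_c\n"
)
positions = load_snp_positions(example)
per_chrom = snps_per_chromosome(positions)
print(per_chrom)
```

If the real file turns out to be comma-delimited, the `line.split()` call would need to be swapped for a `csv.reader`.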
File: issj_CASJ_radiator_filtered.vcf
Description: Unimputed, filtered dataset of California Scrub-Jay-aligned Island Scrub-Jay reads in VCF format.
File: issj_ZEFI_ordered_imputed_Beagle_neutral.vcf
Description: BEAGLE-imputed, filtered dataset of California Scrub-Jay-aligned Island Scrub-Jay reads in VCF format.
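As a quick sanity check after download, the sample and variant counts of either VCF can be tallied with the standard library alone. The sketch below relies only on the general VCF layout (meta lines starting with `##`, a `#CHROM` header line with nine fixed columns followed by sample names, then one line per variant); the function name `summarize_vcf` and the toy records are hypothetical and not taken from the dataset.

```python
import io


def summarize_vcf(handle):
    """Return (sample_names, variant_count) for an uncompressed VCF."""
    samples = []
    n_variants = 0
    for line in handle:
        if line.startswith("##"):
            continue  # meta-information line
        if line.startswith("#CHROM"):
            # The first nine columns are fixed; the rest are sample names.
            samples = line.rstrip("\n").split("\t")[9:]
            continue
        if line.strip():
            n_variants += 1
    return samples, n_variants


# Toy VCF with made-up records, only to illustrate the layout.
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "ind1", "ind2"]
rows = [
    ["1", "1500", "snp_a", "A", "G", ".", "PASS", ".", "GT", "0/1", "0/0"],
    ["2", "98000", "snp_b", "C", "T", ".", "PASS", ".", "GT", "1/1", "0/1"],
]
text = "##fileformat=VCFv4.2\n" + "\t".join(header) + "\n"
text += "".join("\t".join(row) + "\n" for row in rows)

samples, n_variants = summarize_vcf(io.StringIO(text))
print(len(samples), n_variants)
```

For the real files, the function could be called on `open("issj_CASJ_radiator_filtered.vcf")`. Reading line by line keeps memory use low for the larger imputed file.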
Code/software
001-bioinformatics scripts final -- copy of the scripts used in bioinformatic processing of raw Island Scrub-Jay sequence data.
0015 California Scrub-Jay bioinformatics -- scripts used to complete alignment to the California Scrub-Jay reference genome.
002- Chapter 2 neutral pop gen final -- scripts used to conduct neutral population genetic analyses.
003- identification of parallel selection by habitat -- scripts used to complete genotype-by-environment analyses for all three pine-oak ecotones.
004- characterization of loci underlying variation in morphology final -- scripts used to complete genome-wide association analyses for all three pine-oak ecotones.
006- genomic parallelism in ISSJ -- RMarkdown detailing the use of PAUP to examine the phylogenetic history of the jays, and the genetic parallelism analysis using AF-vapeR. The phylogenetic history ultimately relied on previous results by Langin et al. 2015, because the PAUP analysis added nothing beyond what was already known.
Access information
Other publicly accessible locations of the data:
- California Scrub-Jay reference genome https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/028/536/675/GCA_028536675.1_bAphCal1.0.hap1/GCA_028536675.1_bAphCal1.0.hap1_genomic.fna.gz
- Zebra Finch reference genome https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/957/565/GCF_003957565.2_bTaeGut1.4.pri/GCF_003957565.2_bTaeGut1.4.pri_genomic.fna.gz
