Local adaptation and the evolution of genome architecture in threespine stickleback
Data files (Sep 25, 2023 version; 164.86 MB total)
- genomic_resources.zip (164.85 MB)
- README.md (4.28 KB)
Abstract
Theory predicts that local adaptation should favour the evolution of a concentrated genetic architecture, where the alleles driving adaptive divergence are tightly clustered on chromosomes. Adaptation to marine vs. freshwater environments in threespine stickleback has resulted in an architecture that seems consistent with this prediction: divergence among populations is mainly driven by a few genomic regions harbouring multiple quantitative trait loci (QTL) for environmentally adapted traits, as well as candidate genes with well-established phenotypic effects. One theory for the evolution of these “genomic islands” is that rearrangements remodel the genome to bring causal loci into tight proximity, but this has not been studied explicitly. We tested this theory using synteny analysis to identify micro- and macro-rearrangements in the stickleback genome and assess their potential involvement in the evolution of genomic islands. To identify rearrangements, we conducted a de novo assembly of the closely related tubesnout (Aulorhynchus flavidus) genome and compared this to the genomes of threespine stickleback and two other closely related species. We found that small rearrangements, within-chromosome duplications, and lineage-specific genes (LSGs) were enriched around genomic islands, and that all three chromosomes harbouring large genomic islands have experienced macro-rearrangements. We also found that duplicates and micro-rearrangements are 9.9x and 2.9x more likely, respectively, to involve genes differentially expressed between marine and freshwater genotypes. While not conclusive, these results are consistent with the explanation that strong divergent selection on candidate genes drove the recruitment of rearrangements to yield clusters of locally adaptive loci.
This archive contains material related to the paper “Local adaptation and the evolution of genome architecture in threespine stickleback”, by Li et al.
GENERAL INFORMATION
- Data (genomic_resources): the genome assemblies, annotations, and lists of genes
- Supplemental information: contains the processed intermediate data files necessary to generate the figures and tables in the paper, as well as the Hi-C config files
- scripts: contains the scripts for running the main analyses and generating the figures/tables
SHARING/ACCESS INFORMATION
- Licenses/restrictions placed on the data: CC0 1.0 Universal (CC0 1.0) Public Domain
- Links to publications that cite or use the data: Li Q, Lindtke D, Rodríguez-Ramírez C, et al. Local adaptation and the evolution of genome architecture in threespine stickleback. Genome Biology and Evolution. 2022;14(6):evac075. https://doi.org/10.1093/gbe/evac075
- Recommended citation for this dataset: Li Q, Lindtke D, Rodríguez-Ramírez C, et al. (2022). Data from: Local adaptation and the evolution of genome architecture in threespine stickleback. Dryad Digital Repository. https://doi.org/10.5061/dryad.1c59zw3w3
#########################################################################
LSGs_permissive.txt: putative lineage-specific genes (LSGs) identified with permissive filtering
- Variable List:
ID: gene ID
stick_chrom: chromosome number in the stickleback genome
stick_start: start position on the chromosome
stick_end: end position on the chromosome
- Specialized formats: None
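This file and the other gene lists in this archive (LSGs_stringent.txt, duplicates.txt, microrearrangements.txt) are described with the same four variables, so one reader could serve all of them. The following is a minimal Python sketch, assuming the files are whitespace- or tab-delimited text with an optional header row; neither the delimiter nor the presence of a header is stated in this README. The names `GeneRecord` and `read_gene_list`, and the sample rows, are hypothetical.

```python
import io
from typing import List, NamedTuple, TextIO


class GeneRecord(NamedTuple):
    gene_id: str
    chrom: str  # kept as text, since chromosome labels may not be plain integers
    start: int
    end: int


def read_gene_list(handle: TextIO) -> List[GeneRecord]:
    """Parse rows with columns ID, stick_chrom, stick_start, stick_end.

    Assumption: columns are separated by whitespace (tabs or spaces).
    Rows whose position columns are not integers (e.g. a header) are skipped.
    """
    records = []
    for line in handle:
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or incomplete line
        if not (fields[2].isdigit() and fields[3].isdigit()):
            continue  # header row or non-numeric positions
        records.append(
            GeneRecord(fields[0], fields[1], int(fields[2]), int(fields[3]))
        )
    return records


# Hypothetical sample rows for illustration; not taken from the real files.
sample = io.StringIO(
    "ID\tstick_chrom\tstick_start\tstick_end\n"
    "geneA\t4\t1000\t2500\n"
    "geneB\t21\t300\t900\n"
)
genes = read_gene_list(sample)
```

To use it on a real file, pass an open file handle, for example `read_gene_list(open("LSGs_permissive.txt"))`, after checking that the file layout matches the assumptions above.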
#########################################################################
LSGs_stringent.txt: lineage-specific genes (LSGs) identified with stringent filtering
- Variable List:
ID: gene ID
stick_chrom: chromosome number in the stickleback genome
stick_start: start position on the chromosome
stick_end: end position on the chromosome
- Specialized formats: None
#########################################################################
Peichel_job_3.hints.cds.reformat.uniq_combined_BROADS1.fasta: stickleback gene sequences
- Variable List: None
- Specialized formats: can be read by any text editor (TextEdit, Notepad, VIM, etc.)
#########################################################################
duplicates.txt: the ID and position of duplicated genes
- Variable List:
ID: gene ID
stick_chrom: chromosome number in the stickleback genome
stick_start: start position on the chromosome
stick_end: end position on the chromosome
- Specialized formats: None
#########################################################################
microrearrangements.txt: the ID and position of micro-rearranged genes
- Variable List:
ID: gene ID
stick_chrom: chromosome number in the stickleback genome
stick_start: start position on the chromosome
stick_end: end position on the chromosome
- Specialized formats: None
#########################################################################
stb_stickleback_gene_full.gff3: stickleback gene mapping results
- Variable List: None
- Specialized formats: gff3, with 9 required fields: ID (chromosome number), Source, Feature, Start, End, Score, Strand, Phase, Attributes. Can be read by any text editor or by genome browsers such as IGV.
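The nine GFF3 fields can also be read programmatically. The following is a minimal Python sketch that relies only on the general GFF3 conventions (tab-separated columns, `key=value` attribute pairs separated by semicolons, and comment lines starting with `#`); it has not been checked against this particular file. The names `GffFeature` and `parse_gff3_line`, and the sample line, are hypothetical.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import unquote


class GffFeature(NamedTuple):
    seqid: str
    source: str
    feature: str
    start: int
    end: int
    score: str   # "." when undefined, so kept as text
    strand: str
    phase: str   # "." for features without a phase
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Optional[GffFeature]:
    """Return a GffFeature, or None for blank lines and '#' comment lines."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(cols)}")
    attributes = {}
    for item in cols[8].split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attributes[key.strip()] = unquote(value.strip())
    return GffFeature(
        cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], cols[7], attributes,
    )


# Hypothetical line for illustration; not taken from the real file.
example = "chrIV\texample\tgene\t1000\t2500\t.\t+\t.\tID=geneA;Name=example_gene"
feature = parse_gff3_line(example)
```

Keeping Score and Phase as text avoids failing on the `.` placeholder that GFF3 uses for undefined values.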
#########################################################################
tsnV2.2_hints.cds.reformat.uniq.fasta: tubesnout gene sequences
- Variable List: None
- Specialized formats: can be read by any text editor (TextEdit, Notepad, VIM, etc.)
#########################################################################
tsnV2.2_hints.gtf: de novo gene annotation of the tubesnout genome
- Variable List: None
- Specialized formats: gtf, with 9 required fields: ID (chromosome number), Source, Feature Type, Start, End, Score, Strand, Phase, Attributes. Can be read by any text editor or by genome browsers such as IGV.
#########################################################################
tubesnoutV2.2_chr1-23_and_UN.fasta.gz: the assembled tubesnout genome
- Variable List: None
- Specialized formats: this is a gzip-compressed fasta file. Decompress it first (e.g., with gunzip); it can then be read by any text editor (TextEdit, Notepad, VIM, etc.)
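As an alternative to decompressing the file by hand, a gzip-compressed fasta file can be streamed directly. The following is a minimal Python sketch using only the standard library; the function names `iter_fasta` and `sequence_lengths` are hypothetical, and the demo writes a tiny made-up file rather than reading the real assembly.

```python
import gzip
import os
import tempfile
from typing import Dict, Iterable, Iterator, Tuple


def iter_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from the lines of a fasta file."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""  # first word of the header
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def sequence_lengths(path: str) -> Dict[str, int]:
    """Map each sequence name in a gzip-compressed fasta file to its length."""
    with gzip.open(path, "rt") as handle:
        return {name: len(seq) for name, seq in iter_fasta(handle)}


# Demo on a tiny made-up file; sequence names here are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fasta.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write(">chr1 demo\nACGT\nAC\n>chrUn\nGG\n")
    lengths = sequence_lengths(demo_path)
```

For the real assembly, `sequence_lengths("tubesnoutV2.2_chr1-23_and_UN.fasta.gz")` would hold every sequence in memory one at a time, which keeps memory use bounded by the longest chromosome.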
The genomic resources directory (genomic_resources) is hosted on Dryad; the scripts and supplemental information directories are hosted on Zenodo.
Please contact samuel.yeaman@ucalgary.ca with any questions.