Data from: Complex evolutionary processes maintain an ancient chromosomal inversion
Data files
Jun 06, 2023 version files 481.25 MB
- braker.gtf
- G_tknulli.txt
- README.md
- RedwoodDatCombined.csv
- sub_perform_comb_og.nex
- tknulli_chroms_hic_output.fasta.gz
- TknulliSnps.txt
Abstract
Genome re-arrangements such as chromosomal inversions are often involved in adaptation. As such, they experience natural selection, which can erode genetic variation. Thus, whether and how inversions can remain polymorphic for extended periods of time remains debated. Here we combine genomics, experiments, and evolutionary modeling to elucidate the processes maintaining an inversion polymorphism associated with the use of a challenging host plant (Redwood trees) in Timema stick insects. We show that the inversion is maintained by a combination of processes, finding roles for life-history trade-offs, heterozygote advantage, local adaptation to different hosts, and gene flow. We use models to show how such multi-layered regimes of balancing selection and gene flow provide resilience to help buffer populations against the loss of genetic variation, maintaining the potential for future evolution. We further show that the inversion polymorphism has persisted for millions of years and is not a result of recent introgression. We thus find that rather than being a nuisance, the complex interplay of evolutionary processes provides a mechanism for the long-term maintenance of genetic variation.
Methods
We conducted a laboratory experiment to test for a potential effect of the Perform locus on performance (here growth and survival) in T. knulli reared on Ceanothus or Redwood. For this experiment, each of the 138 T. knulli collected was placed individually in a 500 mm plastic container, with air holes for breathing punched into the container lid using a needle. Each stick insect was then fed fresh plant material from either Ceanothus or Redwood every second day (when survival was recorded, see below). Host-plant treatment was determined randomly and was independent of the host from which the stick insect was collected. We then measured weight and survival at 15 and 21 d as metrics of performance; survival (dead or alive) was also monitored every second day over the course of the 21-d experiment.
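The random assignment of host-plant treatment described above can be illustrated with a short sketch. This is not the authors' procedure: the function name `assign_hosts`, the individual labels, the fixed seed, and the use of independent draws per individual (rather than, say, a balanced design) are all assumptions made for illustration.

```python
import random

def assign_hosts(individual_ids, hosts=("Ceanothus", "Redwood"), seed=0):
    """Assign each individual a host-plant treatment at random.

    Assumption: each individual gets an independent, equal-probability draw,
    so the treatment cannot depend on the host the insect was collected from.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return {ind: rng.choice(hosts) for ind in individual_ids}

# Hypothetical labels for the 138 individuals in the experiment.
ids = [f"tknulli_{i:03d}" for i in range(1, 139)]
treatments = assign_hosts(ids)
```

Because the draw never looks at the collection host, treatment and source host are independent by construction.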
After the performance experiment, we isolated DNA from each of the 138 T. knulli. Frozen legs from each individual were ground to a powder using a Qiagen TissueLyser (Qiagen Inc., Valencia, CA). Genomic DNA was then extracted using Qiagen DNeasy Blood and Tissue kits, using a protocol with slightly altered incubation temperatures and times. We used a reduced-representation technique (i.e., genotyping-by-sequencing or GBS) to construct DNA sequencing libraries. Genomic DNA from each individual was digested with two restriction endonucleases, MseI (four-base recognition site) and EcoRI (six-base recognition site). Illumina adaptors with unique 8 to 10 bp DNA barcodes for each individual were ligated to EcoRI cut sites, and a base Illumina adaptor was ligated to MseI cut sites. Barcoded fragment libraries were then PCR amplified using Illumina primers and a high-fidelity proofreading polymerase (iProof, Bio-Rad, Hercules, CA). PCR products were pooled into a single library, which was then quality screened using an Agilent Bioanalyzer automated electrophoresis device. To reduce the portion of the genome targeted for sequencing, the reduced-representation library was then size-selected for DNA fragments 350 to 450 bp in length using a Pippin Prep quantitative electrophoresis unit (Sage Science, Beverly, MA) at the University of Texas Genome Sequencing and Analysis Facility (UTGSAF). The size-selected library was then sequenced using S2 chemistry and a single lane on an Illumina NovaSeq 6000 at UTGSAF.
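To make the double-digest and size-selection logic concrete, the following is a minimal in-silico sketch, not the laboratory protocol itself. It uses the standard recognition sites for EcoRI (GAATTC) and MseI (TTAA); the function names, the toy sequence, and the decision to ignore adaptor and barcode length when applying the 350 to 450 bp window are assumptions for illustration.

```python
import re

# Standard recognition sites; the offset is where the enzyme cuts within the site.
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}

def cut_positions(seq, site, offset):
    """Cut coordinates for every (possibly overlapping) occurrence of site."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]

def double_digest(seq, min_len=350, max_len=450):
    """Return (start, end) for fragments flanked by one EcoRI cut and one
    MseI cut whose length falls inside the size-selection window."""
    cuts = []
    for name, (site, offset) in ENZYMES.items():
        cuts += [(pos, name) for pos in cut_positions(seq, site, offset)]
    cuts.sort()
    fragments = []
    # Adjacent cuts bound one fragment; keep it only if the two ends differ,
    # since only EcoRI-MseI fragments carry both adaptor types.
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right and min_len <= end - start <= max_len:
            fragments.append((start, end))
    return fragments

# Toy sequence: one EcoRI site, 400 filler bases, one MseI site.
toy = "GAATTC" + "A" * 400 + "TTAA" + "C" * 10
selected = double_digest(toy)
```

Only a small fraction of possible fragments passes both filters, which is how the approach reduces the portion of the genome that is sequenced.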
We aligned the newly acquired T. knulli GBS reads to our new T. knulli reference genome for SNP calling. We generated the T. knulli reference genome using a combination of PacBio reads and Illumina reads from Chicago and Hi-C genomic libraries. DNA extraction, library preparation, DNA sequencing, and de novo genome assembly were performed by Dovetail Genomics (now Cantata Bio). Specifically, 105.4 Gbp of PacBio data (∼75.2× coverage) were generated over two SMRT cells and used to build an initial assembly with the Falcon assembler (with default options). This initial assembly was further improved and scaffolded using the HiRise assembler (with default options). To do so, 58.8 Gbp of Chicago DNA sequence data and 72.9 Gbp of Hi-C DNA sequence data were generated on a HiSeq X machine (150 bp paired-end reads). A total of three female stick insects were used for the assembly, and the individuals were chosen based on a preliminary analysis (i.e., PCA on genotypes obtained by GBS) that suggested they were homozygous for the Perform PRW allele. The final assembly created using Dovetail’s HiRise Assembly pipeline comprised 1,322,373,696 base pairs (bp) with an N50 of 83,614,905 bp. Based on BUSCO version 4.0.5 with the eukaryota_odb10 database (70 species, 255 BUSCOs), the assembly included 216 complete BUSCOs (212 single copy and four duplicated; 84.7%), 15 fragmented BUSCOs (5.9%), and 24 missing BUSCOs (9.4%). We then used the BRAKER2 pipeline to annotate this genome.
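The assembly statistics reported above can be reproduced from simple arithmetic. The sketch below is illustrative only: `n50` and `busco_percentages` are hypothetical helper names, the N50 definition used is the common one (the length of the shortest sequence in the smallest set of longest sequences covering at least half the assembly), and the scaffold lengths in the example are made up. The BUSCO counts are the ones stated in the text.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")

def busco_percentages(complete, fragmented, missing):
    """Express BUSCO category counts as percentages of all BUSCOs searched."""
    total = complete + fragmented + missing
    counts = {"complete": complete, "fragmented": fragmented, "missing": missing}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

# Made-up scaffold lengths, just to show the N50 calculation.
example_n50 = n50([2, 3, 4, 5, 6])  # total 20; 6 + 5 = 11 >= 10, so N50 = 5

# BUSCO counts as reported for the T. knulli assembly (255 BUSCOs in total).
busco = busco_percentages(complete=216, fragmented=15, missing=24)
# busco == {"complete": 84.7, "fragmented": 5.9, "missing": 9.4}
```

The three percentages match the reported values and the counts sum to the 255 BUSCOs in the eukaryota_odb10 set.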
Usage notes
This data set contains only ASCII text files.