Data from: Combining experimental evolution and genomics to understand how seed beetles adapt to a marginal host plant

Citation

Gompert, Zachariah et al. (2020), Data from: Combining experimental evolution and genomics to understand how seed beetles adapt to a marginal host plant, Dryad, Dataset, https://doi.org/10.5061/dryad.3j9kd51dw

Abstract

Genes that affect adaptive traits have been identified, but our knowledge of the genetic basis of adaptation in a more general sense (across multiple traits) remains limited. We combined population-genomic analyses of evolve and resequence experiments, genome-wide association mapping of performance traits, and analyses of gene expression to fill this knowledge gap, and shed light on the genomics of adaptation to a marginal host (lentil) by the seed beetle Callosobruchus maculatus. Using population-genomic approaches, we detected modest parallelism in allele frequency change across replicate lines during adaptation to lentil. Mapping populations derived from each lentil-adapted line revealed a polygenic basis for two host-specific performance traits (weight and development time), which had low to modest heritabilities. We found less evidence of parallelism in genotype-phenotype associations across these lines than in allele frequency changes during the experiments. Differential gene expression caused by differences in recent evolutionary history exceeded that caused by immediate rearing host. Together, the three genomic data sets suggest that genes affecting traits other than weight and development time are likely to be the main causes of parallel evolution, and that detoxification genes (especially cytochrome P450s and beta-glucosidase) could be especially important for colonization of lentil by C. maculatus.

Methods

We analyzed six experimental lines in the current study: the M line, which was originally collected from South India and has since been maintained in the lab on its ancestral mung-bean host \cite{messina1991life, mitchell1991traits}, three lentil-adapted lines (L1, L2, and L14, each independently derived from M), and two reversion lines (L1R and L2R) that were switched back to mung bean after many generations on lentil (Fig.\ \ref{lines}, Table \ref{linesummtab}). The South India M line has been maintained at a census population size of 2000-2500 individuals for $>$300 generations; past genetic analyses suggest a variance effective population size of 1149 beetles \cite{gompert16}. Details on the establishment of L1, L2 and L14 can be found in \cite{messina2009experimentally,gompert16} (L1 and L2) and \cite{rego19} (L14). The reversion lines, L1R and L2R, were initiated to test for genetic trade-offs between performance on mung bean versus lentil. These lines were shifted back onto the ancestral host to examine whether their ability to use lentil would decrease (as predicted by a trade-off hypothesis) \cite{messina2015loss}. Thus, allele frequency change in the lentil lines should reflect adaptation to lentil (and genetic drift), whereas changes in the reversion lines relative to their source lentil lines should reflect adaptation back to mung bean (and perhaps drift to a lesser extent). Past work has attempted to parse the roles of selection and drift \cite{gompert16, rego19}, but here we simply focus on change. Herein, we analyze patterns of genome-wide allele frequency change for combinations of all six of these lines (we ignore two additional lines, L3 and L3R, as we lack trait-mapping data for these lines). Trait-mapping data come from backcross mapping populations created by crossing M with L1, L2 and L14 (denoted BC-L1, BC-L2, and BC-L14).
Gene expression data come from M, L1 and L1R, that is, from the source mung-bean line, a lentil line, and its corresponding reversion line. We measured gene expression in all three lines when reared in mung bean (L1$^\mathrm{M}$, L1R$^\mathrm{M}$, M$^\mathrm{M}$), and in L1 and L1R when reared in lentil (L1$^\mathrm{L}$, L1R$^\mathrm{L}$). Rearing the M line on lentil for expression data was not possible given the extremely low survival rates.
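To make the two comparisons described above concrete (lentil line versus its source M line, and reversion line versus its source lentil line), the following is a minimal Python sketch of per-locus allele frequency change. It is not the authors' pipeline, which is provided in ratesCode.R; the function name and the frequencies are hypothetical and serve only to illustrate the bookkeeping.

```python
def allele_frequency_change(p_source, p_derived):
    """Per-locus change in allele frequency from a source line to a line derived from it."""
    if len(p_source) != len(p_derived):
        raise ValueError("both lines must be scored at the same loci")
    return [derived - source for source, derived in zip(p_source, p_derived)]


# Hypothetical frequencies at three loci (illustrative values, not data from this study).
p_M = [0.50, 0.20, 0.90]    # source mung-bean line
p_L1 = [0.80, 0.25, 0.40]   # lentil-adapted line derived from M
p_L1R = [0.60, 0.25, 0.45]  # reversion line derived from L1

# Change during adaptation to lentil: derived lentil line minus M.
change_on_lentil = allele_frequency_change(p_M, p_L1)

# Change after the switch back to mung bean: reversion line minus its source lentil line.
change_on_reversion = allele_frequency_change(p_L1, p_L1R)

for locus, (d_lentil, d_rev) in enumerate(zip(change_on_lentil, change_on_reversion)):
    print(f"locus {locus}: lentil {d_lentil:+.2f}, reversion {d_rev:+.2f}")
```

In this toy example the first and third loci change in opposite directions on lentil and after reversion, which is the kind of pattern a trade-off hypothesis would predict; drift alone can also produce changes at individual loci.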

Usage Notes

This data publication contains the following data and script files:

C_Maculates_Lentil_QTL_Raw_Data_isolations.xlsx - Raw data collected from the trait mapping experiment for all eggs generated from the backcross. Each line cross is located on a separate sheet.

L1_phenotypes.txt - Text file of only the L1 backcross individuals used in the trait mapping analysis. The first column is individual IDs, the second is weight (mg), and the third is development time (days).

L2_phenotypes.txt - Text file of only the L2 backcross individuals used in the trait mapping analysis. The first column is individual IDs, the second is weight (mg), and the third is development time (days).

L14_phenotypes.txt - Text file of only the L14 backcross individuals used in the trait mapping analysis. The first column is individual IDs, the second is weight (mg), and the third is development time (days).

phenos.R - R script used for analysis of phenotypic data of backcrossed individuals.

wrap_qsub_slurm_bwa_mem.pl - Perl wrapper script used to align reads to the reference genome.

Callvar.sh - Bash script used to call variants.

vcfFilter.pl - Perl script used to filter variants for the VCF which contained all individuals.

vcfFilterwoCross.pl - Perl script used to filter variants for the VCF which excluded the backcross individuals.

filterSomeMore.pl - Follow-up filtering for all individuals to remove high-coverage loci.

filterSomeMore_woCross.pl - Follow-up filtering for the subset of individuals that excluded the backcross individuals.

vcf2gl.pl - Converts VCF to genotype likelihood format.

DPupdate.pl - Script used to update depth field in VCF files after subsetting out backcross individuals.

parse_barcodes768.pl - Perl script used to remove sequence barcodes and replace fastq header with associated individual IDs.

ratesCode.R - R script for calculating evolutionary rates/change and fitting the associated hidden Markov model.

grabBvs.R - R script for breeding value analyses.

mod_tr_g_L1X_full.txt - Input genotype file for backcross line L1 for genome-wide association mapping with GEMMA.

mod_tr_g_L2X_full.txt - Input genotype file for backcross line L2 for genome-wide association mapping with GEMMA.

mod_tr_g_L14X_full.txt - Input genotype file for backcross line L14 for genome-wide association mapping with GEMMA.
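As a reading aid for the phenotype files listed above (L1_phenotypes.txt, L2_phenotypes.txt, L14_phenotypes.txt), here is a minimal Python sketch that parses the three-column layout described in these notes: individual ID, weight (mg), development time (days). It assumes whitespace-delimited columns and that any header row or missing-value code is non-numeric; neither detail is stated here, so check the files before relying on it. The function name and the example rows are hypothetical.

```python
import io
from statistics import mean


def read_phenotypes(handle):
    """Parse rows of: individual ID, weight (mg), development time (days).

    Assumes whitespace-delimited columns (an assumption, not stated in the notes).
    Rows with fewer than three fields, or whose trait fields are not numeric
    (e.g. a header row or an NA code), are skipped.
    """
    records = []
    for line in handle:
        fields = line.split()
        if len(fields) < 3:
            continue
        ind_id, weight, dev_time = fields[:3]
        try:
            records.append((ind_id, float(weight), float(dev_time)))
        except ValueError:
            continue
    return records


# Made-up rows standing in for a phenotype file (not data from this study).
example = io.StringIO(
    "ind001 4.2 27\n"
    "ind002 NA 30\n"
    "ind003 5.0 29\n"
)
records = read_phenotypes(example)

mean_weight_mg = mean(weight for _, weight, _ in records)
mean_dev_days = mean(days for _, _, days in records)
print(len(records), mean_weight_mg, mean_dev_days)
```

To read a real file, pass an open file handle instead of the in-memory example, for instance `with open("L1_phenotypes.txt") as fh: records = read_phenotypes(fh)`.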

 

Funding

National Science Foundation, Award: DEB-1638768

Utah Agricultural Experiment Station, Award: 9308