Migration is a key life history strategy for many animals and requires a suite of behavioural, morphological and physiological adaptations which together form the ‘migratory syndrome’. Genetic variation has been demonstrated for many traits that make up this syndrome, but the underlying genes involved remain elusive. Recent studies investigating migration-associated genes have focussed on sampling migratory and nonmigratory populations from different geographic locations but have seldom explored phenotypic variation in a migratory trait. Here, we use a novel combination of tethered flight and next-generation sequencing to determine transcriptomic differences associated with flight activity in a globally invasive moth pest, the cotton bollworm Helicoverpa armigera. By developing a state-of-the-art phenotyping platform, we show that field-collected H. armigera display continuous variation in flight performance with individuals capable of flying up to 40 km during a single night. Comparative transcriptomics of flight phenotypes drove a gene expression analysis to reveal a suite of expressed candidate genes which are clearly related to physiological adaptations required for long-distance flight. These include genes important to the mobilization of lipids as flight fuel, the development of flight muscle structure and the regulation of hormones that influence migratory physiology. We conclude that the ability to express this complex set of pathways underlines the remarkable flexibility of facultative insect migrants to respond to deteriorating conditions in the form of migratory flight and, more broadly, the results provide novel insights into the fundamental transcriptional changes required for migration in insects and other taxa.
Total distance flown by individual adult Helicoverpa armigera on the flight mills
Flight mill data used in all figures and REML analyses. These data are used in Fig. 1B, Fig. 1C and Fig. S2. Column headings and data are as follows. ID = a code combining the flight mill channel (A or B), the year flown ('13' represents 2013), the date flown (e.g. Nov13 = November 13th) and the mill flown (e.g. _Ch7 = mill #7). origin = origin of the insect population. sex = male (m) or female (f). Total distance flown = the total distance (metres) each individual flew during a single night, as recorded by the tethered flight mill.
flight_mill_total distance.csv
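As a rough illustration of how the columns described above might be read, the sketch below converts the recorded distances from metres to kilometres and averages them by sex. The sample rows, the exact header spellings and the ID values are assumptions made for the example; they are not taken from the file itself.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt standing in for "flight_mill_total distance.csv".
# Header spellings, IDs and distances are invented for illustration.
SAMPLE = """ID,origin,sex,Total distance flown
A13Nov13_Ch7,Dafeng,f,12500
B13Nov13_Ch2,Dafeng,m,800
A13Nov14_Ch1,Anyang,f,31000
"""

def mean_km_by_sex(handle):
    """Return the mean distance flown (km) for each sex code ('m' or 'f')."""
    distances = defaultdict(list)
    for row in csv.DictReader(handle):
        # Distances are recorded in metres; convert to kilometres.
        distances[row["sex"]].append(float(row["Total distance flown"]) / 1000.0)
    return {sex: sum(values) / len(values) for sex, values in distances.items()}

result = mean_km_by_sex(io.StringIO(SAMPLE))
```

To run this against the real file, the `io.StringIO(SAMPLE)` argument could be replaced with an open file handle, after checking that the header names match.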
RNAseq read count data from China Helicoverpa armigera
The data is a matrix of read counts generated from an RNAseq analysis of six samples of H. armigera from China. Each RNA sample was extracted from a pool of three whole individual insects. The first column contains the gene identification number. All other column headings represent the individual samples. AY = insects from Anyang. DF = insects from Dafeng. The data was used to determine differential gene expression using the open-source software packages edgeR and DESeq2.
china_read_count.matrix.csv
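The matrix layout described above (gene identifier in the first column, one column per sample) can be read and scaled to counts per million (CPM) roughly as follows. This is a minimal sketch: the gene IDs, sample names and counts are invented, and plain CPM is shown only to illustrate library-size scaling. edgeR's reported logCPM is computed differently in detail, so values will not match it exactly.

```python
import csv
import io

# Hypothetical excerpt standing in for china_read_count.matrix.csv.
# Gene IDs, sample names and counts are invented for illustration.
SAMPLE = """gene_id,AY_1,AY_2,AY_3,DF_1,DF_2,DF_3
gene0001,10,20,30,40,50,60
gene0002,90,80,70,60,50,40
gene0003,0,0,0,0,0,0
"""

def read_count_matrix(handle):
    """Return (sample_names, {gene_id: [counts...]}) from a CSV count matrix."""
    reader = csv.reader(handle)
    header = next(reader)
    samples = header[1:]  # the first column holds the gene identifier
    counts = {row[0]: [int(x) for x in row[1:]] for row in reader if row}
    return samples, counts

def counts_per_million(counts):
    """Scale each sample's counts by its library size (the column total)."""
    lib_sizes = [sum(column) for column in zip(*counts.values())]
    return {
        gene: [1e6 * c / n if n else 0.0 for c, n in zip(row, lib_sizes)]
        for gene, row in counts.items()
    }

samples, counts = read_count_matrix(io.StringIO(SAMPLE))
cpm = counts_per_million(counts)
```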
RNAseq read count data from Greek Helicoverpa armigera
The data is a matrix of read counts generated from an RNAseq analysis of six samples of H. armigera from Greece. Each RNA sample was extracted from a pool of three whole individual insects. The first column contains the gene identification number. All other column headings represent the individual samples. GR_S = short-distance flying insects from Northern Greece. GR_L = long-distance flying insects from Northern Greece. The data was used to determine differential gene expression using the open-source software packages edgeR and DESeq2.
greece_read_count.matrix.csv
RNA-seq analysis of flight phenotypes of H. armigera - DESeq2 full output
Output from the R package DESeq2 for the China and Greece H. armigera RNAseq experiments analysing differential expression between flight phenotypes. Gene ID = gene identification number. baseMean = average of the normalised count values. log2FoldChange = estimated log2 fold change between sample groups (for China the estimate is measured against AY; for Greece the estimate is measured against GR_S). lfcSE = standard error estimate for log2FoldChange. p-value = evidence for an effect of treatment on expression. padj = p-value adjusted for multiple testing. 'NA' in the p-value or padj column indicates that the gene was excluded from the analysis because it had no count data or contained an extreme outlier.
deseq2_full_output.xlsx
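The sketch below shows one way the log2FoldChange and padj columns described above might be interpreted. It assumes that "measured against AY" (or GR_S) means that group is the reference, so a positive value indicates higher expression in the other group (DF or GR_L). The rows and the 0.05 cut-off are invented for illustration, and 'NA' entries are represented here as None.

```python
# Hypothetical rows mimicking the described columns; all values are invented.
ROWS = [
    {"Gene ID": "gene0001", "log2FoldChange": 2.0, "padj": 0.001},
    {"Gene ID": "gene0002", "log2FoldChange": -1.0, "padj": 0.03},
    {"Gene ID": "gene0003", "log2FoldChange": 0.5, "padj": 0.40},
    {"Gene ID": "gene0004", "log2FoldChange": None, "padj": None},  # 'NA' in the file
]

ALPHA = 0.05  # assumed significance cut-off, not taken from the source

def summarise(rows, alpha=ALPHA):
    """Return (gene, linear fold change, direction) for significant genes."""
    significant = []
    for row in rows:
        lfc, padj = row["log2FoldChange"], row["padj"]
        if lfc is None or padj is None:
            continue  # gene was excluded from the analysis
        if padj >= alpha:
            continue
        # Direction is relative to the reference group (AY or GR_S).
        direction = "up" if lfc > 0 else "down"
        significant.append((row["Gene ID"], 2.0 ** lfc, direction))
    return significant

significant = summarise(ROWS)
```

Here a log2 fold change of 2.0 corresponds to a fourfold increase relative to the reference group, and -1.0 to a halving.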
RNAseq analysis of H. armigera flight phenotypes - edgeR full output
Output from the R package edgeR for the China and Greece H. armigera RNAseq experiments analysing differential expression between flight phenotypes. Gene ID = gene identification number. logFC = estimated log2 fold change between sample groups (for China the estimate is measured against AY; for Greece the estimate is measured against GR_S). logCPM = log2 counts per million. p-value = evidence for an effect of treatment on expression. FDR = p-value adjusted for multiple testing (false discovery rate). Note that the number of rows does not equal the total number of estimated genes (17001). This is because genes with fewer than 5 read counts across all samples were removed from the analysis.
edgeR_full_output.xlsx
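The low-count filter mentioned above can be sketched as follows. The rule is interpreted here as "drop a gene when its counts summed over all samples are below 5"; that reading, along with the gene IDs and counts, is an assumption made for the example.

```python
# Hypothetical counts for six samples per gene; all values are invented.
COUNTS = {
    "gene0001": [10, 20, 30, 40, 50, 60],
    "gene0002": [1, 0, 2, 0, 1, 0],  # total of 4, so removed
    "gene0003": [0, 0, 0, 0, 0, 0],  # no reads, so removed
    "gene0004": [0, 5, 0, 0, 0, 0],  # total of 5, so kept
}

MIN_TOTAL = 5  # threshold stated in the file description

def filter_low_counts(counts, min_total=MIN_TOTAL):
    """Keep genes whose read counts summed over all samples reach min_total."""
    return {gene: row for gene, row in counts.items() if sum(row) >= min_total}

kept = filter_low_counts(COUNTS)
```

A filter of this kind explains why the output tables contain fewer rows than the 17001 estimated genes.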
Identified GO terms for Helicoverpa armigera genes
The file provides a list of GO terms used in a functional enrichment analysis for genes involved in flight activity in Helicoverpa armigera. The list was generated using Blast2GO. A total of 11316 genes (out of 17001) have identified GO terms.
go_terms.txt
Raw qPCR data for RNA-seq validation
The data file consists of two Excel sheets. The first sheet, labelled 'ct', contains all raw Ct values that informed the qPCR analysis. Each gene column is headed by the ID of that gene. Information on qPCR primers and efficiency can be found in the Supplementary information. 'Treatment' = phenotype tested. 'Biorep' = biological replicate. 'Techrep' = technical replicate. The second sheet ('validation') contains the fold-change comparisons that informed the RNA-seq validation analysis. Validations were performed for each software package (edgeR and DESeq2). Comparisons were made only where the gene was significantly differentially expressed in the RNA-seq analysis.
jones_mol_ecol_qpcr_ct.xlsx
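Given the 'Treatment', 'Biorep' and 'Techrep' columns described above, a common first step with raw Ct values is to average the technical replicates within each biological replicate. The sketch below illustrates only that step; it is not necessarily the procedure the authors followed, and the treatment labels, gene ID and Ct values are invented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows from the 'ct' sheet; the gene ID and values are invented.
ROWS = [
    {"Treatment": "short", "Biorep": 1, "Techrep": 1, "gene0001": 24.0},
    {"Treatment": "short", "Biorep": 1, "Techrep": 2, "gene0001": 24.4},
    {"Treatment": "short", "Biorep": 2, "Techrep": 1, "gene0001": 25.0},
    {"Treatment": "short", "Biorep": 2, "Techrep": 2, "gene0001": 25.0},
    {"Treatment": "long", "Biorep": 1, "Techrep": 1, "gene0001": 22.0},
    {"Treatment": "long", "Biorep": 1, "Techrep": 2, "gene0001": 22.2},
]

def mean_ct_per_biorep(rows, gene):
    """Average technical replicates for each (treatment, biological replicate)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["Treatment"], row["Biorep"])].append(row[gene])
    return {key: mean(values) for key, values in groups.items()}

biorep_ct = mean_ct_per_biorep(ROWS, "gene0001")
```

Any subsequent fold-change calculation would need the primer efficiencies given in the Supplementary information.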