The loss of flight ability has occurred thousands of times independently during insect evolution. Flight loss may be linked to higher molecular evolutionary rates because of reductions in effective population sizes (Ne) and relaxed selective constraints. Reduced dispersal ability increases population subdivision, may decrease geographical range size and increases (sub)population extinction risk, thus leading to an expected reduction in Ne. Additionally, flight loss in birds has been linked to higher molecular rates of energy-related genes, probably owing to relaxed selective constraints on energy metabolism. We tested for an association between insect flight loss and molecular rates through comparative analysis in 49 phylogenetically independent transitions spanning multiple taxa, including moths, flies, beetles, mayflies, stick insects, stoneflies, scorpionflies and caddisflies, using available nuclear and mitochondrial protein-coding DNA sequences. We estimated the rate of molecular evolution of flightless (FL) and related flight-capable lineages by ratios of non-synonymous-to-synonymous substitutions (dN/dS) and overall substitution rates (OSRs). Across multiple instances of flight loss, we show a significant pattern of higher dN/dS ratios and OSRs in FL lineages in mitochondrial but not nuclear genes. These patterns may be explained by relaxed selective constraints in FL ectotherms relating to energy metabolism, possibly in combination with reduced Ne.
1. Whole-tree dN/dS and overall substitution rates analysis
This folder contains the input and output files used in and obtained from the program PAML (Yang 2007) for the whole-tree analysis of dN/dS ratios (program component codeml) and overall substitution rates (program component baseml). This folder is organised into subfolders by source study name and then by gene. The numbers at the beginning of the folder names have no significance and are used only for organisational purposes. Note that both analysis types (codeml and baseml) use the same tree and nucleotide files. Each gene folder contains 6 files: 2 control files (1 codeml, 1 baseml - “.ctl”), 1 nucleotide file (“.nuc”), 1 tree file (“.trees”), and 2 output files (1 codeml, 1 baseml).
1. Whole-tree dNdS and Rates.zip
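The per-gene file layout described above can be checked with a short script. This is a minimal sketch, assuming the archive has been extracted so that gene folders sit two levels below the root (source study, then gene). The function names check_gene_folder and check_archive are hypothetical, and because the extension of the PAML output files is not stated here, the sketch treats every file that is not “.ctl”, “.nuc” or “.trees” as an output file.

```python
from __future__ import annotations

from collections import Counter
from pathlib import Path

# Expected input files per gene folder, as described in this section:
# 2 control files, 1 nucleotide file, 1 tree file.
EXPECTED_INPUTS = {".ctl": 2, ".nuc": 1, ".trees": 1}
# Assumption: any file with another (or no) extension is an output file.
EXPECTED_OUTPUTS = 2


def check_gene_folder(folder: Path) -> list[str]:
    """Return a list of human-readable problems found in one gene folder."""
    files = [p for p in folder.iterdir() if p.is_file()]
    counts = Counter(p.suffix.lower() for p in files)
    problems = []
    for ext, expected in EXPECTED_INPUTS.items():
        if counts[ext] != expected:
            problems.append(
                f"expected {expected} '{ext}' file(s), found {counts[ext]}"
            )
    n_outputs = sum(n for ext, n in counts.items() if ext not in EXPECTED_INPUTS)
    if n_outputs != EXPECTED_OUTPUTS:
        problems.append(
            f"expected {EXPECTED_OUTPUTS} output file(s), found {n_outputs}"
        )
    return problems


def check_archive(root: Path) -> dict[str, list[str]]:
    """Check every <study>/<gene> folder under root.

    Returns a mapping from relative folder path to its list of problems;
    folders without problems are omitted.
    """
    report = {}
    for gene_dir in sorted(root.glob("*/*")):
        if gene_dir.is_dir():
            problems = check_gene_folder(gene_dir)
            if problems:
                report[str(gene_dir.relative_to(root))] = problems
    return report
```

Calling check_archive(Path("extracted_folder")) on the extracted archive returns an empty dictionary when every gene folder matches the description. The path shown is a placeholder.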
2. dN/dS ratio sister-clades analysis
This folder contains the input and output files used in and obtained from the program PAML (Yang 2007) for the sister-clade analysis of dN/dS ratios (program component codeml). This folder is organised into subfolders by source study name and then by gene. The numbers at the beginning of the folder names do not have significance and are only used for organisational purposes. Each gene folder contains 4 files: 1 control file (“.ctl”), 1 nucleotide file (“.nuc”), 1 tree file (“.trees”), and 1 output file.
2. dNdS sister clades.zip
3. Relative rates sister-clades analysis
This folder contains the input and output files used in and obtained from the program Phyltest (Phyltest v.2.0, http://www.kumarlab.net/publications) for the sister-clade relative rates analysis. This folder is organised into subfolders by source study name and then by gene. The numbers at the beginning of the folder names have no significance and are used only for organisational purposes. Transitions are identified by the set# in the file name (e.g. “Set1” or “S1” = transition #1, transitions as described in the Supplementary Material). All transitions were analysed with separate input and output files, except for the transitions within the source study folder Cunha et al. 2011, where all transitions for a given gene were run from the same file. Within the input files, the following abbreviations are used: FL = flightless, F = flight-capable, OG = outgroup. The number after each of these identifiers signifies the transition. Each gene folder (except where just described) contains 2 files per transition: 1 input file (nucleotide - “.dat”) and 1 output file.
3. Relative rates sister clades.zip
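The naming convention described above (FL, F and OG followed by a transition number, and “Set1” or “S1” in file names) can be read programmatically. The sketch below is illustrative only: it assumes labels are written as the identifier immediately followed by digits (e.g. FL1, F1, OG1) and that the set identifier in a file name is not directly preceded by a letter. The exact spelling in the “.dat” files may differ, and the names parse_label and transition_from_filename are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Abbreviations as defined in this section.
STATE_NAMES = {"FL": "flightless", "F": "flight-capable", "OG": "outgroup"}

# Assumed label shape: identifier immediately followed by the transition number.
LABEL_RE = re.compile(r"^(FL|F|OG)(\d+)$")

# Assumed file-name shape: "Set<n>" or "S<n>", not directly preceded by a letter.
SET_RE = re.compile(r"(?<![A-Za-z])S(?:et)?(\d+)", re.IGNORECASE)


class Label(NamedTuple):
    state: str       # "flightless", "flight-capable" or "outgroup"
    transition: int  # transition number following the identifier


def parse_label(label: str) -> Optional[Label]:
    """Split a label such as 'FL1' into its state and transition number.

    Returns None if the label does not follow the assumed convention.
    """
    match = LABEL_RE.match(label.strip())
    if match is None:
        return None
    return Label(STATE_NAMES[match.group(1)], int(match.group(2)))


def transition_from_filename(name: str) -> Optional[int]:
    """Extract the transition number from a file name, or None if absent."""
    match = SET_RE.search(name)
    return int(match.group(1)) if match else None
```

For example, parse_label("FL3") returns Label(state='flightless', transition=3), and a label that does not match the assumed pattern returns None, so ordinary taxon names pass through without raising an error.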