A revised classification of the assassin bugs (Hemiptera: Heteroptera: Reduviidae) based on combined analysis of phylogenomic and morphological data
Data files
Jul 21, 2024 version files 362.10 MB
- ASTRAL_tree.txt (14.45 KB)
- Combined_molecular_data_partitions.prt (70.61 KB)
- COMBINED_NGS_SANGER.phy (360.80 MB)
- ML_tree_for_ASR.tre (11.72 KB)
- ML_tree.txt (12.33 KB)
- morph_matrix_for_ASR.csv (603.33 KB)
- NGS_SANGER_MORPH.best_models.nex (463.92 KB)
- README.md (5.99 KB)
- Reduviidae_morph_matrix.nex (68.62 KB)
- Reduviidae_morph_matrix.phy (28.22 KB)
- tip_names.csv (20.81 KB)
Abstract
Assassin bugs (Hemiptera: Reduviidae Latreille) comprise one of the largest radiations of predatory animals (22 subfamilies; >6,800 spp.), but also include the medically important kissing bugs (Triatominae Jeannel). Reduviidae are morphologically diverse, engage in an astounding array of predatory strategies, and have evolved some of the most unique anti-predator and stealth techniques in the animal kingdom. While significant progress has been made to reveal the evolutionary history of assassin bugs and revise their taxonomy, the non-monophyly of the second largest assassin bug subfamily, Reduviinae Latreille, remains to be addressed. Leveraging phylogenomic data (2,291 loci) and 112 morphological characters, we performed the first data- and taxon-rich (195 reduvioid taxa) combined phylogenetic analysis across Reduvioidea and reconstructed morphological diagnostic features for major lineages. We corroborated the rampant polyphyly of Reduviinae that demands substantial revisions to the subfamilial and tribal classification of assassin bugs. Our new classification for Reduviidae reduces the number of subfamilies to 19 and recognizes 40 tribes. We describe three new subfamilies to accommodate distantly related taxa previously classified as Reduviinae. Triatominae sensu nov. are expanded to include closely related predatory reduviine genera. Cetherinae Jeannel, Chryxinae Champion, Pseudocetherinae Villiers, Salyavatinae Amyot & Serville, and Sphaeridopinae Amyot & Serville are treated as junior synonyms of Reduviinae sensu nov. Epiroderinae Distant are synonymized with Phimophorinae Handlirsch sensu nov. and Bactrodini Stål stat. nov. are reclassified as a tribe of Harpactorinae Amyot & Serville. Psophidinae Distant is treated as a valid subfamily. This new classification represents a robust framework for future taxonomic and evolutionary research on assassin bugs.
https://doi.org/10.5061/dryad.d51c5b0bn
Description of the data and file structure
COMBINED_NGS_SANGER.phy
This is the PHYLIP formatted concatenated alignment of 2,291 loci (totaling 1,803,947 bp) used as input for phylogenetic analysis in our study. These sequences were derived from several sources including hybrid capture (Anchored Hybrid Enrichment - AHE), OrthoMCL mining of low-coverage whole genomes, and Sanger sequencing of 3 nuclear ribosomal loci (18S, 28S D2-3, 28S D3-5) and 2 mitochondrial genes (16S and CO1). The AHE and OrthoMCL portions of the concatenated alignments in this file (totaling 2,286 loci) were processed and first analyzed by Knyshov et al. 2023 (https://doi.org/10.1093/molbev/msad168). See Supplementary Table 1 of our paper for more information on how sequence data was obtained for each voucher specimen and a list of accession numbers for raw Sanger sequences available on GenBank. Note: some taxon names in this file may differ from those presented in the figures of our paper; tip_names.csv provides a key that links original taxon names in the matrix to the updated names presented in the figures.
NGS_SANGER_MORPH.best_models.nex
This NEXUS formatted file provides partition position information for the 2,291 molecular loci in our concatenated alignment (COMBINED_NGS_SANGER.phy) and our morphological dataset (Reduviidae_morph_matrix.phy), as well as the best-fitting substitution model for each partition (as determined using ModelFinder in IQ-TREE). This file was used to initiate phylogenetic reconstruction in IQ-TREE. This is the command we used to run the analysis: iqtree2 -s COMBINED_NGS_SANGER.phy -p NGS_SANGER_MORPH.best_models.nex -st DNA --alrt 1000 -B 1000 -allnni -pre iqtree -safe -T AUTO
Combined_molecular_data_partitions.prt
This file provides a map of the individual molecular partitions and their positions in our concatenated DNA alignment (COMBINED_NGS_SANGER.phy). 2,293 partitions are listed, rather than 2,291, because COI is divided into 1st, 2nd, and 3rd codon positions. This file can be used to split the concatenated matrix into individual alignment files per locus if necessary.
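Splitting the concatenated matrix by locus could be sketched as follows. This is a minimal illustration, not the procedure used in the study. It assumes that the .prt file uses RAxML-style lines (for example `DNA, locus1 = 1-500`, with an optional `\3` stride for codon positions) and that the alignment is in relaxed sequential PHYLIP (one `name sequence` pair per line). Both are assumptions to check against the actual files, and the function names and demo data are hypothetical.

```python
import re

# Assumed RAxML-style partition line, e.g. "DNA, locus1 = 1-500" or
# "DNA, COI_pos1 = 501-1100\3" (coordinates are 1-based and inclusive).
PART_RE = re.compile(
    r"^\s*(?:\w+\s*,\s*)?([^\s=,]+)\s*=\s*(\d+)\s*-\s*(\d+)(?:\\(\d+))?\s*;?\s*$"
)

def read_phylip(text):
    """Parse relaxed sequential PHYLIP: a header line, then 'name sequence' lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ntax, nchar = (int(x) for x in lines[0].split()[:2])
    seqs = {}
    for ln in lines[1:]:
        name, seq = ln.split(None, 1)
        seqs[name] = "".join(seq.split())
    if len(seqs) != ntax or any(len(s) != nchar for s in seqs.values()):
        raise ValueError("alignment does not match its PHYLIP header")
    return seqs

def parse_partitions(text):
    """Return (name, start, end, step) tuples; lines that do not match are skipped."""
    parts = []
    for ln in text.splitlines():
        m = PART_RE.match(ln)
        if m:
            name, start, end, step = m.groups()
            parts.append((name, int(start), int(end), int(step or 1)))
    return parts

def split_alignment(seqs, parts):
    """Slice every sequence by each partition (1-based inclusive -> Python slice)."""
    return {
        name: {taxon: seq[start - 1:end:step] for taxon, seq in seqs.items()}
        for name, start, end, step in parts
    }

# Tiny made-up example (not real data from the dataset).
demo_phy = "2 9\ntaxA ACGTACGTA\ntaxB ACGAACGTT\n"
demo_prt = "\n".join([
    "DNA, locus1 = 1-3",
    r"DNA, COI_pos1 = 4-9\3",
    r"DNA, COI_pos2 = 5-9\3",
])
loci = split_alignment(read_phylip(demo_phy), parse_partitions(demo_prt))
print(loci["locus1"]["taxA"], loci["COI_pos1"]["taxA"], loci["COI_pos2"]["taxB"])
```

For the real files, the text would be read with `open(...).read()`. The alignment is about 360 MB, so this in-memory approach needs a comparable amount of RAM.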
Reduviidae_morph_matrix.nex / Reduviidae_morph_matrix.phy
These are NEXUS and PHYLIP formatted versions of our morphological matrix consisting of 112 characters coded for 200 taxa. In total, 34 head, 20 thoracic, 16 leg, 11 forewing, and 25 abdominal characters were scored using Mesquite v3.70. Images of select characters and their states are provided in figures 2–5 of our paper. The .phy version was used in conjunction with the molecular concatenated alignment (COMBINED_NGS_SANGER.phy) to perform a combined morphological + molecular analysis in IQ-TREE. The NEXUS formatted version provides more detail and lists all of the character state codings for the 112 morphological characters that we surveyed. Note: some taxon names in this file may differ from those presented in the figures of our paper; tip_names.csv provides a key that links original taxon names in the matrix to the updated names presented in the figures.
tip_names.csv
This file provides a key that links original taxon names in the matrices/raw tree files to the updated names presented in the figures. The original name for each voucher used in our matrices is given in the first column. The updated generic-level names and species epithets for each voucher applied in the figures are listed in columns “genus” and “species”. Biogeographic regions (“locality” column) corresponding to each specimen are provided as AT=Afrotropical, AU=Australian, NE=Nearctic, NT=Neotropical, OT=Oriental, PA=Palearctic. Lab-specific unique specimen identifiers are listed in column “RCW”. N/A entries in the spreadsheet indicate information that is not applicable (these occur among non-reduviid outgroup taxa).
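Applying this key to a raw tree file could look like the sketch below. It is an illustration under stated assumptions: the original name is taken from the first CSV column regardless of its header, the new label is built as `genus_species` (a hypothetical convention, not necessarily the one used in the figures), rows whose genus and species are both N/A keep their original name, and tip labels in the Newick string are unquoted. The header name `original` and all demo rows are made up.

```python
import csv
import io
import re

def load_name_map(csv_text):
    """Map original tip name (first column) to an updated 'genus_species' label."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    g, s = header.index("genus"), header.index("species")
    mapping = {}
    for row in rows:
        if not row:
            continue
        # Skip empty or N/A name parts; fall back to the original name if none remain.
        parts = [p.strip() for p in (row[g], row[s]) if p.strip() not in ("", "N/A")]
        mapping[row[0]] = "_".join(parts) if parts else row[0]
    return mapping

def rename_tips(newick, mapping):
    """Replace tip labels only: tokens directly after '(' or ','.

    Internal node labels (support values) follow ')' and are left untouched.
    Assumes unquoted labels with no whitespace around them.
    """
    return re.sub(
        r"(?<=[(,])[^(),:;]+",
        lambda m: mapping.get(m.group(0), m.group(0)),
        newick,
    )

# Made-up rows and tree for illustration only.
demo_csv = (
    "original,genus,species,locality,RCW\n"
    "Red_sp_X1,Reduvius,personatus,PA,X1\n"
    "Tri_sp_X2,Triatoma,sp,NT,X2\n"
    "Outgroup_1,N/A,N/A,N/A,N/A\n"
)
demo_tree = "((Red_sp_X1:0.1,Tri_sp_X2:0.1)98.5/100:0.05,Outgroup_1:0.2);"
renamed = rename_tips(demo_tree, load_name_map(demo_csv))
print(renamed)
```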
ASTRAL_tree.txt
This is the raw species tree recovered from the ASTRAL analysis of the molecular dataset. Local posterior probability values are given at internal nodes. Note: some taxon names in this file may differ from those presented in the figures of our paper; tip_names.csv provides a key that links original taxon names in the matrix to the updated names presented in the figures.
ML_tree.txt
This is the best Maximum Likelihood tree (of eight independent runs) recovered from the IQ-TREE analysis of our combined morphological (Reduviidae_morph_matrix.phy) + molecular (COMBINED_NGS_SANGER.phy) datasets. SH-aLRT branch test and ultrafast bootstrap approximation (UFBoot) support values are given at internal nodes. Note: some taxon names in this file may differ from those presented in the figures of our paper; tip_names.csv provides a key that links original taxon names in the matrix to the updated names presented in the figures.
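The support values can be pulled out of the Newick string with a short script such as the one below. It is a sketch that assumes internal node labels are written as `SH-aLRT/UFBoot` (the order IQ-TREE normally uses when both tests are requested) and should be checked against the actual file. The default thresholds (SH-aLRT ≥ 80, UFBoot ≥ 95) follow the general guidance in the IQ-TREE documentation; they are not necessarily the cutoffs applied in the paper. The demo tree is made up.

```python
import re

# Internal node label assumed to look like ")98.5/100": SH-aLRT first, UFBoot second.
SUPPORT_RE = re.compile(r"\)(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)")

def node_supports(newick):
    """Return a list of (sh_alrt, ufboot) tuples, one per labelled internal node."""
    return [(float(a), float(b)) for a, b in SUPPORT_RE.findall(newick)]

def count_well_supported(newick, alrt_min=80.0, ufboot_min=95.0):
    """Count nodes meeting both thresholds (defaults are an assumption, see above)."""
    return sum(
        1 for alrt, ufb in node_supports(newick)
        if alrt >= alrt_min and ufb >= ufboot_min
    )

# Made-up tree for illustration only.
demo_tree = "((A:0.1,B:0.1)98.5/100:0.05,(C:0.1,D:0.1)72.3/91:0.05,E:0.2);"
supports = node_supports(demo_tree)
n_strong = count_well_supported(demo_tree)
print(supports, n_strong)
```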
ML_tree_for_ASR.tre
This file contains the same ML tree (ML_tree.txt) recovered using IQ-TREE, but with updated taxon names that reflect those used in the main figures and the ancestral state reconstructions. This file was used in conjunction with the morphological matrix file morph_matrix_for_ASR.csv to perform ancestral state reconstruction (ASR) with the ‘ace’ function of the R package ape.
morph_matrix_for_ASR.csv
This .csv file contains the morphological matrix as formatted for ancestral state reconstruction in R. The 200 taxa we coded are listed in the first column. The 112 discrete characters (starting with “Char_0…”) are listed in the first row. Each cell in the matrix gives all of the character states coded for that taxon and character. This file was used in conjunction with the tree file ML_tree_for_ASR.tre to perform ASR with the ‘ace’ function of the R package ape.
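Before running ASR, it can be useful to confirm that the tip labels in the tree and the taxon names in the matrix agree, since mismatched names are a common source of errors. The sketch below is one way to do that check in Python. It assumes unquoted Newick tip labels and taxon names in the first CSV column; the column headers and demo data are placeholders, not values from the dataset.

```python
import csv
import io
import re

def tree_tips(newick):
    """Collect tip labels: tokens directly after '(' or ',' (unquoted labels assumed)."""
    return set(re.findall(r"(?<=[(,])[^(),:;]+", newick))

def matrix_taxa(csv_text):
    """Collect taxon names from the first column, skipping the header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return {row[0] for row in rows[1:] if row}

def compare_names(newick, csv_text):
    """Return (names only in the tree, names only in the matrix), both sorted."""
    tips, taxa = tree_tips(newick), matrix_taxa(csv_text)
    return sorted(tips - taxa), sorted(taxa - tips)

# Made-up inputs for illustration only.
demo_tree = "((A:1,B:1):1,C:1);"
demo_csv = "taxon,trait_1,trait_2\nA,0,1\nB,1,0\nD,0,0\n"
only_in_tree, only_in_matrix = compare_names(demo_tree, demo_csv)
print(only_in_tree, only_in_matrix)
```

Two empty lists indicate that the tree and matrix cover the same set of taxa.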
Please refer to the Material and Methods section of our paper for details regarding construction of the morphological matrix, molecular data generation and processing, phylogenetic analyses, and ancestral state reconstruction.