Phylogeny of Rhus gall aphids (Hemiptera: Aphididae) reveals an earlier origin than their primary host plant
Data files
Mar 21, 2025 version files 57.50 MB
- mtDNA_alignment.nex (559.86 KB)
- mtDNA_BItree.tre (22.50 KB)
- mtDNA_MLtree.tre (7.59 KB)
- Plant_alignment.nex (960.38 KB)
- Plant_BItree.tre (8.13 KB)
- Plant_dating.tre (928 B)
- Plant_MLtree.treefile (508 B)
- README.md (2.91 KB)
- SCN_alignment.nex (55.91 MB)
- SCN_genetree_dating.tre (6.52 KB)
- SCN_MLtree.contree (6.16 KB)
- SCN_speciestree_dating.tre (3.10 KB)
- SCN_speciestree.tre (5.88 KB)
Abstract
Rhus gall aphids (Hemiptera: Aphididae: Eriosomatinae: Fordini) use only Rhus species (Anacardiaceae) as their primary host plants, and each aphid species lives specifically on only one or two sister host plants, with the same disjunct distribution pattern between Eastern Asia and Eastern North America. We assembled complete mitochondrial genomes and universal single-copy nuclear genes for Rhus gall aphids using a genome skimming method. The resulting phylogenies strongly supported the monophyly of the Rhus gall aphids and of two genera, Floraphis and Melaphis. However, the relationships among genera were inconsistent across datasets. The cpDNA analysis of the host plant Rhus species strongly supported the monophyly of Rhus and the relationships among its species. Cophylogeny analysis indicated that Rhus gall aphids originated earlier than their host plants. However, the divergence times and relationships of some Rhus gall aphid species, particularly those that diverged later, were consistent with the origin of their corresponding primary host plants. This may suggest that the Rhus gall aphids established their initial association with ancestors shared by Rhus and its related groups, or acquired their current Rhus hosts through host switching. The divergence time estimates imply that the separation of North America and Eurasia within the Laurasia supercontinent and the disappearance of the Bering Land Bridge, respectively, played an important role in the divergence of the eastern North American Melaphis and an East Asian lineage. Our results provide new insights into the coevolution of insects and their host plants.
https://doi.org/10.5061/dryad.sxksn03c2
Description of the data and file structure
The dataset includes the original alignments, tree files, and divergence time estimation results based on the aphid mitochondrial genome, single-copy nuclear genes, and host plant chloroplast genome for phylogenetic analysis.
The files and their respective details are listed below.
a) Plant_BItree in TRE format, the original tree file from Bayesian Inference (BI) analysis using MrBayes v3.2.7 for 69 chloroplast protein-coding genes of 17 species of Rhus and its closely related species.
b) Plant_dating in TRE format, the original file for divergence time estimation using the mcmctree program in PAML for 69 chloroplast protein-coding genes of 17 species of Rhus and its closely related species.
c) Plant_MLtree in TREEFILE format, the original tree file from Maximum Likelihood (ML) analysis using RAxML 8.2.12 for 69 chloroplast protein-coding genes of 17 species of Rhus and its closely related species.
d) SCN_genetree_dating in TRE format, the original file for divergence time estimation using the mcmctree program in PAML after constructing a tree with SCN_alignment using RAxML 8.2.12.
e) SCN_MLtree in CONTREE format, the original tree file constructed using SCN_alignment with RAxML 8.2.12.
f) Plant_alignment in NEX format, containing a concatenated alignment of 69 chloroplast protein-coding genes for 17 species of Rhus and its closely related species, with detailed parameters from the MrBayes v3.2.7 run at the end of the file.
g) SCN_speciestree in TRE format, the original file for the species tree constructed from amino acids of single-copy nuclear genes for 44 aphids using ASTRAL-III software.
h) SCN_speciestree_dating in TRE format, the original file for divergence time estimation based on the SCN_speciestree.tre file, using the mcmctree program in PAML.
i) SCN_alignment in NEX format, containing a concatenated alignment of amino acids for single-copy nuclear genes from 44 aphids.
j) mtDNA_alignment in NEX format, containing a concatenated alignment of 13 mitochondrial protein-coding genes and 2 rRNA genes for 44 species of Rhus gall aphids and outgroups, with detailed parameters from the MrBayes v3.2.7 run at the end of the file.
k) mtDNA_BItree in TRE format, the original tree file from BI analysis using MrBayes v3.2.7 for 13 mitochondrial protein-coding genes and 2 rRNA genes of 44 species of Rhus gall aphids and outgroup species.
l) mtDNA_MLtree in TRE format, the original tree file from ML analysis using RAxML 8.2.12 for 13 mitochondrial protein-coding genes and 2 rRNA genes of 44 species of Rhus gall aphids and outgroup species.
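The file descriptions above state how many taxa each tree should contain (17 for the plant trees, 44 for the aphid trees). As a quick sanity check, the tip labels of a tree can be counted with the Python standard library alone. This is a minimal sketch, not part of the deposited dataset: the function name `newick_tip_labels` and the example labels are hypothetical, and it assumes the tree is available as a plain Newick string without quoted labels. Tree files that are wrapped in a NEXUS block or use a translate table would need a dedicated parser.

```python
import re
from typing import List


def newick_tip_labels(newick: str) -> List[str]:
    """Return the tip labels of a plain Newick string, left to right.

    Assumption: labels are unquoted and contain no whitespace,
    parentheses, commas, colons, or semicolons.
    """
    # Remove bracketed annotations such as [&95%HPD={0.1,0.2}].
    text = re.sub(r"\[[^\]]*\]", "", newick)
    # A tip label directly follows "(" or ","; internal node labels
    # and support values follow ")" and are therefore not matched.
    return re.findall(r"[(,]\s*([^\s(),:;]+)", text)


# Hypothetical four-taxon tree with support values on internal nodes.
example = "((Aphid_A:0.1,Aphid_B:0.2)95:0.3,(Aphid_C:0.1,Aphid_D:0.1)0.99:0.2);"
tips = newick_tip_labels(example)
print(len(tips), tips)
```

Applied to a Newick string taken from one of the tree files, `len(newick_tip_labels(...))` can be compared against the taxon count given in the corresponding description.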
We collected 45 samples representing 14 Rhus gall aphid species, three subspecies, and some closely related aphids. We extracted genomic DNA from three to five individuals per gall sample with the TIANamp Genomic DNA Kit (TIANGEN), following the manufacturer’s instructions. The genomic DNA of all samples was sent to the Genomic Sequencing and Analysis Co. (Shanghai, China) for library construction and sequencing using the shotgun genome skimming method on an Illumina HiSeq 4000 platform. Paired-end reads of 2×150 bp were generated with an insert size of 400 bp. After low-quality and adapter-contaminated reads were filtered out with Trimmomatic v. 0.35, the high-quality clean reads were assembled de novo with SPAdes v. 3.10.1 using k-mer sizes of 21, 33, 55, 77, 99 and 127. We used the mapping method in Geneious 10.2.4 (Kearse et al. 2012) to obtain the complete mitochondrial genome from the contigs, and finalized each mitogenome sequence by aligning and comparing it with the available consensus sequences and checking it manually. We captured single-copy nuclear genes directly from the contigs assembled by SPAdes v. 3.10.1, using the OrthoDB database for Hemiptera with default parameters. We then aligned the captured genes with MAFFT.
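To make the assembly step easier to reproduce, the sketch below builds a SPAdes command line with the k-mer sizes reported above. It is an illustration under stated assumptions, not the exact command used for this dataset: the read and output paths are hypothetical placeholders, the helper names are invented for this example, and only the widely documented `-1`, `-2`, `-k` and `-o` options of `spades.py` are used. The command is only constructed, not executed.

```python
from typing import List, Sequence

# k-mer sizes reported in the methods description.
KMERS = (21, 33, 55, 77, 99, 127)


def validate_kmers(kmers: Sequence[int]) -> None:
    """Check the constraints SPAdes places on k-mer sizes:
    odd, less than 128, and listed in ascending order."""
    for k in kmers:
        if k % 2 == 0 or k >= 128:
            raise ValueError(f"invalid k-mer size: {k}")
    if list(kmers) != sorted(set(kmers)):
        raise ValueError("k-mer sizes must be unique and ascending")


def spades_command(reads_1: str, reads_2: str, outdir: str,
                   kmers: Sequence[int] = KMERS) -> List[str]:
    """Build (but do not run) a paired-end SPAdes command."""
    validate_kmers(kmers)
    return [
        "spades.py",
        "-1", reads_1,            # forward clean reads (hypothetical path)
        "-2", reads_2,            # reverse clean reads (hypothetical path)
        "-k", ",".join(str(k) for k in kmers),
        "-o", outdir,             # output directory (hypothetical path)
    ]


cmd = spades_command("sample_1.clean.fq.gz", "sample_2.clean.fq.gz", "sample_spades")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where SPAdes is installed. Validating the k-mer list first catches even or out-of-range values before a long assembly job is started.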