Data from: Phylogenomics recovers multiple origins of portable case-making in caddisflies (Insecta: Trichoptera), nature’s underwater architects
Data files
Jun 07, 2024 version files 117.59 GB
Abstract
Caddisflies (Trichoptera) are among the most diverse groups of freshwater animals with more than 16,000 described species. They play a fundamental role in freshwater ecology and environmental engineering in streams, rivers, and lakes. Because of this, they are frequently used as indicator organisms in biomonitoring programs. Despite their importance, key questions concerning the evolutionary history of caddisflies, such as the timing and origin of larval case-making, remain unanswered due to the lack of a well-resolved phylogeny. Here, we estimated a phylogenetic tree using a combination of transcriptomes and targeted enrichment data for 207 species, representing 48 of 52 extant families and 174 genera. We calibrated and dated the tree with 33 carefully selected fossils. The first caddisflies originated ~295 million years ago in the Permian, and major suborders began to diversify in the Triassic. Further, we show that portable case-making evolved in three separate lineages, and shifts in diversification occurred in concert with key evolutionary innovations beyond case-making.
README: Phylogenomics recovers multiple origins of portable case-making in caddisflies (Insecta: Trichoptera), nature’s underwater architects
https://doi.org/10.5061/dryad.q2bvq83sk
This repository houses supporting data for "Phylogenomics recovers multiple origins of portable case-making in caddisflies (Insecta: Trichoptera), nature's underwater architects."
Description of the data and file structure
Note: N/As in Table S1 indicate that the data are not uploaded to that particular repository and blank cells indicate that the data do not have accession numbers for those repositories. The N/A in Table S4 indicates that the node was not recovered in that particular analysis.
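The distinction between "N/A" and a blank cell in Table S1 is easy to lose when the spreadsheet is loaded programmatically, so a minimal sketch of keeping the two apart may help. The function name and the three category labels below are illustrative assumptions, not part of the dataset, and the accession string in the usage line is a made-up placeholder.

```python
def classify_accession_cell(value):
    """Classify one Table S1 repository cell.

    Category labels are hypothetical names chosen for this sketch:
      "not_uploaded" -> cell holds the literal text "N/A"
      "no_accession" -> cell is blank (None or empty string)
      "accession"    -> anything else, assumed to be an accession number
    """
    if value is None:
        return "no_accession"
    text = str(value).strip()
    if text == "":
        return "no_accession"
    if text.upper() == "N/A":
        return "not_uploaded"
    return "accession"


# Hypothetical cell values, for illustration only.
for cell in ["N/A", "", None, "PLACEHOLDER0001"]:
    print(repr(cell), "->", classify_accession_cell(cell))
```

If the table is read with pandas instead, note that `read_excel` treats the string "N/A" as a missing value by default; passing `keep_default_na=False` should preserve the difference between "N/A" and blank cells.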
The contents of this repository are as follows:
Supplementary_tables.xlsx
contains Tables S1-S12
S1: Taxon list
S2: Taxon stats
S3: Contamination removal
S4: Support values for key branches
S5: Taxon sets for four-cluster likelihood analysis
S6: Fossil calibration info
S7: Estimated ages of key clades
S8: Probabilities of ancestral states for key nodes
S9: Probabilities of ancestral states for all nodes in the transcriptome-only analysis
S10: Probabilities of ancestral states for all nodes in the combined analysis
S11: Clade-specific sampling proportions for BAMM analysis
Supplementary_archive_1.tar.gz
contains the raw sequence data (in merged and unmerged FASTQ files) for the targeted enrichment runs outlined in Table S1. There is a separate folder for each species containing raw reads in FASTQ files. Files that include an "M" contain merged paired-end reads, and FASTQ files that include a "U1" or "U2" contain unmerged forward (U1) and reverse (U2) paired-end reads. See the README file within the supplemental folder to view species' sequencing codes.
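The M / U1 / U2 naming convention described above can be used to sort the FASTQ files by read type. The sketch below assumes that the tag appears as its own token in the file name, separated by "_", "-", or "."; the README does not state the exact naming pattern, so the regular expression and the example file names are assumptions that may need adjusting after inspecting the archive.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: the tag is a standalone token delimited by "_", "-", or ".".
TAG_PATTERN = re.compile(r"(?:^|[_.\-])(M|U1|U2)(?=[_.\-]|$)")

READ_TYPES = {
    "M": "merged",             # merged paired-end reads
    "U1": "unmerged_forward",  # unmerged forward reads
    "U2": "unmerged_reverse",  # unmerged reverse reads
}


def classify_fastq(filename):
    """Return the read type for a FASTQ file name, or None if no tag is found."""
    match = TAG_PATTERN.search(Path(filename).name)
    return READ_TYPES[match.group(1)] if match else None


def group_by_read_type(filenames):
    """Group file names by read type; untagged files go under 'unknown'."""
    groups = defaultdict(list)
    for name in filenames:
        groups[classify_fastq(name) or "unknown"].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
example = ["sample01_M.fastq.gz", "sample01_U1.fastq.gz", "sample01_U2.fastq.gz"]
print(group_by_read_type(example))
```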
Supplementary_archive_2.zip
contains the alignments, gene boundaries, and models used for the transcriptome-only and combined analyses.
Explanation of files:
Transcriptome only:
- FcC_smatrix.transcriptome_only.fas: Alignment of transcriptome-only dataset
- transcriptome_gene_boundaries.txt: gene boundaries of transcriptome-only dataset
- MFP_transcriptome_only.best_scheme.nex: models and partitioning scheme for transcriptome-only dataset
Combined dataset:
- FcC_smatrix_combined.fas: Alignment of combined dataset
- combined_gene_boundaries.txt: gene boundaries of combined dataset
- MFP_merge.combined.best_scheme.nex: models and partitioning scheme for the combined dataset
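As a quick sanity check after unpacking the archive, the `.fas` alignments can be summarized without extra dependencies. The sketch below assumes the files are standard FASTA alignments (one ">" header line per taxon, sequence possibly wrapped over several lines); the function names are illustrative, and the format of the gene-boundary files is not covered because it is not described here.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dict.

    `lines` can be an open file handle or any iterable of strings.
    """
    sequences = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            name = line[1:].strip()
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def summarize_alignment(sequences):
    """Return taxon count and alignment length; fail if lengths differ."""
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) > 1:
        raise ValueError("sequences differ in length; not a valid alignment")
    return {"n_taxa": len(sequences), "n_sites": lengths.pop() if lengths else 0}


# Usage, assuming the archive has been extracted to the working directory:
# with open("FcC_smatrix_combined.fas") as handle:
#     print(summarize_alignment(parse_fasta(handle)))
```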
Supplementary_archive_3.zip
contains the trees resulting from the ASTRAL-III analysis, IQ-TREE analyses, and dating analyses.
Explanation of files:
- transcriptome_gene_trees: folder with individual transcriptome gene trees
- astral.tre: ASTRAL-III species tree from transcriptome gene trees
- CAUCHY_ReducedP_renamed.tree: MCMCtree-generated combined dated phylogeny with Cauchy priors
- UNIFORM_ReducedP_renamed.tree: MCMCtree-generated combined dated phylogeny with uniform priors
- aa_MFP_merge_combined.best.tre: best ML tree from combined analysis (used as the topological constraint for MCMCtree)
- mfp_transcriptome.best.tre: best ML tree from transcriptome-only analysis
- treepl_misse: folder containing the treePL dated trees for the 25 best trees from the maximum likelihood analyses of the combined data, along with the MiSSE results for each tree
Supplementary_archive_4.zip
contains the resulting files from the Four-cluster Likelihood Mapping (FcLM) analyses