UCE phylogenomics illuminate the evolutionary history and biogeography of Dorymyrmex pyramid ants
Data files
Apr 30, 2025 version files 260.15 MB
- BioGeoBEARS-input-78taxa-Dory-chronogram.tre (3.69 KB)
- BioGeoBEARS-input-Ecoregion-matrix.txt (2.83 KB)
- ExaBayes-input-LFD-185t-870loci-partitionmodels.best_scheme (40.64 KB)
- ExaBayes-input-LFD-185t-CONFIG_exb_11.conf (5.40 KB)
- ExaBayes-output-LFD-185t_870loci-122parts_Mar26names_PHYLO.contree (18.08 KB)
- ExaBayes-output-LFD-185t-870loci-122parts-STDOUT.txt (141.05 KB)
- IQtree-output_D-Cono-150t_134partitions_IQtreeModelFinderModels_MaxLikelihood.contree (15.12 KB)
- IQtree-output_LFD-185t-870loci-122parts-IQT-Mar26names_MaxLikelihood.contree (18.16 KB)
- MATRIX-D-Cono-150t-part_134parts_IQtreeModelFinder.best_scheme.nex (57.32 KB)
- MATRIX-D-Cono-150t-part_134parts_IQtreeSymtestGood-spruced-up.phy (131.23 MB)
- MATRIX-LFD-185t-part_870loci-122partitions-IQtreeModelFinder.best_model.nex (57.22 KB)
- MATRIX-LFD-185t-part_870loci-IQtreeSymtestGood-spruced-up.phy (128.53 MB)
- MCMCtree-input-2calibs-x100-Mar23names.tre (5.10 KB)
- MCMCtree-input-CONFIG-rgene-sigma2-scaled10.ctl (2.83 KB)
- MCMCtree-output-185taxa-chronogram.tre (14.30 KB)
- README.md (4.60 KB)
Abstract
Latitudinal diversity gradients are one of the most widely discussed patterns in global biogeography, generally in the context of high diversity in tropical regions. In contrast, "amphitropical" or "inverse" distributions, once thought to be unusual, are increasingly recognized as common among many hymenopteran insects. One such group is the ant genus Dorymyrmex, which specializes in arid habitats throughout the Americas. To evaluate when and how Dorymyrmex acquired its present-day distribution, I sequenced partial genomes of 167 Dorymyrmex representing 69 species by targeting ultra-conserved elements (UCEs). A matrix of 870 genetic loci was used to infer maximum likelihood and Bayesian phylogenies, estimate divergence dates, and reconstruct hypothesized ancestral areas. These new analyses reveal that Dorymyrmex comprises four species groups: the D. flavescens, D. tener, D. wolffhuegeli, and D. pyramicus groups. The D. pyramicus group likely dispersed from South America to North America only once, via Central America. As in many other Hymenoptera, this dispersal occurred before the traditional closure date of the Isthmus of Panama, corroborating and extending the results of previous studies. Finally, I discuss life history strategies of Dorymyrmex that may have contributed to the geographic and genetic radiation of the D. pyramicus group, detail significant insights into Dorymyrmex morphology and classical taxonomy with new comparative illustrations, and provide recommendations for future work.
https://doi.org/10.5061/dryad.d7wm37q9c
Description of the data and file structure
Targeted genomic sequence data (UCEs) were assembled into contigs in SPAdes and compiled into matrices in phyluce. Each matrix was aligned in MAFFT, then downsampled for computational efficiency using IQ-TREE --symtest-remove-bad, which removes loci that violate evolutionary model assumptions. Matrices were cleaned in the program spruceup to trim out poor-quality sequence regions, then partitioned using the SWSC-EN algorithm. Partitions were merged in IQ-TREE using ModelFinder -m TESTMERGEONLY. The resulting matrices are LFD-185t-part and D-Cono-150t-part.
Maximum likelihood phylogenies were inferred in IQ-TREE. Bayesian phylogenies were inferred in ExaBayes on the CIPRES Science Gateway. Divergence times were estimated in MCMCtree and visualized in MCMCtreeR. Biogeographic history (ancestral areas) was estimated in BioGeoBEARS.
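As a quick sanity check after download, the taxon count in a matrix file name (185t, 150t) can be compared with the header of the .phy file. The sketch below is not part of the deposited workflow. It assumes the matrices use the common PHYLIP layout, in which the first line holds the number of taxa followed by the number of sites; the function names are hypothetical.

```python
from pathlib import Path


def parse_phylip_header(first_line):
    """Return (n_taxa, n_sites) from a PHYLIP header line such as '185 12345'."""
    fields = first_line.split()
    if len(fields) < 2:
        raise ValueError(f"not a PHYLIP header: {first_line!r}")
    return int(fields[0]), int(fields[1])


def read_phylip_header(path):
    """Read only the first line of the file, so large matrices stay cheap to check."""
    with open(path, "r", encoding="utf-8") as handle:
        return parse_phylip_header(handle.readline())


if __name__ == "__main__":
    # File name taken from this dataset; the check is skipped if it is not present.
    matrix = Path("MATRIX-LFD-185t-part_870loci-IQtreeSymtestGood-spruced-up.phy")
    if matrix.exists():
        n_taxa, n_sites = read_phylip_header(matrix)
        print(f"{matrix.name}: {n_taxa} taxa, {n_sites} sites")
        print("taxon count matches file name:", n_taxa == 185)
```

Because only the header line is read, the check runs quickly even on the two matrices that exceed 100 MB.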
Files and variables
File: MATRIX-D-Cono-150t-part_134parts_IQtreeSymtestGood-spruced-up.phy
Description: Multiple sequence alignment for the 150-taxon ingroup "Conomyrma", with partitions split using SWSC-EN and merged with IQ-TREE ModelFinder
File: MATRIX-LFD-185t-part_870loci-IQtreeSymtestGood-spruced-up.phy
Description: Multiple sequence alignment for the 185-taxon total clade "Leptomyrmex+Forelius+Dorymyrmex", with partitions split using SWSC-EN and merged with IQ-TREE ModelFinder
File: BioGeoBEARS-input-78taxa-Dory-chronogram.tre
Description: Chronogram pruned to 78 species-level lineages for input into BioGeoBEARS
File: BioGeoBEARS-input-Ecoregion-matrix.txt
Description: Ecoregions inhabited by each taxon, for input into BioGeoBEARS
File: ExaBayes-input-LFD-185t-CONFIG_exb_11.conf
Description: Configuration file for ExaBayes
File: ExaBayes-output-LFD-185t_870loci-122parts_Mar26names_PHYLO.contree
Description: Output phylogeny from ExaBayes
File: ExaBayes-input-LFD-185t-870loci-partitionmodels.best_scheme
Description: Partition file for data matrix LFD-185t-part, for input into ExaBayes
File: IQtree-output_D-Cono-150t_134partitions_IQtreeModelFinderModels_MaxLikelihood.contree
Description: Consensus maximum likelihood phylogeny of dataset D-Cono-150t-part (IQ-TREE)
File: IQtree-output_LFD-185t-870loci-122parts-IQT-Mar26names_MaxLikelihood.contree
Description: Consensus maximum likelihood phylogeny of dataset LFD-185t-part (IQ-TREE)
File: MCMCtree-input-2calibs-x100-Mar23names.tre
Description: Input cladogram for MCMCtree, with topology from ExaBayes output
File: MCMCtree-input-CONFIG-rgene-sigma2-scaled10.ctl
Description: Configuration file for MCMCtree
File: MATRIX-D-Cono-150t-part_134parts_IQtreeModelFinder.best_scheme.nex
Description: Partition file for the data matrix D-Cono-150t-part, with partitions split using SWSC-EN and merged with IQ-TREE ModelFinder
File: MATRIX-LFD-185t-part_870loci-122partitions-IQtreeModelFinder.best_model.nex
Description: Partition file for the 185-taxon data matrix LFD-185t-part, with partitions split using SWSC-EN and merged with IQ-TREE ModelFinder
File: MCMCtree-output-185taxa-chronogram.tre
Description: MCMCtree output chronogram
File: ExaBayes-output-LFD-185t-870loci-122parts-STDOUT.txt
Description: Output log file from ExaBayes
File: BioGeoBEARS-output-model-comparison-results.txt
Description: Comparison of AIC and LRT scores for all models tested in BioGeoBEARS
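The partition files can also be inspected programmatically, for example to confirm the number of merged partitions (122 or 134) given in the file names. The following is a minimal sketch, assuming the .nex partition files follow the usual NEXUS sets-block layout with one `charset NAME = START-END ...;` statement per line. That layout has not been verified against the deposited files, and `parse_charsets` and `charset_length` are hypothetical helper names.

```python
import re

# One charset statement per line, e.g. "charset p1 = 1-100 301-350;"
CHARSET_RE = re.compile(
    r"^\s*charset\s+(\S+)\s*=\s*(.+?);\s*$",
    re.IGNORECASE | re.MULTILINE,
)
# Inclusive site ranges such as "1-100"; single-site entries are ignored here.
RANGE_RE = re.compile(r"(\d+)\s*-\s*(\d+)")


def parse_charsets(nexus_text):
    """Map each charset name to its list of (start, end) site ranges."""
    charsets = {}
    for name, spec in CHARSET_RE.findall(nexus_text):
        charsets[name] = [(int(s), int(e)) for s, e in RANGE_RE.findall(spec)]
    return charsets


def charset_length(ranges):
    """Total number of sites covered by a list of inclusive ranges."""
    return sum(end - start + 1 for start, end in ranges)


if __name__ == "__main__":
    # Toy input for illustration only; not taken from the dataset.
    example = """#nexus
begin sets;
  charset p1 = 1-100 301-350;
  charset p2 = 101-300;
  charpartition mymodels = GTR+F+G4: p1, HKY+F: p2;
end;
"""
    parts = parse_charsets(example)
    print(len(parts), "partitions")
    for name, ranges in parts.items():
        print(name, charset_length(ranges), "sites")
```

Summing `charset_length` over all partitions and comparing the total with the site count in the matching .phy header is one way to check that a partition file and its matrix belong together.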
Code/software
Data wrangling:
- phyluce v1.7.3
- illumiprocessor v2.10
- trimmomatic v0.39
- MAFFT v7.475, L-INS-i algorithm
- IQ-TREE v2.2.3, --symtest-remove-bad
Analysis:
- IQ-TREE v2.1.2, ModelFinder
- AMAS (alignment statistics calculation)
- ExaBayes v1.5.1, -M 3, on the CIPRES Science Gateway
- consense (ExaBayes utility)
- MCMCtree, in PAML v4.10.3
- BioGeoBEARS (in R)
Visualization:
- Tracer
- FigTree
- MCMCtreeR (in R)
Access information
Other publicly accessible locations of the data:
- Raw sequence reads: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA759281/
Usage notes
- Data, config files, and partition files can be viewed using a standard plain text editor.
- Tree files (.tre, .contree) can be viewed in FigTree.
- ExaBayes...STDOUT.txt can be viewed in Tracer.
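For a check that needs no tree viewer, the number of tips in a tree can be counted directly from its Newick string and compared with the taxon count in the file name (for example 78 or 185). This is a minimal sketch that assumes the input is a single plain Newick string. If a file is in NEXUS format, or contains unquoted calibration annotations with commas, the tree string would need to be extracted or cleaned first. `count_tips` is a hypothetical helper.

```python
import re


def count_tips(newick):
    """Count the tips of a single Newick tree string.

    Every internal node with k children contributes k - 1 commas,
    so a tree with at least one internal node has (commas + 1) tips.
    """
    # Drop bracketed comments and annotations, e.g. [&95%HPD={1.2,3.4}].
    text = re.sub(r"\[[^\]]*\]", "", newick)
    # Replace quoted labels, which may themselves contain commas.
    text = re.sub(r"'[^']*'", "X", text)
    return text.count(",") + 1


if __name__ == "__main__":
    # Toy tree for illustration only; not taken from the dataset.
    toy = "((A:1,B:1)[&HPD={0.1,0.2}]:2,(C:1,'D,x':1):2);"
    print(count_tips(toy), "tips")
```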