Data from: Untangling the ant claws: The army ant (Formicidae: Dorylinae) Labidus mars is a Neivamyrmex
Data files
Mar 07, 2025 version, files: 413.23 MB
DNA-Matrices.zip (8.78 MB)
README.md (2.86 KB)
sample-names.xlsx (9.74 KB)
Trees-and-Log-Files.zip (429.04 KB)
UCE-Alignments.zip (5.11 MB)
UCE-Contigs.zip (398.89 MB)
Abstract
The New World army ants (Hymenoptera: Formicidae) comprise the five genera of the Eciton species group, and together they are important keystone predators in tropical and subtropical environments. Generic boundaries in the group have been considered solid and stable for nearly 100 years. Workers of the widespread and diverse genus Neivamyrmex are readily separable from the other four genera by lacking a subapical tooth on the tarsal claw, while males can be separated with genitalic characters. The genus Labidus is also widespread and is often abundant, with several species that are conspicuous surface foragers. The least known species of Labidus is L. mars, the workers of which have the tarsal tooth but otherwise share many traits with some Neivamyrmex, being completely eyeless and subterranean. This led us to question its generic placement. Here, we used ultraconserved element (UCE) phylogenomics to show that Labidus mars belongs to the genus Neivamyrmex. All phylogenies, inferred using multiple partitioning schemes and a species tree analysis, recovered the same topology, placing Labidus mars workers within Neivamyrmex. Sequenced males of L. mars were found to be within Labidus and thus incorrectly associated with L. mars. Based on these results and review of key specimens, including types, the following taxonomic changes are made: Neivamyrmex mars (Forel, 1912) is a new combination; Labidus nero (Santschi, 1930) (rev. stat.) is a male-based taxon revived from synonymy under L. mars; and L. denticulatus (Borgmeier, 1955) (new stat.), a male-based taxon and former subspecies of L. mars, is raised to species.
https://doi.org/10.5061/dryad.bg79cnpkt
Sample names
DNA extraction codes, original sample names, final sample names, and NCBI accession numbers for all samples used in this study.
File name: sample-names.xlsx
UCE contigs
Aligned UCE contigs for all samples used in this study. Names have been updated to the final published sample names.
File name: UCE-Contigs.zip
File list:
Cheliomyrmex_megalonyx_M146.contigs.fasta
Eciton_lucanoides_EX3147.contigs.fasta
Labidus_mars_EX3594.contigs.fasta
Labidus_denticulatus_EX3300.contigs.fasta
Labidus_coecus_EX3411.contigs.fasta
Labidus_praedator_EX3412.contigs.fasta
Labidus_spininodis_M172.contigs.fasta
Leptanilloides_femoralis_M188.contigs.fasta
Neivamyrmex_bruchi_EX3152.contigs.fasta
Neivamyrmex_clavifemur_EX3302.contigs.fasta
Neivamyrmex_mars_EX3593.contigs.fasta
Neivamyrmex_nigrescens_D3085.contigs.fasta
Neivamyrmex_pilosus_D3087.contigs.fasta
Neivamyrmex_punctaticeps_EX3414.contigs.fasta
Neivamyrmex_spoliator_EX3324.contigs.fasta
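The contig file names listed above follow a Genus_species_code.contigs.fasta pattern, where the code appears to be the DNA extraction code given in sample-names.xlsx. A minimal Python sketch for splitting a file name into those parts is shown here; the function and class names are hypothetical and are not part of the deposited data or of any published script.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    genus: str
    species: str
    code: str  # assumed to be the DNA extraction code


def parse_contig_name(filename: str) -> Sample:
    """Split 'Genus_species_code.contigs.fasta' into its three fields."""
    suffix = ".contigs.fasta"
    if not filename.endswith(suffix):
        raise ValueError(f"unexpected file name: {filename}")
    stem = filename[: -len(suffix)]
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected three underscore-separated fields: {stem}")
    return Sample(*parts)


if __name__ == "__main__":
    names = [
        "Labidus_mars_EX3594.contigs.fasta",
        "Neivamyrmex_mars_EX3593.contigs.fasta",
        "Leptanilloides_femoralis_M188.contigs.fasta",
    ]
    for name in names:
        print(parse_contig_name(name))
```

Parsing the names this way makes it straightforward to cross-check the archive contents against the sample table.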
UCE alignments
Raw UCE alignment FASTA files. 2030 UCE loci aligned using mafft-linsi, but not yet trimmed or filtered. Names have been updated to the final published sample names.
File name: UCE-Alignments.zip
DNA matrices
DNA matrix and partition files used in phylogenetic analyses. The matrix contains 2030 aligned, filtered (matrix generated with Phyluce for 80% completeness), and trimmed (using Gblocks) UCE loci, and includes by-locus character partitions. Names have been updated to the final published sample names.
File name: DNA-Matrices.zip
File list:
concat-align-nexus-gblocks-names-mars-80.nex
bylocus_partitions.nex
swsc_partitions.nex
Tree and log files
Tree files with bootstrap scores/local posterior probabilities and associated analysis log files. Sample names in trees, but not log files, have been updated to final published sample names. The first part of each file name indicates the figure number used in the publication, with most numbers (s1, s2, s3) corresponding to supplemental figure numbers.
fig1-swsc.log, fig1-swsc.tre: tree from the UCE matrix partitioned using entropy-based Sliding-Window Site Characteristics (SWSC-EN).
s1-BS30.tre, s1-astral.log, s1-astral.tre: gene trees with branches with ≤30% bootstrap support collapsed, and the ASTRAL species tree with local posterior probabilities.
s2-bylocus.log, s2-bylocus.tre: tree from the UCE matrix partitioned by UCE locus.
s3-unpartitioned-gtrg.log, s3-unpartitioned-gtrg.tre: tree from the unpartitioned UCE matrix.
File name: Trees-and-Log-Files.zip
File list:
fig1-swsc.log
fig1-swsc.tre
s1-BS30.tre
s1-astral.log
s1-astral.tre
s2-bylocus.log
s2-bylocus.tre
s3-unpartitioned-gtrg.log
s3-unpartitioned-gtrg.tre
Here we have deposited contigs representing UCE loci, unfiltered UCE alignments, the concatenated UCE matrix, tree files, and additional data analysis files (partitioning schemes and log files). All new sequence data have been deposited in the NCBI Sequence Read Archive under BioProject#PRJNA1158819. The methods used to generate these data are described below and in the article.
Molecular Taxon Sampling:
UCE sequence data were newly generated for 12 specimens and combined with published data available from the Dryad repository referenced in Borowiec (2019) for 3 specimens (Table 1). Leptanilloides femoralis Borowiec & Longino 2011 was used as an outgroup following Borowiec (2019). Neivamyrmex specimens were selected based on morphological similarities between Labidus mars and some Neivamyrmex species, with reference to the key to Neivamyrmex species in Watkins (1976) and information from a current project on the evolution of Neivamyrmex (S. Powell & L. Barros, personal communication). The six Neivamyrmex selected represent 5% of the total described diversity of the genus.
DNA Sequence Generation:
We employed the UCE approach to phylogenomics (Faircloth et al. 2012, Faircloth et al. 2015, Branstetter et al. 2017), combining target enrichment of UCEs with multiplexed next-generation sequencing. All samples were extracted, quality-checked, and sheared, following the UCE methodology described in Branstetter et al. (2017). For two samples, the sheared DNA was shipped to Rapid Genomics (RG; Gainesville, FL) for library preparation, UCE enrichment, and sequencing. For each library, RG aimed to sequence two million paired-end (PE) reads of enriched library plus one million PE reads of unenriched library. Enrichment was performed using the ant-customized version (“ant-specific hym-v2”) of the Hymenoptera v2 bait set, which targets 2,524 UCE loci common across Hymenoptera (Branstetter et al. 2017). Sequencing was performed on Illumina NovaSeq instruments (2x150). For eight samples, the sheared DNA was sent to the University of Utah Genomics Core Facility (UU) for processing, following the same general approach as RG, with enriched and unenriched libraries sequenced separately. Finally, for two additional samples, library preparation and enrichment were performed in-house following methods outlined in Branstetter et al. (2021) and using an ant-bee version of the bait set (Grab et al. 2019). These samples were sent to Novogene (Sacramento, CA) for HiSeq X sequencing (2x150), and only enriched libraries were sequenced. The utility of the Hym-v2 bait set to resolve relationships, both deep and shallow, in ants has been demonstrated in several studies (Barrera et al. 2022, Branstetter et al. 2017, Branstetter & Longino 2019, Pierce et al. 2017, Ward & Branstetter 2017, Williams et al. 2022).
UCE Matrix Assembly:
Sequences from RG and UU were demultiplexed by the providers and reads from Novogene were manually demultiplexed using BBTools (Bushnell 2014). For the RG and UU samples, we combined the enriched reads with the unenriched reads before read trimming and further processing. For all newly sequenced samples, the sequence data were cleaned, assembled, and aligned using the Phyluce package 1.7 (Faircloth 2016). Within the Phyluce environment, we used Illumiprocessor (Faircloth 2013) and Trimmomatic (Bolger et al. 2014) for quality trimming raw reads, SPAdes v3.14.1 (Bankevich et al. 2012) for de novo assembly of reads into contigs, and LASTZ v1.02 (Harris 2007) for identifying UCE contigs from all contigs. All optional Phyluce settings were left at default values for these steps, as these settings have been found to work well in similar datasets. For the bait sequences file needed to identify and extract UCE contigs, we used the ant-specific hym-v2 bait file. To calculate assembly statistics, including sequencing coverage, we used scripts from the Phyluce package. For the published data from Borowiec (2019), UCE contigs were downloaded from Dryad (https://doi.org/10.5061/dryad.bj83h5d).
After extracting UCE contigs, we aligned each UCE locus using a stand-alone version of the program MAFFT v7.130b (Katoh & Standley 2013) and the L-INS-i algorithm. We then used a Phyluce script to trim flanking regions and poorly aligned internal regions using the program Gblocks (Talavera & Castresana 2007). The program was run with reduced stringency parameters (b1:0.5, b2:0.5, b3:12, b4:7). We then used another Phyluce script to filter the initial set of alignments so that each alignment was required to include data for 80% of taxa. Additional matrices with 70, 90, and 100% taxon completeness were used for data exploration, but the 80% matrix was selected for most analyses as it reduced missing data without resulting in significant loss of loci. To calculate summary statistics for the final data matrix, we used a script from the Phyluce package (phyluce_align_get_align_summary_data).
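As a worked example of the completeness filter, the 15 samples in this dataset (12 newly sequenced plus 3 published) imply the minimum taxon counts computed below. This is a sketch only: the function names are hypothetical, and rounding the threshold down to a whole number of taxa is an assumption that should be checked against the Phyluce script actually used.

```python
def min_taxa_required(n_taxa: int, percent_complete: int) -> int:
    """Minimum number of taxa a locus must contain to be retained.

    Uses integer arithmetic and rounds down (an assumption about the
    rounding convention, not a statement about Phyluce internals).
    """
    return (n_taxa * percent_complete) // 100


def keep_locus(taxa_in_alignment: int, n_taxa: int, percent_complete: int) -> bool:
    """True if an alignment meets the completeness threshold."""
    return taxa_in_alignment >= min_taxa_required(n_taxa, percent_complete)


if __name__ == "__main__":
    n_taxa = 15  # 12 newly sequenced samples + 3 from Borowiec (2019)
    for percent in (70, 80, 90, 100):
        print(percent, min_taxa_required(n_taxa, percent))
```

Under these assumptions, a locus in the 80% matrix must be present in at least 12 of the 15 samples.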
UCE Phylogenomic Analysis:
We performed several analyses on the 80% concatenated supermatrix using the maximum likelihood-based program IQ-TREE v2 (Minh et al. 2020). An unpartitioned search was conducted using the GTR+G model of sequence evolution as an initial check of tree topology. Following this result, we partitioned the matrix by UCE locus and conducted separate analyses using IQ-TREE v2, first merging loci using “-m TESTMERGEONLY” and “--merge rclusterf -rclusterf 10”, then conducting a second search with the resulting partitioning scheme using model selection from ModelFinder (Kalyaanamoorthy et al. 2017).
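The three IQ-TREE runs described above could be assembled roughly as follows. This Python sketch only builds and prints command lines; it does not run them. The matrix and partition file names come from DNA-Matrices.zip, and the merge options are the ones quoted above. The executable name (iqtree2), the output prefixes, the use of “-m MFP” for the second search, and the .best_scheme.nex output name are assumptions, not the exact commands used in the study; the log files in Trees-and-Log-Files.zip record the actual settings.

```python
import shlex
from typing import List

IQTREE = "iqtree2"  # assumed executable name
MATRIX = "concat-align-nexus-gblocks-names-mars-80.nex"  # from DNA-Matrices.zip


def unpartitioned_cmd(prefix: str) -> List[str]:
    """Unpartitioned search under GTR+G (initial topology check)."""
    return [IQTREE, "-s", MATRIX, "-m", "GTR+G", "--prefix", prefix]


def merge_cmd(partitions: str, prefix: str) -> List[str]:
    """Merge partitions using the options quoted in the text."""
    return [
        IQTREE, "-s", MATRIX, "-p", partitions,
        "-m", "TESTMERGEONLY",
        "--merge", "rclusterf", "-rclusterf", "10",
        "--prefix", prefix,
    ]


def search_cmd(merged_scheme: str, prefix: str) -> List[str]:
    """Second search on the merged scheme with ModelFinder ('-m MFP' is assumed)."""
    return [IQTREE, "-s", MATRIX, "-p", merged_scheme, "-m", "MFP", "--prefix", prefix]


if __name__ == "__main__":
    commands = [
        unpartitioned_cmd("unpartitioned-gtrg"),
        merge_cmd("bylocus_partitions.nex", "bylocus-merge"),
        # Assumes the merge step writes <prefix>.best_scheme.nex
        search_cmd("bylocus-merge.best_scheme.nex", "bylocus"),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```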
The supermatrix was also partitioned using Sliding-Window Site Characteristics based on entropy (SWSC-EN; Tagliacollo & Lanfear 2018), which breaks UCE loci into three regions, corresponding to the left flank, core, and right flank. The theoretical underpinning of the approach comes from the observation that UCE core regions are conserved, while the flanking regions become increasingly variable with distance from the core (Faircloth et al. 2012). After running SWSC-EN, the resulting data subsets were merged using “-m TESTMERGEONLY” and “--merge rclusterf -rclusterf 10”, and phylogenetic relationships were inferred from the resulting partitioning scheme using model selection from ModelFinder in IQ-TREE v2. For all IQ-TREE runs on the concatenated matrix, we performed 1,000 replicates of the ultrafast bootstrap approximation (UFB; Hoang et al. 2018) and 1,000 replicates of the branch-based, Shimodaira-Hasegawa-like approximate likelihood ratio test (SH-aLRT; Guindon et al. 2010).
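To illustrate the site characteristic that SWSC-EN relies on, the Python sketch below computes per-column Shannon entropy for a toy alignment with variable flanks and a conserved core. It is not the SWSC-EN implementation, which additionally searches for the split points between flanks and core; the toy sequences and function names are invented for illustration.

```python
import math
from collections import Counter
from typing import List


def site_entropy(column: str) -> float:
    """Shannon entropy (bits) of one alignment column, ignoring gaps and ambiguity codes."""
    bases = [b for b in column.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    total = len(bases)
    entropy = -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(bases).values()
    )
    return max(0.0, entropy)  # avoid returning -0.0 for invariant sites


def column_entropies(sequences: List[str]) -> List[float]:
    """Entropy of every column in an alignment of equal-length sequences."""
    return [site_entropy("".join(column)) for column in zip(*sequences)]


if __name__ == "__main__":
    # Invented toy alignment: variable flanks around a conserved core
    toy_alignment = [
        "ACGTTTGA",
        "CAGTTTAC",
        "GTGTTTCG",
        "TGGTTTGT",
    ]
    for position, value in enumerate(column_entropies(toy_alignment)):
        print(position, round(value, 3))
```

In a real UCE locus the entropy profile is noisier, but the same pattern of low values in the core and higher values toward the flanks is what motivates splitting each locus into three partitions.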
To address possible topological conflict between UCE loci due to incomplete lineage sorting, a coalescence-based phylogeny was reconstructed using ASTRAL-III v5.7.8 (Zhang et al. 2018). We first created a set of gene trees for the set of 2,030 UCE loci using IQ-TREE v2. For each gene tree analysis we performed model selection with ModelFinder (option “-m MFP”), using the AICc model selection criterion, and conducted 1,000 UFB replicates. Once the gene trees were generated, we followed the recommendation of Zhang et al. (2018) and used Newick Utilities (Junier & Zdobnov 2010) to collapse branches with ≤ 30% UFB support. Using the modified gene trees, we performed a standard ASTRAL analysis, leaving all terminals as separate entities, and assessing support as local posterior probabilities.
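The branch-collapsing step can be illustrated with a small, self-contained Python sketch. It parses a simple Newick string, removes internal branches whose support label is at or below a threshold, and attaches their children to the parent node. This is a stand-in for illustration only and is not Newick Utilities, which was used in the study; it assumes plain numeric support labels and unquoted names, adds the removed branch's length to its children (an illustrative choice), and uses an invented example tree.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

_SPECIAL = {"(", ")", ",", ";"}


@dataclass
class Node:
    label: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    """Parse a simple Newick string (no quoted labels or comments)."""
    tokens = re.findall(r"[(),;]|[^(),;]+", text.strip())
    pos = 0

    def parse_node() -> Node:
        nonlocal pos
        node = Node()
        if tokens[pos] == "(":
            pos += 1  # consume "("
            while True:
                node.children.append(parse_node())
                if tokens[pos] == ",":
                    pos += 1
                else:
                    break
            if tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1  # consume ")"
        if pos < len(tokens) and tokens[pos] not in _SPECIAL:
            name, _, length = tokens[pos].partition(":")
            node.label = name.strip()
            node.length = float(length) if length.strip() else None
            pos += 1
        return node

    return parse_node()


def support(node: Node) -> Optional[float]:
    """Support value stored as an internal node label, if numeric."""
    try:
        return float(node.label)
    except ValueError:
        return None


def collapse_low_support(node: Node, threshold: float) -> None:
    """Collapse internal branches with support <= threshold (post-order)."""
    kept: List[Node] = []
    for child in node.children:
        collapse_low_support(child, threshold)
        value = support(child)
        if child.children and value is not None and value <= threshold:
            for grandchild in child.children:
                if grandchild.length is not None and child.length is not None:
                    grandchild.length += child.length
            kept.extend(child.children)  # attach grandchildren to this node
        else:
            kept.append(child)
    node.children = kept


def to_newick(node: Node) -> str:
    """Serialize a node and its descendants (without the trailing ';')."""
    text = ""
    if node.children:
        text = "(" + ",".join(to_newick(c) for c in node.children) + ")"
    text += node.label
    if node.length is not None:
        text += f":{node.length:g}"
    return text


if __name__ == "__main__":
    # Invented gene tree with UFB support values as internal node labels
    tree = parse_newick("((A:0.1,B:0.2)25:0.05,(C:0.1,D:0.1)90:0.05,E:0.3);")
    collapse_low_support(tree, threshold=30)
    print(to_newick(tree) + ";")
```

Collapsing poorly supported branches into polytomies keeps weakly estimated gene-tree relationships from contributing to the species tree estimate.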
