Delimitation of tribes in the subfamily Leptanillinae (Hymenoptera: Formicidae), with a description of the male of Protanilla lini Terayama, 2009
Griebenow, Zachary (2020), Delimitation of tribes in the subfamily Leptanillinae (Hymenoptera: Formicidae), with a description of the male of Protanilla lini Terayama, 2009, Dryad, Dataset, https://doi.org/10.25338/B8490T
The subfamily Leptanillinae Emery, 1910 (Hymenoptera, Formicidae) is a clade of cryptic subterranean ants that is restricted to the tropics and warm temperate regions of the Old World. Due to acquisition bias against the minute and hypogaeic workers, most known leptanilline specimens are male, with four genera described solely from males. The sexes have been associated in only two out of 68 described species, meaning that redundant naming of taxa is likely. Herein the phylogeny of the Leptanillinae, sampled with emphasis on largely undescribed male material, is inferred from ultra-conserved elements (UCEs) using maximum-likelihood inference. This method associates the male of Protanilla lini Terayama, 2009 with corresponding workers collected on Okinawa-jima, Japan, allowing the first published description of male ants belonging to the Anomalomyrmini Taylor, 1990, one of the two established tribes within the Leptanillinae. The first male-based diagnosis of these tribes is provided, along with a dichotomous key to all described male-based species within the Leptanillinae and to the male morphospecies sequenced in this study. With genome-scale data enabling the association of separately collected sexes and phylogenomic inference contextualizing morphological observations, the parallel taxonomy that afflicts this enigmatic group of ants can begin to be resolved.
DNA was extracted non-destructively from 39 leptanilline specimens, plus Martialis heureka as an outgroup, using a DNeasy Blood & Tissue Kit (Qiagen Inc., Valencia, CA) according to the manufacturer's instructions. Genomic DNA concentrations were quantified for each sample with a Qubit 2.0 fluorometer (Life Technologies Inc., Carlsbad, CA). Phylogenomic data were generated using the UCE probe set hym-v2 (Branstetter et al. 2017); libraries were prepared and target loci enriched following the protocol of Branstetter et al. (2017). Enrichment success and size-adjusted DNA concentrations of pools were assessed using the SYBR FAST qPCR kit (Kapa Biosystems, Wilmington, MA), and all pools were combined into an equimolar final pool. The contents of this final pool were sequenced on either an Illumina HiSeq 2500 at the High Throughput Genomics Facility, University of Utah, Salt Lake City, UT, or an Illumina HiSeq 4000 at Novogene, Sacramento, CA.
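The arithmetic behind an equimolar pool can be illustrated with a minimal sketch. It assumes the generic conversion from mass concentration to molarity for double-stranded DNA (about 660 g/mol per base pair); the function names, pool names, concentrations, and fragment sizes are hypothetical, and this is not necessarily the calculation performed for this dataset, where pool concentrations were assessed by qPCR.

```python
# Hypothetical sketch: volumes needed so that each enriched pool contributes
# the same number of moles to a final pool. All values are illustrative.

AVG_BP_MASS = 660.0  # approximate molar mass of one dsDNA base pair, g/mol


def ng_per_ul_to_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a mass concentration (ng/uL) to molarity (nM) for dsDNA
    fragments of a given mean length in base pairs."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def equimolar_volumes(pools, target_fmol):
    """Return the volume (uL) of each pool that contains `target_fmol`
    femtomoles. `pools` maps a pool name to (ng/uL, mean fragment bp).
    1 nM equals 1 fmol/uL, so volume = fmol / nM."""
    volumes = {}
    for name, (conc, size_bp) in pools.items():
        molarity_nm = ng_per_ul_to_nm(conc, size_bp)
        volumes[name] = target_fmol / molarity_nm
    return volumes


if __name__ == "__main__":
    # Made-up pools: (concentration in ng/uL, mean fragment length in bp)
    example_pools = {"pool_A": (6.6, 1000), "pool_B": (13.2, 1000)}
    for name, vol in equimolar_volumes(example_pools, target_fmol=50).items():
        print(f"{name}: {vol:.2f} uL")
```

Because the conversion divides by fragment length, two pools with the same mass concentration but different mean fragment sizes require different volumes, which is why size-adjusted concentrations matter for pooling.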
The FASTQ output was demultiplexed and cleaned of adapter contamination and low-quality reads using illumiprocessor (Faircloth 2013) in the PHYLUCE package. Raw reads were assembled with Trinity v. 2013-02-25 (Grabherr et al. 2011) or with SPAdes v. 3.12.0 (Bankevich et al. 2013). All PHYLUCE commands cited hereafter are from Faircloth (2016). Species-specific contig assemblies were obtained with the ant-specific hym-v2 probe set (Branstetter et al. 2017) using phyluce_assembly_match_contigs_to_probes.py (min_coverage=80), with min_identity=90 to minimize the influence of possible contamination. A list of UCE loci shared across all taxa was generated using phyluce_assembly_get_match_counts.py, and separate FASTA files for each locus were created from these outputs. Sequences were aligned separately by locus with the command phyluce_assembly_seqcap_align.py, using MAFFT L-INS-i (Katoh & Toh 2008) rather than the default version of MAFFT implemented in PHYLUCE. These alignments were then trimmed with Gblocks (Castresana 2000) as implemented by the wrapper script phyluce_assembly_get_gblocks_trimmed_alignment_from_untrimmed.py (settings: b1=0.5, b2=0.5, b3=12, b4=7). Alignment statistics for the output FASTA files were calculated with phyluce_align_get_align_summary_data.py. Finally, a dataset of 580 loci that was 90% complete with respect to taxon coverage per locus was generated using the script phyluce_align_get_only_loci_with_min_taxa.py. The final alignment was then concatenated and converted to PHYLIP format with phyluce_align_format_nexus_files_for_raxml; it was 368,656 bp in length, with 19.07% missing data, 135,832 parsimony-informative sites, and a mean locus length of 634 bp.
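The meaning of the completeness filter and of the summary statistics reported above can be illustrated with a small pure-Python sketch. It does not reproduce the PHYLUCE scripts, which operate on alignment files; the function names and the toy alignments below are hypothetical, and missing taxa are assumed to be padded with "?" when loci are concatenated.

```python
# Hypothetical sketch of three steps: keeping loci that meet a taxon-coverage
# threshold, concatenating loci, and summarizing the resulting matrix.
from collections import Counter

MISSING = set("-?Nn")  # characters treated as gaps or missing data


def filter_loci(loci, n_taxa_total, min_fraction=0.9):
    """Keep loci whose alignments contain at least `min_fraction` of all taxa.
    `loci` maps locus name -> {taxon name: aligned sequence}."""
    return {name: aln for name, aln in loci.items()
            if len(aln) / n_taxa_total >= min_fraction}


def concatenate(loci, taxa):
    """Join loci end to end; a taxon absent from a locus is padded with '?'."""
    matrix = {taxon: [] for taxon in taxa}
    for name in sorted(loci):
        aln = loci[name]
        length = len(next(iter(aln.values())))
        for taxon in taxa:
            matrix[taxon].append(aln.get(taxon, "?" * length))
    return {taxon: "".join(parts) for taxon, parts in matrix.items()}


def count_informative_sites(alignment):
    """Count parsimony-informative columns: at least two different
    nucleotides, each present in at least two sequences."""
    total = 0
    for column in zip(*alignment.values()):
        counts = Counter(c.upper() for c in column if c.upper() in "ACGT")
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            total += 1
    return total


def percent_missing(alignment):
    """Percentage of cells in the matrix that are gaps or missing data."""
    cells = "".join(alignment.values())
    return 100.0 * sum(c in MISSING for c in cells) / len(cells)


if __name__ == "__main__":
    taxa = ["A", "B", "C", "D"]
    toy_loci = {
        "uce-1": {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"},
        "uce-2": {"A": "GG", "B": "GG"},  # only 2 of 4 taxa present
    }
    kept = filter_loci(toy_loci, n_taxa_total=len(taxa), min_fraction=0.9)
    matrix = concatenate(kept, taxa)
    print(sorted(kept), count_informative_sites(matrix), percent_missing(matrix))
```

In this toy case the second locus is dropped because only half of the taxa are represented. A 90% threshold still admits loci that lack some taxa, so a concatenated matrix built this way can contain a nonzero proportion of missing data.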
National Science Foundation, Award: DEB-1932405