Data from: A new phylogeny of Anastrepha (Diptera: Tephritidae) based on nuclear loci obtained by phylogenomic methods
Data files
Sep 03, 2025 version files 910.95 MB
- Anastrepha_GeneTrees_for_DS1_Loci.zip (13.12 MB)
- Anastrepha_PrunedDipKit90april24_AstralNT_species.tre (47.92 KB)
- AnastrephaAllLoci50pctJul24.tree (26.94 KB)
- AnastrephaDipKit90pruneApr24.tree (58.04 KB)
- AnastrephaNames_LabCodes_2025_08_02.csv (46.49 KB)
- DS1_Anastrepha_dipkit9042.partition.dna.txt.best_scheme.nex (21.33 KB)
- DS1_Anastrepha_dipkit90424_final.dna.phy (192.45 MB)
- DS2_Anastrepha_AllLoci50pct_90424_final.dna.phy (682.31 MB)
- DS2_final.raxml_partition.dna.txt (41.27 KB)
- Individual_Locus_Alignments.zip (22.82 MB)
- README.md (2.82 KB)
Abstract
Phylogenetic alignments, data sets, and tree files from the anchored hybrid enrichment (AHE) study of Anastrepha true fruit flies (Diptera: Tephritidae) by Norrbom et al. The data sets include sequences from 293 (DS1) and 1110 (DS2) aligned orthologous loci sampled from 735 specimens, 728 of which were identified as representatives of 237 species. Multiple sequence alignments were inferred using the program MAFFT. The phylogenetic data sets are formatted for use in IQTree2 and were used to reconstruct the phylogenetic trees depicted in Figures 1, 2, S1, S2, and S3 of the manuscript. This study presents a comprehensive phylogenetic analysis of the genus Anastrepha, using anchored hybrid enrichment to sample hundreds of nuclear genes from a global collection representing the diversity of the genus. Our results provide a robust and novel reconstruction of the evolutionary history of this group, allowing better resolution of its lineages and a revised classification into species groups.
Dataset DOI: 10.5061/dryad.7d7wm3864
[article doi pending acceptance in Systematic Entomology]
Description of the data and file structure:
- Individual_Locus_Alignments.zip: The zip file containing multiple sequence alignments (MSAs) for all of the orthologous gene loci used in the study. Contents: "DS1_loci" includes the individual locus alignments used to construct concatenated phylogenetic data set 1 (DS1); "DS2_loci" includes the individual locus alignments used to construct concatenated phylogenetic data set 2 (DS2).
- AnastrephaNames_LabCodes_2025_08_02.csv: The CSV file containing lab codes and their associated taxonomic names, collection localities, and specimen data used in the multiple sequence alignments (MSAs) and phylogenetic data sets. Each row includes the specimen lab code (used as the taxon identifier in alignments and trees) and its associated metadata descriptor in the form SpeciesGroupIdentifier_speciesNameIdentifier_localityCode_SpecimenIdentificationCode.
- DS1_Anastrepha_dipkit90424_final.dna.phy: The concatenated alignment referred to as data set 1 (DS1) that was used for the phylogenetic reconstruction depicted in Figures 1 and 2.
- DS1_Anastrepha_dipkit9042.partition.dna.txt.best_scheme.nex: The gene locus partitions and associated models corresponding to positions in the concatenated data set 1 (DS1), used for phylogenetic analysis and model testing in IQTree2.
- DS2_Anastrepha_AllLoci50pct_90424_final.dna.phy: The concatenated alignment referred to as data set 2 (DS2) that was used for the phylogenetic reconstruction depicted in Figure S1.
- DS2_final.raxml_partition.dna.txt: A text file listing the gene locus partitions corresponding to nucleotide positions in the concatenated data set 2 (DS2), used for phylogenetic analysis and model testing in IQTree2. Each line of the file is in the form "DNA, GeneLocusIdentifier = nucleotide position range in DS2". Example: "DNA, sco_38052at7203 = 1-402".
- AnastrephaDipKit90pruneApr24.tree: Maximum likelihood tree (topology and branch lengths) from IQTree2 in Newick format, depicted in Figures 1, 2, and S1.
- Anastrepha_PrunedDipKit90april24_AstralNT_species.tre: Astral tree (topology and support values) in Newick format, depicted in Figure S2.
- AnastrephaAllLoci50pctJul24.tree: Maximum likelihood tree (topology and branch lengths) from IQTree2 in Newick format, depicted in Figure S3.
- Anastrepha_GeneTrees_for_DS1_Loci.zip: The zip file containing all individual locus trees calculated in RAxML-NG and used for the coalescent-based Astral tree of Figure S2.
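
The partition line format described for DS2_final.raxml_partition.dna.txt can be read with a few lines of Python. The following is a minimal sketch, not part of the deposited data: the names `Partition`, `parse_partition_line`, and `read_partitions` are hypothetical, and the pattern assumes every line holds a single contiguous range like the documented example. Lines with strides or multiple ranges would be rejected rather than silently misread.

```python
import re
from typing import List, NamedTuple


class Partition(NamedTuple):
    """One locus partition; positions are 1-based and inclusive."""
    datatype: str
    locus: str
    start: int
    end: int


# Matches lines such as "DNA, sco_38052at7203 = 1-402".
# Assumption: one contiguous range per line, as in the documented example.
_PARTITION_LINE = re.compile(r"^\s*(\w+)\s*,\s*(\S+)\s*=\s*(\d+)\s*-\s*(\d+)\s*$")


def parse_partition_line(line: str) -> Partition:
    """Parse a single 'DNA, locus = start-end' line."""
    match = _PARTITION_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized partition line: {line!r}")
    datatype, locus, start, end = match.groups()
    return Partition(datatype, locus, int(start), int(end))


def read_partitions(path: str) -> List[Partition]:
    """Read every non-blank line of a partition file."""
    with open(path) as handle:
        return [parse_partition_line(line) for line in handle if line.strip()]


example = parse_partition_line("DNA, sco_38052at7203 = 1-402")
# Locus length in nucleotides is end - start + 1.
print(example.locus, example.end - example.start + 1)  # sco_38052at7203 402
```

Calling `read_partitions("DS2_final.raxml_partition.dna.txt")` on the downloaded file should then give one `Partition` per locus, which is convenient for slicing individual loci out of the concatenated alignment.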

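
The metadata descriptor in AnastrephaNames_LabCodes_2025_08_02.csv packs four fields into one underscore-separated string. As an illustration only, the sketch below splits such a descriptor into named fields. The names `SpecimenDescriptor` and `split_descriptor` are hypothetical, the sample value is invented for the demonstration, and the code assumes that no individual field contains an underscore, which should be checked against the actual file.

```python
from typing import NamedTuple


class SpecimenDescriptor(NamedTuple):
    species_group: str
    species_name: str
    locality_code: str
    specimen_id: str


def split_descriptor(descriptor: str) -> SpecimenDescriptor:
    """Split 'group_species_locality_specimen' into its four fields.

    Assumption: fields never contain '_' themselves; descriptors with any
    other number of parts raise ValueError instead of being guessed at.
    """
    parts = descriptor.strip().split("_")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 underscore-separated fields, got {len(parts)}: {descriptor!r}"
        )
    return SpecimenDescriptor(*parts)


# Invented placeholder value, not taken from the data set.
fields = split_descriptor("groupA_speciesB_LOC1_0001")
print(fields.species_group, fields.specimen_id)  # groupA 0001
```

Raising on unexpected field counts keeps malformed or differently structured rows visible, so they can be inspected by hand instead of being assigned to the wrong columns.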