Data from: Whole mitochondrial genome phylogeny of the family Drosophilidae
Data files
Apr 03, 2025 version files 8.30 MB
- 241_mtDNA_B.txt (4.61 MB)
- Accession_numbers.docx (11.47 KB)
- Assemblies_Dros_mtDNA.txt (613.15 KB)
- Drosophila_mt.cat (3.07 MB)
- README.md (848 B)
Abstract
A total of 241 mitochondrial genomes were assembled and annotated from the SRA database to reconstruct an mtDNA genome phylogeny for the genus Drosophila, the family Drosophilidae, and close relatives. The resulting mtDNA genome phylogeny is largely congruent with previous higher-level analyses of Drosophila species, with the exception of the relationships among the melanogaster, montium, ananassae, saltans and obscura groups. Although relationships within these species groups are congruent between nuclear and mtDNA studies, the relationships among the groups in the mtDNA genome phylogeny differ from those reported in earlier studies. Monophyly of known species groups within the genus Drosophila is highly supported and, as in previous work, the genera Lordiphosa, Hirtodrosophila, Zaprionus and Scaptomyza are all embedded within the genus Drosophila. Incongruence and partitioned support analyses indicate that DNA sequences are better at resolving the phylogeny than their translated protein sequences. Such analyses also indicate that genes on the minus strand of the circular molecule (Lrrna, Srrna, ND4, ND4L and ND5) provide most of the support for the overall phylogenetic hypothesis.
- Drosophila_mt.cat: A txt file holding 199 assembled drosophilid mtDNA genomes. These assembled genomes are described in DeSalle, Rob, Sara Oppenheim, and Patrick M. O’Grady. “Whole mitochondrial genome phylogeny of Drosophilidae.” Mitochondrial DNA Part A 33, no. 1-8 (2022): 1-9.
- Assemblies_Dros_mtDNA.txt: A text file holding mtDNA genome assemblies, annotations and accession lists of species used in the study.
- Accession_numbers.docx: A Microsoft Word file holding all of the accession information for the drosophilid species in the study.
- 241_mtDNA_B.txt: A txt file holding the aligned assemblies for 241 drosophilid species in NEXUS format, fully partitioned into genes, proteins, tRNAs, rRNAs and clusters of tRNAs.
Sharing/Access information
We obtained the sequences for this study in four basic ways. First, about 40 of the genomes were present in the NIH Organelle Genome database (https://www.ncbi.nlm.nih.gov/genome/organelle/) and were simply downloaded from it. Second, the NCBI assembly database (https://www.ncbi.nlm.nih.gov/assembly/) contains contigs that can be assembled into more contiguous sequences and then fully annotated. These contigs were mostly deposited by the authors of several recent large genome studies of drosophilid flies. Contigs containing mtDNA sequences were extracted and partially assembled using MitoZ, with final assembly in MitoS. Because the original deposited assemblies relied on several different assembly methods, the quality of these draft mtDNA genomes was highly variable. Third, the assembly libraries were reassembled with a single assembly program (SPAdes v3.11.1), which appeared to give more consistent and complete results than the other approaches. Fourth, the SRA reads (https://www.ncbi.nlm.nih.gov/sra) were mapped directly to the Drosophila melanogaster reference mtDNA genome using the “map to reference” function in Geneious, and the genomes thus obtained were then annotated using MitoS. This last approach proved to be the most efficient and accurate, so we reassembled many of the previously assembled genomes using the Geneious-MitoS approach. A few of the genomes assembled this way were generated from RNA-Seq data, but the majority were from WGS SRA sequences.
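For readers who want to work with the partitions in 241_mtDNA_B.txt, the following Python sketch may be a useful starting point. It assumes the partitions are declared with standard NEXUS `charset NAME = start-end;` statements; the actual layout of the file has not been checked here, so the pattern may need adjusting. The function names `parse_charsets` and `partition_length`, and the gene coordinates in `SAMPLE`, are hypothetical and for illustration only.

```python
import re

# Assumption: partitions are written as standard NEXUS charset statements,
# e.g. "charset COI = 1-1536;" or "charset COI_pos3 = 3-1536\3;".
CHARSET_RE = re.compile(
    r"^\s*charset\s+([^\s=]+)\s*=\s*([^;]+);",
    re.IGNORECASE | re.MULTILINE,
)
# One range token: a start, an optional "-end", and an optional "\step".
RANGE_RE = re.compile(r"(\d+)(?:\s*-\s*(\d+))?(?:\\(\d+))?")


def parse_charsets(nexus_text):
    """Return {partition name: [(start, end, step), ...]} with 1-based, inclusive coordinates."""
    charsets = {}
    for name, spec in CHARSET_RE.findall(nexus_text):
        ranges = []
        for start, end, step in RANGE_RE.findall(spec):
            start = int(start)
            end = int(end) if end else start   # a single position is a range of length 1
            step = int(step) if step else 1    # default stride is every column
            ranges.append((start, end, step))
        charsets[name] = ranges
    return charsets


def partition_length(ranges):
    """Count the alignment columns covered by a list of (start, end, step) ranges."""
    return sum(len(range(start, end + 1, step)) for start, end, step in ranges)


# Made-up example block; these coordinates are NOT taken from the dataset.
SAMPLE = """#NEXUS
begin sets;
  charset COI = 1-1536;
  charset ND5 = 1537-3256 4000-4010;
  charset COI_pos3 = 3-1536\\3;
end;
"""

if __name__ == "__main__":
    for name, ranges in parse_charsets(SAMPLE).items():
        print(name, ranges, partition_length(ranges))
```

To apply this to the dataset, one would read the file contents (for example with `open("241_mtDNA_B.txt").read()`) and pass the text to `parse_charsets`. If the file declares its partitions differently, a dedicated NEXUS parser would be the safer choice.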